All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] [RFC]drm: add syncobj timeline support v5
@ 2018-09-14 10:37 Chunming Zhou
  2018-09-14 10:49 ` Christian König
  2018-09-17  8:37 ` Christian König
  0 siblings, 2 replies; 12+ messages in thread
From: Chunming Zhou @ 2018-09-14 10:37 UTC (permalink / raw)
  To: dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Chunming Zhou, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Daniel Rakos, Daniel Vetter, Dave Airlie, Christian Konig

This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side:
This extension introduces a new type of syncobj that has an integer payload
identifying a point in a timeline. Such timeline syncobjs support the
following operations:
   * CPU query - A host operation that allows querying the payload of the
     timeline syncobj.
   * CPU wait - A host operation that allows a blocking wait for a
     timeline syncobj to reach a specified value.
   * Device wait - A device operation that allows waiting for a
     timeline syncobj to reach a specified value.
   * Device signal - A device operation that allows advancing the
     timeline syncobj to a specified value.

Since it's a timeline, that means the front time point(PT) always is signaled before the late PT.
a. signal PT design:
Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled,
the timeline will increase to value of PT[N].
b. wait PT design:
Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare
wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be
signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline,
so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to
perform that.

v2:
1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
2. move unexposed denitions to .c file. (Daniel Vetter)
3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian)
4. split up the change to drm_syncobj_replace_fence() in a separate patch.
5. drop the submission_fence implementation and instead use wait_event() for that. (Christian)
6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)

v3:
1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
    a. normal syncobj signal op will create a signal PT to tail of signal pt list.
    b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT.
2. many bug fix and clean up
3. stub fence moving is moved to other patch.

v4:
1. fix RB tree loop with while(node=rb_first(...)). (Christian)
2. fix syncobj lifecycle. (Christian)
3. only enable_signaling when there is wait_pt. (Christian)
4. fix timeline path issues.
5. write a timeline test in libdrm

v5: (Christian)
1. semaphore is called syncobj in kernel side.
2. don't need 'timeline' characters in some function name.
3. keep syncobj cb

normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
timeline syncobj is tested by ./amdgpu_test -s 9

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Cc: Christian Konig <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Rakos <Daniel.Rakos@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_syncobj.c              | 294 ++++++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
 include/drm/drm_syncobj.h                  |  62 +++--
 include/uapi/drm/drm.h                     |   1 +
 4 files changed, 292 insertions(+), 69 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index e9ce623d049e..e78d076f2703 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,9 @@
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
 
+/* merge normal syncobj to timeline syncobj, the point interval is 1 */
+#define DRM_SYNCOBJ_NORMAL_POINT 1
+
 struct drm_syncobj_stub_fence {
 	struct dma_fence base;
 	spinlock_t lock;
@@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
 	.release = drm_syncobj_stub_fence_release,
 };
 
+struct drm_syncobj_signal_pt {
+	struct dma_fence_array *base;
+	u64    value;
+	struct list_head list;
+};
 
 /**
  * drm_syncobj_find - lookup and reference a sync object.
@@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
 {
 	int ret;
 
-	*fence = drm_syncobj_fence_get(syncobj);
+	ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
 	if (*fence)
 		return 1;
 
@@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
 	 * have the lock, try one more time just to be sure we don't add a
 	 * callback when a fence has already been set.
 	 */
-	if (syncobj->fence) {
-		*fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
-								 lockdep_is_held(&syncobj->lock)));
-		ret = 1;
+	if (fence) {
+		drm_syncobj_search_fence(syncobj, 0, 0, fence);
+		if (*fence)
+			ret = 1;
 	} else {
 		*fence = NULL;
 		drm_syncobj_add_callback_locked(syncobj, cb, func);
@@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
 	spin_unlock(&syncobj->lock);
 }
 
+static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
+				      struct drm_syncobj_timeline *syncobj_timeline)
+{
+	spin_lock(&syncobj->lock);
+	syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
+	syncobj_timeline->timeline = 0;
+	syncobj_timeline->signal_point = 0;
+	init_waitqueue_head(&syncobj_timeline->wq);
+
+	INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
+	spin_unlock(&syncobj->lock);
+}
+
+static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
+				      struct drm_syncobj_timeline *syncobj_timeline)
+{
+	struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
+
+	spin_lock(&syncobj->lock);
+	list_for_each_entry_safe(signal_pt, tmp,
+				 &syncobj_timeline->signal_pt_list, list) {
+		list_del(&signal_pt->list);
+		dma_fence_put(&signal_pt->base->base);
+		kfree(signal_pt);
+	}
+	spin_unlock(&syncobj->lock);
+}
+
+static struct dma_fence
+*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
+				      uint64_t point)
+{
+	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
+	struct drm_syncobj_signal_pt *signal_pt;
+
+	if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
+	    (point <= timeline->timeline)) {
+		struct drm_syncobj_stub_fence *fence =
+			kzalloc(sizeof(struct drm_syncobj_stub_fence),
+				GFP_KERNEL);
+
+		if (!fence)
+			return NULL;
+		spin_lock_init(&fence->lock);
+		dma_fence_init(&fence->base,
+			       &drm_syncobj_stub_fence_ops,
+			       &fence->lock,
+			       syncobj->syncobj_timeline.timeline_context,
+			       point);
+
+		dma_fence_signal(&fence->base);
+		return &fence->base;
+	}
+
+	list_for_each_entry(signal_pt, &timeline->signal_pt_list, list) {
+		if (point > signal_pt->value)
+			continue;
+		if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
+		    (point != signal_pt->value))
+			continue;
+		return dma_fence_get(&signal_pt->base->base);
+	}
+	return NULL;
+}
+
+static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
+					struct dma_fence *fence,
+					u64 point)
+{
+	struct drm_syncobj_signal_pt *signal_pt =
+		kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
+	struct drm_syncobj_signal_pt *tail_pt;
+	struct dma_fence **fences;
+	int num_fences = 0;
+	int ret = 0, i;
+
+	if (!signal_pt)
+		return -ENOMEM;
+	if (syncobj->syncobj_timeline.signal_point >= point) {
+		DRM_WARN("A later signal is ready!");
+		goto out;
+	}
+	if (!fence)
+		goto out;
+
+	fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
+	if (!fences)
+		goto out;
+	fences[num_fences++] = dma_fence_get(fence);
+	/* timeline syncobj must take this dependency */
+	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
+		spin_lock(&syncobj->lock);
+		if (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
+			tail_pt = list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
+						  struct drm_syncobj_signal_pt, list);
+			fences[num_fences++] = dma_fence_get(&tail_pt->base->base);
+		}
+		spin_unlock(&syncobj->lock);
+	}
+	signal_pt->base = dma_fence_array_create(num_fences, fences,
+						 syncobj->syncobj_timeline.timeline_context,
+						 point, false);
+	if (!signal_pt->base)
+		goto fail;
+
+	spin_lock(&syncobj->lock);
+	signal_pt->value = point;
+	INIT_LIST_HEAD(&signal_pt->list);
+	list_add_tail(&signal_pt->list, &syncobj->syncobj_timeline.signal_pt_list);
+	syncobj->syncobj_timeline.signal_point = point;
+	spin_unlock(&syncobj->lock);
+	wake_up_all(&syncobj->syncobj_timeline.wq);
+
+	return 0;
+fail:
+	for (i = 0; i < num_fences; i++)
+		dma_fence_put(fences[i]);
+	kfree(fences);
+out:
+	kfree(signal_pt);
+	return ret;
+}
+
+static void drm_syncobj_garbage_collection(struct drm_syncobj *syncobj)
+{
+	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
+	struct drm_syncobj_signal_pt *signal_pt, *tmp;
+
+	spin_lock(&syncobj->lock);
+	list_for_each_entry_safe(signal_pt, tmp,
+				 &timeline->signal_pt_list, list) {
+		if (dma_fence_is_signaled(&signal_pt->base->base)) {
+			timeline->timeline = signal_pt->value;
+			list_del(&signal_pt->list);
+			dma_fence_put(&signal_pt->base->base);
+			kfree(signal_pt);
+		} else {
+			/*signal_pt is in order in list, from small to big, so
+			 * the later must not be signal either */
+			break;
+		}
+	}
+
+	spin_unlock(&syncobj->lock);
+}
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @syncobj: Sync object to replace fence in
@@ -176,28 +329,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       u64 point,
 			       struct dma_fence *fence)
 {
-	struct dma_fence *old_fence;
-	struct drm_syncobj_cb *cur, *tmp;
-
-	if (fence)
-		dma_fence_get(fence);
-
-	spin_lock(&syncobj->lock);
-
-	old_fence = rcu_dereference_protected(syncobj->fence,
-					      lockdep_is_held(&syncobj->lock));
-	rcu_assign_pointer(syncobj->fence, fence);
+	drm_syncobj_garbage_collection(syncobj);
+	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
+		if (fence)
+			drm_syncobj_create_signal_pt(syncobj, fence, point);
+	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
+		u64 pt_value;
+
+		if (!fence) {
+			drm_syncobj_timeline_fini(syncobj,
+						  &syncobj->syncobj_timeline);
+			drm_syncobj_timeline_init(syncobj,
+						  &syncobj->syncobj_timeline);
+			return;
+		}
+		pt_value = syncobj->syncobj_timeline.signal_point +
+			DRM_SYNCOBJ_NORMAL_POINT;
+		drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
+	} else {
+		DRM_ERROR("the syncobj type isn't support\n");
+		return;
+	}
+	if (fence) {
+		struct drm_syncobj_cb *cur, *tmp;
 
-	if (fence != old_fence) {
+		spin_lock(&syncobj->lock);
 		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
 			list_del_init(&cur->node);
 			cur->func(syncobj, cur);
 		}
+		spin_unlock(&syncobj->lock);
 	}
-
-	spin_unlock(&syncobj->lock);
-
-	dma_fence_put(old_fence);
 }
 EXPORT_SYMBOL(drm_syncobj_replace_fence);
 
@@ -220,6 +382,25 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 	return 0;
 }
 
+static struct dma_fence *
+drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 flags)
+{
+	struct dma_fence *fence;
+	int ret = 0;
+
+	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+		ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
+					point <= syncobj->syncobj_timeline.signal_point,
+					msecs_to_jiffies(10000)); /* wait 10s */
+		if (ret <= 0)
+			return NULL;
+	}
+	spin_lock(&syncobj->lock);
+	fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
+	spin_unlock(&syncobj->lock);
+	return fence;
+}
+
 /**
  * drm_syncobj_find_fence - lookup and reference the fence in a sync object
  * @file_private: drm file private pointer
@@ -234,20 +415,46 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
  * contains a reference to the fence, which must be released by calling
  * dma_fence_put().
  */
-int drm_syncobj_find_fence(struct drm_file *file_private,
-			   u32 handle, u64 point,
-			   struct dma_fence **fence)
+int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
+			     u64 flags, struct dma_fence **fence)
 {
-	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
 	int ret = 0;
 
 	if (!syncobj)
 		return -ENOENT;
 
-	*fence = drm_syncobj_fence_get(syncobj);
+	drm_syncobj_garbage_collection(syncobj);
+	if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
+		/*NORMAL syncobj always wait on last pt */
+		u64 tail_pt_value = syncobj->syncobj_timeline.signal_point;
+
+		if (tail_pt_value == 0)
+			tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
+		/* NORMAL syncobj doesn't care point value */
+		WARN_ON(point != 0);
+		*fence = drm_syncobj_point_get(syncobj, tail_pt_value,
+							flags);
+	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
+		*fence = drm_syncobj_point_get(syncobj, point,
+							flags);
+	} else {
+		DRM_ERROR("Don't support this type syncobj\n");
+		*fence = NULL;
+	}
 	if (!*fence) {
 		ret = -EINVAL;
 	}
+	return ret;
+}
+EXPORT_SYMBOL(drm_syncobj_search_fence);
+int drm_syncobj_find_fence(struct drm_file *file_private,
+			   u32 handle, u64 point,
+			   struct dma_fence **fence) {
+	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
+
+	int ret = drm_syncobj_search_fence(syncobj, point,
+					DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+					fence);
 	drm_syncobj_put(syncobj);
 	return ret;
 }
@@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
 	struct drm_syncobj *syncobj = container_of(kref,
 						   struct drm_syncobj,
 						   refcount);
-	drm_syncobj_replace_fence(syncobj, 0, NULL);
+	drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
 	kfree(syncobj);
 }
 EXPORT_SYMBOL(drm_syncobj_free);
@@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
 	kref_init(&syncobj->refcount);
 	INIT_LIST_HEAD(&syncobj->cb_list);
 	spin_lock_init(&syncobj->lock);
+	if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
+		syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
+	else
+		syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
+	drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
 
 	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
 		ret = drm_syncobj_assign_null_handle(syncobj);
@@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device *dev, void *data,
 		return -ENODEV;
 
 	/* no valid flags yet */
-	if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
+	if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
+			    DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
 		return -EINVAL;
 
 	return drm_syncobj_create_as_handle(file_private,
@@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
 	struct syncobj_wait_entry *wait =
 		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
 
-	/* This happens inside the syncobj lock */
-	wait->fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
-							      lockdep_is_held(&syncobj->lock)));
+	drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
+
 	wake_up_process(wait->task);
 }
 
@@ -698,7 +910,8 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
 	signaled_count = 0;
 	for (i = 0; i < count; ++i) {
 		entries[i].task = current;
-		entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
+		ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
+					       &entries[i].fence);
 		if (!entries[i].fence) {
 			if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
 				continue;
@@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
 	if (ret < 0)
 		return ret;
 
-	for (i = 0; i < args->count_handles; i++)
-		drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
-
+	for (i = 0; i < args->count_handles; i++) {
+		if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
+			DRM_ERROR("timeline syncobj cannot reset!\n");
+			ret = -EINVAL;
+			goto out;
+		}
+		drm_syncobj_timeline_fini(syncobjs[i],
+					  &syncobjs[i]->syncobj_timeline);
+		drm_syncobj_timeline_init(syncobjs[i],
+					  &syncobjs[i]->syncobj_timeline);
+	}
+out:
 	drm_syncobj_array_free(syncobjs, args->count_handles);
 
-	return 0;
+	return ret;
 }
 
 int
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 0a8d2d64f380..579e91a5858b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
 		if (!(flags & I915_EXEC_FENCE_WAIT))
 			continue;
 
-		fence = drm_syncobj_fence_get(syncobj);
+		drm_syncobj_search_fence(syncobj, 0,
+					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
+					 &fence);
 		if (!fence)
 			return -EINVAL;
 
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 425432b85a87..9535521e6623 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -30,6 +30,25 @@
 
 struct drm_syncobj_cb;
 
+enum drm_syncobj_type {
+	DRM_SYNCOBJ_TYPE_NORMAL,
+	DRM_SYNCOBJ_TYPE_TIMELINE
+};
+
+struct drm_syncobj_timeline {
+	wait_queue_head_t	wq;
+	u64 timeline_context;
+	/**
+	 * @timeline: syncobj timeline
+	 */
+	u64 timeline;
+	u64 signal_point;
+
+
+	struct rb_root wait_pt_tree;
+	struct list_head signal_pt_list;
+};
+
 /**
  * struct drm_syncobj - sync object.
  *
@@ -41,19 +60,19 @@ struct drm_syncobj {
 	 */
 	struct kref refcount;
 	/**
-	 * @fence:
-	 * NULL or a pointer to the fence bound to this object.
-	 *
-	 * This field should not be used directly. Use drm_syncobj_fence_get()
-	 * and drm_syncobj_replace_fence() instead.
+	 * @type: indicate syncobj type
 	 */
-	struct dma_fence __rcu *fence;
+	enum drm_syncobj_type type;
 	/**
-	 * @cb_list: List of callbacks to call when the &fence gets replaced.
+	 * @syncobj_timeline: timeline
 	 */
+	struct drm_syncobj_timeline syncobj_timeline;
+	/**
+         * @cb_list: List of callbacks to call when the &fence gets replaced.
+         */
 	struct list_head cb_list;
 	/**
-	 * @lock: Protects &cb_list and write-locks &fence.
+	 * @lock: Protects syncobj list and write-locks &fence.
 	 */
 	spinlock_t lock;
 	/**
@@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
 /**
  * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
  * @node: used by drm_syncob_add_callback to append this struct to
- *	  &drm_syncobj.cb_list
+ *       &drm_syncobj.cb_list
  * @func: drm_syncobj_func_t to call
  *
  * This struct will be initialized by drm_syncobj_add_callback, additional
@@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
 	kref_put(&obj->refcount, drm_syncobj_free);
 }
 
-/**
- * drm_syncobj_fence_get - get a reference to a fence in a sync object
- * @syncobj: sync object.
- *
- * This acquires additional reference to &drm_syncobj.fence contained in @obj,
- * if not NULL. It is illegal to call this without already holding a reference.
- * No locks required.
- *
- * Returns:
- * Either the fence of @obj or NULL if there's none.
- */
-static inline struct dma_fence *
-drm_syncobj_fence_get(struct drm_syncobj *syncobj)
-{
-	struct dma_fence *fence;
-
-	rcu_read_lock();
-	fence = dma_fence_get_rcu_safe(&syncobj->fence);
-	rcu_read_unlock();
-
-	return fence;
-}
-
 struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64 point,
@@ -142,5 +138,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
 int drm_syncobj_get_handle(struct drm_file *file_private,
 			   struct drm_syncobj *syncobj, u32 *handle);
 int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
+int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
+			     u64 flags, struct dma_fence **fence);
 
 #endif
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 300f336633f2..cebdb2541eb7 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -717,6 +717,7 @@ struct drm_prime_handle {
 struct drm_syncobj_create {
 	__u32 handle;
 #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
+#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
 	__u32 flags;
 };
 
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
  2018-09-14 10:37 [PATCH] [RFC]drm: add syncobj timeline support v5 Chunming Zhou
@ 2018-09-14 10:49 ` Christian König
       [not found]   ` <9a07a371-8de1-2ebf-66e7-23b93c1bc1c3-5C7GfCeVMHo@public.gmane.org>
  2018-09-17  8:37 ` Christian König
  1 sibling, 1 reply; 12+ messages in thread
From: Christian König @ 2018-09-14 10:49 UTC (permalink / raw)
  To: Chunming Zhou, dri-devel; +Cc: Dave Airlie, Daniel Rakos, amd-gfx

Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
> This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side:
> This extension introduces a new type of syncobj that has an integer payload
> identifying a point in a timeline. Such timeline syncobjs support the
> following operations:
>     * CPU query - A host operation that allows querying the payload of the
>       timeline syncobj.
>     * CPU wait - A host operation that allows a blocking wait for a
>       timeline syncobj to reach a specified value.
>     * Device wait - A device operation that allows waiting for a
>       timeline syncobj to reach a specified value.
>     * Device signal - A device operation that allows advancing the
>       timeline syncobj to a specified value.
>
> Since it's a timeline, that means the front time point(PT) always is signaled before the late PT.
> a. signal PT design:
> Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled,
> the timeline will increase to value of PT[N].
> b. wait PT design:
> Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare
> wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be
> signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline,
> so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to
> perform that.
>
> v2:
> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
> 2. move unexposed denitions to .c file. (Daniel Vetter)
> 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian)
> 4. split up the change to drm_syncobj_replace_fence() in a separate patch.
> 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian)
> 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
>
> v3:
> 1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
>      a. normal syncobj signal op will create a signal PT to tail of signal pt list.
>      b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT.
> 2. many bug fix and clean up
> 3. stub fence moving is moved to other patch.
>
> v4:
> 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
> 2. fix syncobj lifecycle. (Christian)
> 3. only enable_signaling when there is wait_pt. (Christian)
> 4. fix timeline path issues.
> 5. write a timeline test in libdrm
>
> v5: (Christian)
> 1. semaphore is called syncobj in kernel side.
> 2. don't need 'timeline' characters in some function name.
> 3. keep syncobj cb
>
> normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
> timeline syncobj is tested by ./amdgpu_test -s 9
>
> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
> Cc: Christian Konig <christian.koenig@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Daniel Rakos <Daniel.Rakos@amd.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>

At least on first glance that looks like it should work, going to do a 
detailed review on Monday.

Christian.

> ---
>   drivers/gpu/drm/drm_syncobj.c              | 294 ++++++++++++++++++---
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
>   include/drm/drm_syncobj.h                  |  62 +++--
>   include/uapi/drm/drm.h                     |   1 +
>   4 files changed, 292 insertions(+), 69 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index e9ce623d049e..e78d076f2703 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -56,6 +56,9 @@either
>   #include "drm_internal.h"
>   #include <drm/drm_syncobj.h>
>   
> +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
> +#define DRM_SYNCOBJ_NORMAL_POINT 1
> +
>   struct drm_syncobj_stub_fence {
>   	struct dma_fence base;
>   	spinlock_t lock;
> @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
>   	.release = drm_syncobj_stub_fence_release,
>   };
>   
> +struct drm_syncobj_signal_pt {
> +	struct dma_fence_array *base;
> +	u64    value;
> +	struct list_head list;
> +};
>   
>   /**
>    * drm_syncobj_find - lookup and reference a sync object.
> @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>   {
>   	int ret;
>   
> -	*fence = drm_syncobj_fence_get(syncobj);
> +	ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
>   	if (*fence)
>   		return 1;
>   
> @@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>   	 * have the lock, try one more time just to be sure we don't add a
>   	 * callback when a fence has already been set.
>   	 */
> -	if (syncobj->fence) {
> -		*fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
> -								 lockdep_is_held(&syncobj->lock)));
> -		ret = 1;
> +	if (fence) {
> +		drm_syncobj_search_fence(syncobj, 0, 0, fence);
> +		if (*fence)
> +			ret = 1;
>   	} else {
>   		*fence = NULL;
>   		drm_syncobj_add_callback_locked(syncobj, cb, func);
> @@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
>   	spin_unlock(&syncobj->lock);
>   }
>   
> +static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
> +				      struct drm_syncobj_timeline *syncobj_timeline)
> +{
> +	spin_lock(&syncobj->lock);
> +	syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
> +	syncobj_timeline->timeline = 0;
> +	syncobj_timeline->signal_point = 0;
> +	init_waitqueue_head(&syncobj_timeline->wq);
> +
> +	INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
> +	spin_unlock(&syncobj->lock);
> +}
> +
> +static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
> +				      struct drm_syncobj_timeline *syncobj_timeline)
> +{
> +	struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
> +
> +	spin_lock(&syncobj->lock);
> +	list_for_each_entry_safe(signal_pt, tmp,
> +				 &syncobj_timeline->signal_pt_list, list) {
> +		list_del(&signal_pt->list);
> +		dma_fence_put(&signal_pt->base->base);
> +		kfree(signal_pt);
> +	}
> +	spin_unlock(&syncobj->lock);
> +}
> +
> +static struct dma_fence
> +*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
> +				      uint64_t point)
> +{
> +	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
> +	struct drm_syncobj_signal_pt *signal_pt;
> +
> +	if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
> +	    (point <= timeline->timeline)) {
> +		struct drm_syncobj_stub_fence *fence =
> +			kzalloc(sizeof(struct drm_syncobj_stub_fence),
> +				GFP_KERNEL);
> +
> +		if (!fence)
> +			return NULL;
> +		spin_lock_init(&fence->lock);
> +		dma_fence_init(&fence->base,
> +			       &drm_syncobj_stub_fence_ops,
> +			       &fence->lock,
> +			       syncobj->syncobj_timeline.timeline_context,
> +			       point);
> +
> +		dma_fence_signal(&fence->base);
> +		return &fence->base;
> +	}
> +
> +	list_for_each_entry(signal_pt, &timeline->signal_pt_list, list) {
> +		if (point > signal_pt->value)
> +			continue;
> +		if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
> +		    (point != signal_pt->value))
> +			continue;
> +		return dma_fence_get(&signal_pt->base->base);
> +	}
> +	return NULL;
> +}
> +
> +static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
> +					struct dma_fence *fence,
> +					u64 point)
> +{
> +	struct drm_syncobj_signal_pt *signal_pt =
> +		kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
> +	struct drm_syncobj_signal_pt *tail_pt;
> +	struct dma_fence **fences;
> +	int num_fences = 0;
> +	int ret = 0, i;
> +
> +	if (!signal_pt)
> +		return -ENOMEM;
> +	if (syncobj->syncobj_timeline.signal_point >= point) {
> +		DRM_WARN("A later signal is ready!");
> +		goto out;
> +	}
> +	if (!fence)
> +		goto out;
> +
> +	fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
> +	if (!fences)
> +		goto out;
> +	fences[num_fences++] = dma_fence_get(fence);
> +	/* timeline syncobj must take this dependency */
> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> +		spin_lock(&syncobj->lock);
> +		if (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
> +			tail_pt = list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
> +						  struct drm_syncobj_signal_pt, list);
> +			fences[num_fences++] = dma_fence_get(&tail_pt->base->base);
> +		}
> +		spin_unlock(&syncobj->lock);
> +	}
> +	signal_pt->base = dma_fence_array_create(num_fences, fences,
> +						 syncobj->syncobj_timeline.timeline_context,
> +						 point, false);
> +	if (!signal_pt->base)
> +		goto fail;
> +
> +	spin_lock(&syncobj->lock);
> +	signal_pt->value = point;
> +	INIT_LIST_HEAD(&signal_pt->list);
> +	list_add_tail(&signal_pt->list, &syncobj->syncobj_timeline.signal_pt_list);
> +	syncobj->syncobj_timeline.signal_point = point;
> +	spin_unlock(&syncobj->lock);
> +	wake_up_all(&syncobj->syncobj_timeline.wq);
> +
> +	return 0;
> +fail:
> +	for (i = 0; i < num_fences; i++)
> +		dma_fence_put(fences[i]);
> +	kfree(fences);
> +out:
> +	kfree(signal_pt);
> +	return ret;
> +}
> +
> +static void drm_syncobj_garbage_collection(struct drm_syncobj *syncobj)
> +{
> +	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
> +	struct drm_syncobj_signal_pt *signal_pt, *tmp;
> +
> +	spin_lock(&syncobj->lock);
> +	list_for_each_entry_safe(signal_pt, tmp,
> +				 &timeline->signal_pt_list, list) {
> +		if (dma_fence_is_signaled(&signal_pt->base->base)) {
> +			timeline->timeline = signal_pt->value;
> +			list_del(&signal_pt->list);
> +			dma_fence_put(&signal_pt->base->base);
> +			kfree(signal_pt);
> +		} else {
> +			/*signal_pt is in order in list, from small to big, so
> +			 * the later must not be signal either */
> +			break;
> +		}
> +	}
> +
> +	spin_unlock(&syncobj->lock);
> +}
>   /**
>    * drm_syncobj_replace_fence - replace fence in a sync object.
>    * @syncobj: Sync object to replace fence in
> @@ -176,28 +329,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
>   			       u64 point,
>   			       struct dma_fence *fence)
>   {
> -	struct dma_fence *old_fence;
> -	struct drm_syncobj_cb *cur, *tmp;
> -
> -	if (fence)
> -		dma_fence_get(fence);
> -
> -	spin_lock(&syncobj->lock);
> -
> -	old_fence = rcu_dereference_protected(syncobj->fence,
> -					      lockdep_is_held(&syncobj->lock));
> -	rcu_assign_pointer(syncobj->fence, fence);
> +	drm_syncobj_garbage_collection(syncobj);
> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> +		if (fence)
> +			drm_syncobj_create_signal_pt(syncobj, fence, point);
> +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
> +		u64 pt_value;
> +
> +		if (!fence) {
> +			drm_syncobj_timeline_fini(syncobj,
> +						  &syncobj->syncobj_timeline);
> +			drm_syncobj_timeline_init(syncobj,
> +						  &syncobj->syncobj_timeline);
> +			return;
> +		}
> +		pt_value = syncobj->syncobj_timeline.signal_point +
> +			DRM_SYNCOBJ_NORMAL_POINT;
> +		drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
> +	} else {
> +		DRM_ERROR("the syncobj type isn't support\n");
> +		return;
> +	}
> +	if (fence) {
> +		struct drm_syncobj_cb *cur, *tmp;
>   
> -	if (fence != old_fence) {
> +		spin_lock(&syncobj->lock);
>   		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
>   			list_del_init(&cur->node);
>   			cur->func(syncobj, cur);
>   		}
> +		spin_unlock(&syncobj->lock);
>   	}
> -
> -	spin_unlock(&syncobj->lock);
> -
> -	dma_fence_put(old_fence);
>   }
>   EXPORT_SYMBOL(drm_syncobj_replace_fence);
>   
> @@ -220,6 +382,25 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>   	return 0;
>   }
>   
> +static struct dma_fence *
> +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 flags)
> +{
> +	struct dma_fence *fence;
> +	int ret = 0;
> +
> +	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
> +		ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
> +					point <= syncobj->syncobj_timeline.signal_point,
> +					msecs_to_jiffies(10000)); /* wait 10s */
> +		if (ret <= 0)
> +			return NULL;
> +	}
> +	spin_lock(&syncobj->lock);
> +	fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
> +	spin_unlock(&syncobj->lock);
> +	return fence;
> +}
> +
>   /**
>    * drm_syncobj_find_fence - lookup and reference the fence in a sync object
>    * @file_private: drm file private pointer
> @@ -234,20 +415,46 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>    * contains a reference to the fence, which must be released by calling
>    * dma_fence_put().
>    */
> -int drm_syncobj_find_fence(struct drm_file *file_private,
> -			   u32 handle, u64 point,
> -			   struct dma_fence **fence)
> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
> +			     u64 flags, struct dma_fence **fence)
>   {
> -	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
>   	int ret = 0;
>   
>   	if (!syncobj)
>   		return -ENOENT;
>   
> -	*fence = drm_syncobj_fence_get(syncobj);
> +	drm_syncobj_garbage_collection(syncobj);
> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
> +		/*NORMAL syncobj always wait on last pt */
> +		u64 tail_pt_value = syncobj->syncobj_timeline.signal_point;
> +
> +		if (tail_pt_value == 0)
> +			tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
> +		/* NORMAL syncobj doesn't care point value */
> +		WARN_ON(point != 0);
> +		*fence = drm_syncobj_point_get(syncobj, tail_pt_value,
> +							flags);
> +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> +		*fence = drm_syncobj_point_get(syncobj, point,
> +							flags);
> +	} else {
> +		DRM_ERROR("Don't support this type syncobj\n");
> +		*fence = NULL;
> +	}
>   	if (!*fence) {
>   		ret = -EINVAL;
>   	}
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_syncobj_search_fence);
> +int drm_syncobj_find_fence(struct drm_file *file_private,
> +			   u32 handle, u64 point,
> +			   struct dma_fence **fence) {
> +	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
> +
> +	int ret = drm_syncobj_search_fence(syncobj, point,
> +					DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
> +					fence);
>   	drm_syncobj_put(syncobj);
>   	return ret;
>   }
> @@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
>   	struct drm_syncobj *syncobj = container_of(kref,
>   						   struct drm_syncobj,
>   						   refcount);
> -	drm_syncobj_replace_fence(syncobj, 0, NULL);
> +	drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
>   	kfree(syncobj);
>   }
>   EXPORT_SYMBOL(drm_syncobj_free);
> @@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
>   	kref_init(&syncobj->refcount);
>   	INIT_LIST_HEAD(&syncobj->cb_list);
>   	spin_lock_init(&syncobj->lock);
> +	if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
> +		syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
> +	else
> +		syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
> +	drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
>   
>   	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
>   		ret = drm_syncobj_assign_null_handle(syncobj);
> @@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device *dev, void *data,
>   		return -ENODEV;
>   
>   	/* no valid flags yet */
> -	if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
> +	if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
> +			    DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
>   		return -EINVAL;
>   
>   	return drm_syncobj_create_as_handle(file_private,
> @@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
>   	struct syncobj_wait_entry *wait =
>   		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>   
> -	/* This happens inside the syncobj lock */
> -	wait->fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
> -							      lockdep_is_held(&syncobj->lock)));
> +	drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
> +
>   	wake_up_process(wait->task);
>   }
>   
> @@ -698,7 +910,8 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>   	signaled_count = 0;
>   	for (i = 0; i < count; ++i) {
>   		entries[i].task = current;
> -		entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
> +		ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
> +					       &entries[i].fence);
>   		if (!entries[i].fence) {
>   			if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>   				continue;
> @@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
>   	if (ret < 0)
>   		return ret;
>   
> -	for (i = 0; i < args->count_handles; i++)
> -		drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
> -
> +	for (i = 0; i < args->count_handles; i++) {
> +		if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> +			DRM_ERROR("timeline syncobj cannot reset!\n");
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +		drm_syncobj_timeline_fini(syncobjs[i],
> +					  &syncobjs[i]->syncobj_timeline);
> +		drm_syncobj_timeline_init(syncobjs[i],
> +					  &syncobjs[i]->syncobj_timeline);
> +	}
> +out:
>   	drm_syncobj_array_free(syncobjs, args->count_handles);
>   
> -	return 0;
> +	return ret;
>   }
>   
>   int
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 0a8d2d64f380..579e91a5858b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
>   		if (!(flags & I915_EXEC_FENCE_WAIT))
>   			continue;
>   
> -		fence = drm_syncobj_fence_get(syncobj);
> +		drm_syncobj_search_fence(syncobj, 0,
> +					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
> +					 &fence);
>   		if (!fence)
>   			return -EINVAL;
>   
> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
> index 425432b85a87..9535521e6623 100644
> --- a/include/drm/drm_syncobj.h
> +++ b/include/drm/drm_syncobj.h
> @@ -30,6 +30,25 @@
>   
>   struct drm_syncobj_cb;
>   
> +enum drm_syncobj_type {
> +	DRM_SYNCOBJ_TYPE_NORMAL,
> +	DRM_SYNCOBJ_TYPE_TIMELINE
> +};
> +
> +struct drm_syncobj_timeline {
> +	wait_queue_head_t	wq;
> +	u64 timeline_context;
> +	/**
> +	 * @timeline: syncobj timeline
> +	 */
> +	u64 timeline;
> +	u64 signal_point;
> +
> +
> +	struct rb_root wait_pt_tree;
> +	struct list_head signal_pt_list;
> +};
> +
>   /**
>    * struct drm_syncobj - sync object.
>    *
> @@ -41,19 +60,19 @@ struct drm_syncobj {
>   	 */
>   	struct kref refcount;
>   	/**
> -	 * @fence:
> -	 * NULL or a pointer to the fence bound to this object.
> -	 *
> -	 * This field should not be used directly. Use drm_syncobj_fence_get()
> -	 * and drm_syncobj_replace_fence() instead.
> +	 * @type: indicate syncobj type
>   	 */
> -	struct dma_fence __rcu *fence;
> +	enum drm_syncobj_type type;
>   	/**
> -	 * @cb_list: List of callbacks to call when the &fence gets replaced.
> +	 * @syncobj_timeline: timeline
>   	 */
> +	struct drm_syncobj_timeline syncobj_timeline;
> +	/**
> +         * @cb_list: List of callbacks to call when the &fence gets replaced.
> +         */
>   	struct list_head cb_list;
>   	/**
> -	 * @lock: Protects &cb_list and write-locks &fence.
> +	 * @lock: Protects syncobj list and write-locks &fence.
>   	 */
>   	spinlock_t lock;
>   	/**
> @@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
>   /**
>    * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
>    * @node: used by drm_syncob_add_callback to append this struct to
> - *	  &drm_syncobj.cb_list
> + *       &drm_syncobj.cb_list
>    * @func: drm_syncobj_func_t to call
>    *
>    * This struct will be initialized by drm_syncobj_add_callback, additional
> @@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
>   	kref_put(&obj->refcount, drm_syncobj_free);
>   }
>   
> -/**
> - * drm_syncobj_fence_get - get a reference to a fence in a sync object
> - * @syncobj: sync object.
> - *
> - * This acquires additional reference to &drm_syncobj.fence contained in @obj,
> - * if not NULL. It is illegal to call this without already holding a reference.
> - * No locks required.
> - *
> - * Returns:
> - * Either the fence of @obj or NULL if there's none.
> - */
> -static inline struct dma_fence *
> -drm_syncobj_fence_get(struct drm_syncobj *syncobj)
> -{
> -	struct dma_fence *fence;
> -
> -	rcu_read_lock();
> -	fence = dma_fence_get_rcu_safe(&syncobj->fence);
> -	rcu_read_unlock();
> -
> -	return fence;
> -}
> -
>   struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
>   				     u32 handle);
>   void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64 point,
> @@ -142,5 +138,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
>   int drm_syncobj_get_handle(struct drm_file *file_private,
>   			   struct drm_syncobj *syncobj, u32 *handle);
>   int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
> +			     u64 flags, struct dma_fence **fence);
>   
>   #endif
> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
> index 300f336633f2..cebdb2541eb7 100644
> --- a/include/uapi/drm/drm.h
> +++ b/include/uapi/drm/drm.h
> @@ -717,6 +717,7 @@ struct drm_prime_handle {
>   struct drm_syncobj_create {
>   	__u32 handle;
>   #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
> +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
>   	__u32 flags;
>   };
>   

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
       [not found]   ` <9a07a371-8de1-2ebf-66e7-23b93c1bc1c3-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-14 16:10     ` Daniel Vetter
       [not found]       ` <20180914161059.GR11082-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2018-09-14 16:10 UTC (permalink / raw)
  To: Christian König
  Cc: Chunming Zhou, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Daniel Rakos, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	Daniel Vetter, Dave Airlie

On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
> Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
> > This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side:
> > This extension introduces a new type of syncobj that has an integer payload
> > identifying a point in a timeline. Such timeline syncobjs support the
> > following operations:
> >     * CPU query - A host operation that allows querying the payload of the
> >       timeline syncobj.
> >     * CPU wait - A host operation that allows a blocking wait for a
> >       timeline syncobj to reach a specified value.
> >     * Device wait - A device operation that allows waiting for a
> >       timeline syncobj to reach a specified value.
> >     * Device signal - A device operation that allows advancing the
> >       timeline syncobj to a specified value.
> > 
> > Since it's a timeline, that means the front time point(PT) always is signaled before the late PT.
> > a. signal PT design:
> > Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled,
> > the timeline will increase to value of PT[N].
> > b. wait PT design:
> > Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare
> > wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be
> > signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline,
> > so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to
> > perform that.
> > 
> > v2:
> > 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
> > 2. move unexposed denitions to .c file. (Daniel Vetter)
> > 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian)
> > 4. split up the change to drm_syncobj_replace_fence() in a separate patch.
> > 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian)
> > 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
> > 
> > v3:
> > 1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
> >      a. normal syncobj signal op will create a signal PT to tail of signal pt list.
> >      b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT.
> > 2. many bug fix and clean up
> > 3. stub fence moving is moved to other patch.
> > 
> > v4:
> > 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
> > 2. fix syncobj lifecycle. (Christian)
> > 3. only enable_signaling when there is wait_pt. (Christian)
> > 4. fix timeline path issues.
> > 5. write a timeline test in libdrm
> > 
> > v5: (Christian)
> > 1. semaphore is called syncobj in kernel side.
> > 2. don't need 'timeline' characters in some function name.
> > 3. keep syncobj cb
> > 
> > normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
> > timeline syncobj is tested by ./amdgpu_test -s 9
> > 
> > Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
> > Cc: Christian Konig <christian.koenig@amd.com>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Daniel Rakos <Daniel.Rakos@amd.com>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> 
> At least on first glance that looks like it should work, going to do a
> detailed review on Monday.

Just for my understanding, it's all condensed down to 1 patch now? I kinda
didn't follow the detailed discussion last few days at all :-/

Also, is there a testcase, igt highly preferred (because then we'll run it
in our intel-gfx CI, and a bunch of people outside of intel have already
discovered that and are using it).

Thanks, Daniel

> 
> Christian.
> 
> > ---
> >   drivers/gpu/drm/drm_syncobj.c              | 294 ++++++++++++++++++---
> >   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
> >   include/drm/drm_syncobj.h                  |  62 +++--
> >   include/uapi/drm/drm.h                     |   1 +
> >   4 files changed, 292 insertions(+), 69 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> > index e9ce623d049e..e78d076f2703 100644
> > --- a/drivers/gpu/drm/drm_syncobj.c
> > +++ b/drivers/gpu/drm/drm_syncobj.c
> > @@ -56,6 +56,9 @@either
> >   #include "drm_internal.h"
> >   #include <drm/drm_syncobj.h>
> > +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
> > +#define DRM_SYNCOBJ_NORMAL_POINT 1
> > +
> >   struct drm_syncobj_stub_fence {
> >   	struct dma_fence base;
> >   	spinlock_t lock;
> > @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
> >   	.release = drm_syncobj_stub_fence_release,
> >   };
> > +struct drm_syncobj_signal_pt {
> > +	struct dma_fence_array *base;
> > +	u64    value;
> > +	struct list_head list;
> > +};
> >   /**
> >    * drm_syncobj_find - lookup and reference a sync object.
> > @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
> >   {
> >   	int ret;
> > -	*fence = drm_syncobj_fence_get(syncobj);
> > +	ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
> >   	if (*fence)
> >   		return 1;
> > @@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
> >   	 * have the lock, try one more time just to be sure we don't add a
> >   	 * callback when a fence has already been set.
> >   	 */
> > -	if (syncobj->fence) {
> > -		*fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
> > -								 lockdep_is_held(&syncobj->lock)));
> > -		ret = 1;
> > +	if (fence) {
> > +		drm_syncobj_search_fence(syncobj, 0, 0, fence);
> > +		if (*fence)
> > +			ret = 1;
> >   	} else {
> >   		*fence = NULL;
> >   		drm_syncobj_add_callback_locked(syncobj, cb, func);
> > @@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
> >   	spin_unlock(&syncobj->lock);
> >   }
> > +static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
> > +				      struct drm_syncobj_timeline *syncobj_timeline)
> > +{
> > +	spin_lock(&syncobj->lock);
> > +	syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
> > +	syncobj_timeline->timeline = 0;
> > +	syncobj_timeline->signal_point = 0;
> > +	init_waitqueue_head(&syncobj_timeline->wq);
> > +
> > +	INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
> > +	spin_unlock(&syncobj->lock);
> > +}
> > +
> > +static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
> > +				      struct drm_syncobj_timeline *syncobj_timeline)
> > +{
> > +	struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
> > +
> > +	spin_lock(&syncobj->lock);
> > +	list_for_each_entry_safe(signal_pt, tmp,
> > +				 &syncobj_timeline->signal_pt_list, list) {
> > +		list_del(&signal_pt->list);
> > +		dma_fence_put(&signal_pt->base->base);
> > +		kfree(signal_pt);
> > +	}
> > +	spin_unlock(&syncobj->lock);
> > +}
> > +
> > +static struct dma_fence
> > +*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
> > +				      uint64_t point)
> > +{
> > +	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
> > +	struct drm_syncobj_signal_pt *signal_pt;
> > +
> > +	if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
> > +	    (point <= timeline->timeline)) {
> > +		struct drm_syncobj_stub_fence *fence =
> > +			kzalloc(sizeof(struct drm_syncobj_stub_fence),
> > +				GFP_KERNEL);
> > +
> > +		if (!fence)
> > +			return NULL;
> > +		spin_lock_init(&fence->lock);
> > +		dma_fence_init(&fence->base,
> > +			       &drm_syncobj_stub_fence_ops,
> > +			       &fence->lock,
> > +			       syncobj->syncobj_timeline.timeline_context,
> > +			       point);
> > +
> > +		dma_fence_signal(&fence->base);
> > +		return &fence->base;
> > +	}
> > +
> > +	list_for_each_entry(signal_pt, &timeline->signal_pt_list, list) {
> > +		if (point > signal_pt->value)
> > +			continue;
> > +		if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
> > +		    (point != signal_pt->value))
> > +			continue;
> > +		return dma_fence_get(&signal_pt->base->base);
> > +	}
> > +	return NULL;
> > +}
> > +
> > +static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
> > +					struct dma_fence *fence,
> > +					u64 point)
> > +{
> > +	struct drm_syncobj_signal_pt *signal_pt =
> > +		kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
> > +	struct drm_syncobj_signal_pt *tail_pt;
> > +	struct dma_fence **fences;
> > +	int num_fences = 0;
> > +	int ret = 0, i;
> > +
> > +	if (!signal_pt)
> > +		return -ENOMEM;
> > +	if (syncobj->syncobj_timeline.signal_point >= point) {
> > +		DRM_WARN("A later signal is ready!");
> > +		goto out;
> > +	}
> > +	if (!fence)
> > +		goto out;
> > +
> > +	fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
> > +	if (!fences)
> > +		goto out;
> > +	fences[num_fences++] = dma_fence_get(fence);
> > +	/* timeline syncobj must take this dependency */
> > +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> > +		spin_lock(&syncobj->lock);
> > +		if (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
> > +			tail_pt = list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
> > +						  struct drm_syncobj_signal_pt, list);
> > +			fences[num_fences++] = dma_fence_get(&tail_pt->base->base);
> > +		}
> > +		spin_unlock(&syncobj->lock);
> > +	}
> > +	signal_pt->base = dma_fence_array_create(num_fences, fences,
> > +						 syncobj->syncobj_timeline.timeline_context,
> > +						 point, false);
> > +	if (!signal_pt->base)
> > +		goto fail;
> > +
> > +	spin_lock(&syncobj->lock);
> > +	signal_pt->value = point;
> > +	INIT_LIST_HEAD(&signal_pt->list);
> > +	list_add_tail(&signal_pt->list, &syncobj->syncobj_timeline.signal_pt_list);
> > +	syncobj->syncobj_timeline.signal_point = point;
> > +	spin_unlock(&syncobj->lock);
> > +	wake_up_all(&syncobj->syncobj_timeline.wq);
> > +
> > +	return 0;
> > +fail:
> > +	for (i = 0; i < num_fences; i++)
> > +		dma_fence_put(fences[i]);
> > +	kfree(fences);
> > +out:
> > +	kfree(signal_pt);
> > +	return ret;
> > +}
> > +
> > +static void drm_syncobj_garbage_collection(struct drm_syncobj *syncobj)
> > +{
> > +	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
> > +	struct drm_syncobj_signal_pt *signal_pt, *tmp;
> > +
> > +	spin_lock(&syncobj->lock);
> > +	list_for_each_entry_safe(signal_pt, tmp,
> > +				 &timeline->signal_pt_list, list) {
> > +		if (dma_fence_is_signaled(&signal_pt->base->base)) {
> > +			timeline->timeline = signal_pt->value;
> > +			list_del(&signal_pt->list);
> > +			dma_fence_put(&signal_pt->base->base);
> > +			kfree(signal_pt);
> > +		} else {
> > +			/*signal_pt is in order in list, from small to big, so
> > +			 * the later must not be signal either */
> > +			break;
> > +		}
> > +	}
> > +
> > +	spin_unlock(&syncobj->lock);
> > +}
> >   /**
> >    * drm_syncobj_replace_fence - replace fence in a sync object.
> >    * @syncobj: Sync object to replace fence in
> > @@ -176,28 +329,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
> >   			       u64 point,
> >   			       struct dma_fence *fence)
> >   {
> > -	struct dma_fence *old_fence;
> > -	struct drm_syncobj_cb *cur, *tmp;
> > -
> > -	if (fence)
> > -		dma_fence_get(fence);
> > -
> > -	spin_lock(&syncobj->lock);
> > -
> > -	old_fence = rcu_dereference_protected(syncobj->fence,
> > -					      lockdep_is_held(&syncobj->lock));
> > -	rcu_assign_pointer(syncobj->fence, fence);
> > +	drm_syncobj_garbage_collection(syncobj);
> > +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> > +		if (fence)
> > +			drm_syncobj_create_signal_pt(syncobj, fence, point);
> > +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
> > +		u64 pt_value;
> > +
> > +		if (!fence) {
> > +			drm_syncobj_timeline_fini(syncobj,
> > +						  &syncobj->syncobj_timeline);
> > +			drm_syncobj_timeline_init(syncobj,
> > +						  &syncobj->syncobj_timeline);
> > +			return;
> > +		}
> > +		pt_value = syncobj->syncobj_timeline.signal_point +
> > +			DRM_SYNCOBJ_NORMAL_POINT;
> > +		drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
> > +	} else {
> > +		DRM_ERROR("the syncobj type isn't support\n");
> > +		return;
> > +	}
> > +	if (fence) {
> > +		struct drm_syncobj_cb *cur, *tmp;
> > -	if (fence != old_fence) {
> > +		spin_lock(&syncobj->lock);
> >   		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
> >   			list_del_init(&cur->node);
> >   			cur->func(syncobj, cur);
> >   		}
> > +		spin_unlock(&syncobj->lock);
> >   	}
> > -
> > -	spin_unlock(&syncobj->lock);
> > -
> > -	dma_fence_put(old_fence);
> >   }
> >   EXPORT_SYMBOL(drm_syncobj_replace_fence);
> > @@ -220,6 +382,25 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
> >   	return 0;
> >   }
> > +static struct dma_fence *
> > +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 flags)
> > +{
> > +	struct dma_fence *fence;
> > +	int ret = 0;
> > +
> > +	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
> > +		ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
> > +					point <= syncobj->syncobj_timeline.signal_point,
> > +					msecs_to_jiffies(10000)); /* wait 10s */
> > +		if (ret <= 0)
> > +			return NULL;
> > +	}
> > +	spin_lock(&syncobj->lock);
> > +	fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
> > +	spin_unlock(&syncobj->lock);
> > +	return fence;
> > +}
> > +
> >   /**
> >    * drm_syncobj_find_fence - lookup and reference the fence in a sync object
> >    * @file_private: drm file private pointer
> > @@ -234,20 +415,46 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
> >    * contains a reference to the fence, which must be released by calling
> >    * dma_fence_put().
> >    */
> > -int drm_syncobj_find_fence(struct drm_file *file_private,
> > -			   u32 handle, u64 point,
> > -			   struct dma_fence **fence)
> > +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
> > +			     u64 flags, struct dma_fence **fence)
> >   {
> > -	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
> >   	int ret = 0;
> >   	if (!syncobj)
> >   		return -ENOENT;
> > -	*fence = drm_syncobj_fence_get(syncobj);
> > +	drm_syncobj_garbage_collection(syncobj);
> > +	if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
> > +		/*NORMAL syncobj always wait on last pt */
> > +		u64 tail_pt_value = syncobj->syncobj_timeline.signal_point;
> > +
> > +		if (tail_pt_value == 0)
> > +			tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
> > +		/* NORMAL syncobj doesn't care point value */
> > +		WARN_ON(point != 0);
> > +		*fence = drm_syncobj_point_get(syncobj, tail_pt_value,
> > +							flags);
> > +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> > +		*fence = drm_syncobj_point_get(syncobj, point,
> > +							flags);
> > +	} else {
> > +		DRM_ERROR("Don't support this type syncobj\n");
> > +		*fence = NULL;
> > +	}
> >   	if (!*fence) {
> >   		ret = -EINVAL;
> >   	}
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(drm_syncobj_search_fence);
> > +int drm_syncobj_find_fence(struct drm_file *file_private,
> > +			   u32 handle, u64 point,
> > +			   struct dma_fence **fence) {
> > +	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
> > +
> > +	int ret = drm_syncobj_search_fence(syncobj, point,
> > +					DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
> > +					fence);
> >   	drm_syncobj_put(syncobj);
> >   	return ret;
> >   }
> > @@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
> >   	struct drm_syncobj *syncobj = container_of(kref,
> >   						   struct drm_syncobj,
> >   						   refcount);
> > -	drm_syncobj_replace_fence(syncobj, 0, NULL);
> > +	drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
> >   	kfree(syncobj);
> >   }
> >   EXPORT_SYMBOL(drm_syncobj_free);
> > @@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
> >   	kref_init(&syncobj->refcount);
> >   	INIT_LIST_HEAD(&syncobj->cb_list);
> >   	spin_lock_init(&syncobj->lock);
> > +	if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
> > +		syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
> > +	else
> > +		syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
> > +	drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
> >   	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
> >   		ret = drm_syncobj_assign_null_handle(syncobj);
> > @@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device *dev, void *data,
> >   		return -ENODEV;
> >   	/* no valid flags yet */
> > -	if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
> > +	if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
> > +			    DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
> >   		return -EINVAL;
> >   	return drm_syncobj_create_as_handle(file_private,
> > @@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
> >   	struct syncobj_wait_entry *wait =
> >   		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
> > -	/* This happens inside the syncobj lock */
> > -	wait->fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
> > -							      lockdep_is_held(&syncobj->lock)));
> > +	drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
> > +
> >   	wake_up_process(wait->task);
> >   }
> > @@ -698,7 +910,8 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
> >   	signaled_count = 0;
> >   	for (i = 0; i < count; ++i) {
> >   		entries[i].task = current;
> > -		entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
> > +		ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
> > +					       &entries[i].fence);
> >   		if (!entries[i].fence) {
> >   			if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
> >   				continue;
> > @@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
> >   	if (ret < 0)
> >   		return ret;
> > -	for (i = 0; i < args->count_handles; i++)
> > -		drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
> > -
> > +	for (i = 0; i < args->count_handles; i++) {
> > +		if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> > +			DRM_ERROR("timeline syncobj cannot reset!\n");
> > +			ret = -EINVAL;
> > +			goto out;
> > +		}
> > +		drm_syncobj_timeline_fini(syncobjs[i],
> > +					  &syncobjs[i]->syncobj_timeline);
> > +		drm_syncobj_timeline_init(syncobjs[i],
> > +					  &syncobjs[i]->syncobj_timeline);
> > +	}
> > +out:
> >   	drm_syncobj_array_free(syncobjs, args->count_handles);
> > -	return 0;
> > +	return ret;
> >   }
> >   int
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 0a8d2d64f380..579e91a5858b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
> >   		if (!(flags & I915_EXEC_FENCE_WAIT))
> >   			continue;
> > -		fence = drm_syncobj_fence_get(syncobj);
> > +		drm_syncobj_search_fence(syncobj, 0,
> > +					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
> > +					 &fence);
> >   		if (!fence)
> >   			return -EINVAL;
> > diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
> > index 425432b85a87..9535521e6623 100644
> > --- a/include/drm/drm_syncobj.h
> > +++ b/include/drm/drm_syncobj.h
> > @@ -30,6 +30,25 @@
> >   struct drm_syncobj_cb;
> > +enum drm_syncobj_type {
> > +	DRM_SYNCOBJ_TYPE_NORMAL,
> > +	DRM_SYNCOBJ_TYPE_TIMELINE
> > +};
> > +
> > +struct drm_syncobj_timeline {
> > +	wait_queue_head_t	wq;
> > +	u64 timeline_context;
> > +	/**
> > +	 * @timeline: syncobj timeline
> > +	 */
> > +	u64 timeline;
> > +	u64 signal_point;
> > +
> > +
> > +	struct rb_root wait_pt_tree;
> > +	struct list_head signal_pt_list;
> > +};
> > +
> >   /**
> >    * struct drm_syncobj - sync object.
> >    *
> > @@ -41,19 +60,19 @@ struct drm_syncobj {
> >   	 */
> >   	struct kref refcount;
> >   	/**
> > -	 * @fence:
> > -	 * NULL or a pointer to the fence bound to this object.
> > -	 *
> > -	 * This field should not be used directly. Use drm_syncobj_fence_get()
> > -	 * and drm_syncobj_replace_fence() instead.
> > +	 * @type: indicate syncobj type
> >   	 */
> > -	struct dma_fence __rcu *fence;
> > +	enum drm_syncobj_type type;
> >   	/**
> > -	 * @cb_list: List of callbacks to call when the &fence gets replaced.
> > +	 * @syncobj_timeline: timeline
> >   	 */
> > +	struct drm_syncobj_timeline syncobj_timeline;
> > +	/**
> > +         * @cb_list: List of callbacks to call when the &fence gets replaced.
> > +         */
> >   	struct list_head cb_list;
> >   	/**
> > -	 * @lock: Protects &cb_list and write-locks &fence.
> > +	 * @lock: Protects syncobj list and write-locks &fence.
> >   	 */
> >   	spinlock_t lock;
> >   	/**
> > @@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
> >   /**
> >    * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
> >    * @node: used by drm_syncob_add_callback to append this struct to
> > - *	  &drm_syncobj.cb_list
> > + *       &drm_syncobj.cb_list
> >    * @func: drm_syncobj_func_t to call
> >    *
> >    * This struct will be initialized by drm_syncobj_add_callback, additional
> > @@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
> >   	kref_put(&obj->refcount, drm_syncobj_free);
> >   }
> > -/**
> > - * drm_syncobj_fence_get - get a reference to a fence in a sync object
> > - * @syncobj: sync object.
> > - *
> > - * This acquires additional reference to &drm_syncobj.fence contained in @obj,
> > - * if not NULL. It is illegal to call this without already holding a reference.
> > - * No locks required.
> > - *
> > - * Returns:
> > - * Either the fence of @obj or NULL if there's none.
> > - */
> > -static inline struct dma_fence *
> > -drm_syncobj_fence_get(struct drm_syncobj *syncobj)
> > -{
> > -	struct dma_fence *fence;
> > -
> > -	rcu_read_lock();
> > -	fence = dma_fence_get_rcu_safe(&syncobj->fence);
> > -	rcu_read_unlock();
> > -
> > -	return fence;
> > -}
> > -
> >   struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
> >   				     u32 handle);
> >   void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64 point,
> > @@ -142,5 +138,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
> >   int drm_syncobj_get_handle(struct drm_file *file_private,
> >   			   struct drm_syncobj *syncobj, u32 *handle);
> >   int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
> > +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
> > +			     u64 flags, struct dma_fence **fence);
> >   #endif
> > diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
> > index 300f336633f2..cebdb2541eb7 100644
> > --- a/include/uapi/drm/drm.h
> > +++ b/include/uapi/drm/drm.h
> > @@ -717,6 +717,7 @@ struct drm_prime_handle {
> >   struct drm_syncobj_create {
> >   	__u32 handle;
> >   #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
> > +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
> >   	__u32 flags;
> >   };
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
       [not found]       ` <20180914161059.GR11082-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
@ 2018-09-14 16:43         ` Christian König
  2018-09-14 18:24           ` Daniel Vetter
  2018-09-17  2:23         ` Zhou, David(ChunMing)
  1 sibling, 1 reply; 12+ messages in thread
From: Christian König @ 2018-09-14 16:43 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Chunming Zhou, Daniel Rakos,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Dave Airlie

Am 14.09.2018 um 18:10 schrieb Daniel Vetter:
> On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
>> Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
>>> This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side:
>>> This extension introduces a new type of syncobj that has an integer payload
>>> identifying a point in a timeline. Such timeline syncobjs support the
>>> following operations:
>>>      * CPU query - A host operation that allows querying the payload of the
>>>        timeline syncobj.
>>>      * CPU wait - A host operation that allows a blocking wait for a
>>>        timeline syncobj to reach a specified value.
>>>      * Device wait - A device operation that allows waiting for a
>>>        timeline syncobj to reach a specified value.
>>>      * Device signal - A device operation that allows advancing the
>>>        timeline syncobj to a specified value.
>>>
>>> Since it's a timeline, that means the front time point(PT) always is signaled before the late PT.
>>> a. signal PT design:
>>> Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled,
>>> the timeline will increase to value of PT[N].
>>> b. wait PT design:
>>> Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare
>>> wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be
>>> signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline,
>>> so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to
>>> perform that.
>>>
>>> v2:
>>> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
>>> 2. move unexposed denitions to .c file. (Daniel Vetter)
>>> 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian)
>>> 4. split up the change to drm_syncobj_replace_fence() in a separate patch.
>>> 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian)
>>> 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
>>>
>>> v3:
>>> 1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
>>>       a. normal syncobj signal op will create a signal PT to tail of signal pt list.
>>>       b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT.
>>> 2. many bug fix and clean up
>>> 3. stub fence moving is moved to other patch.
>>>
>>> v4:
>>> 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
>>> 2. fix syncobj lifecycle. (Christian)
>>> 3. only enable_signaling when there is wait_pt. (Christian)
>>> 4. fix timeline path issues.
>>> 5. write a timeline test in libdrm
>>>
>>> v5: (Christian)
>>> 1. semaphore is called syncobj in kernel side.
>>> 2. don't need 'timeline' characters in some function name.
>>> 3. keep syncobj cb
>>>
>>> normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
>>> timeline syncobj is tested by ./amdgpu_test -s 9
>>>
>>> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
>>> Cc: Christian Konig <christian.koenig@amd.com>
>>> Cc: Dave Airlie <airlied@redhat.com>
>>> Cc: Daniel Rakos <Daniel.Rakos@amd.com>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> At least on first glance that looks like it should work, going to do a
>> detailed review on Monday.
> Just for my understanding, it's all condensed down to 1 patch now? I kinda
> didn't follow the detailed discussion last few days at all :-/

I've already committed all the cleanup/fix prerequisites to drm-misc-next.

The driver specific implementation needs to come on top and maybe a new 
CPU wait IOCTL.

But essentially this patch is just the core of the kernel implementation.

> Also, is there a testcase, igt highly preferred (because then we'll run it
> in our intel-gfx CI, and a bunch of people outside of intel have already
> discovered that and are using it).

libdrm patches and I think amdgpu based test cases where already 
published as well.

Not sure about igt testcases.

Christian.

>
> Thanks, Daniel
>
>> Christian.
>>
>>> ---
>>>    drivers/gpu/drm/drm_syncobj.c              | 294 ++++++++++++++++++---
>>>    drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
>>>    include/drm/drm_syncobj.h                  |  62 +++--
>>>    include/uapi/drm/drm.h                     |   1 +
>>>    4 files changed, 292 insertions(+), 69 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
>>> index e9ce623d049e..e78d076f2703 100644
>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>> @@ -56,6 +56,9 @@either
>>>    #include "drm_internal.h"
>>>    #include <drm/drm_syncobj.h>
>>> +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
>>> +#define DRM_SYNCOBJ_NORMAL_POINT 1
>>> +
>>>    struct drm_syncobj_stub_fence {
>>>    	struct dma_fence base;
>>>    	spinlock_t lock;
>>> @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
>>>    	.release = drm_syncobj_stub_fence_release,
>>>    };
>>> +struct drm_syncobj_signal_pt {
>>> +	struct dma_fence_array *base;
>>> +	u64    value;
>>> +	struct list_head list;
>>> +};
>>>    /**
>>>     * drm_syncobj_find - lookup and reference a sync object.
>>> @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>>>    {
>>>    	int ret;
>>> -	*fence = drm_syncobj_fence_get(syncobj);
>>> +	ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
>>>    	if (*fence)
>>>    		return 1;
>>> @@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>>>    	 * have the lock, try one more time just to be sure we don't add a
>>>    	 * callback when a fence has already been set.
>>>    	 */
>>> -	if (syncobj->fence) {
>>> -		*fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
>>> -								 lockdep_is_held(&syncobj->lock)));
>>> -		ret = 1;
>>> +	if (fence) {
>>> +		drm_syncobj_search_fence(syncobj, 0, 0, fence);
>>> +		if (*fence)
>>> +			ret = 1;
>>>    	} else {
>>>    		*fence = NULL;
>>>    		drm_syncobj_add_callback_locked(syncobj, cb, func);
>>> @@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
>>>    	spin_unlock(&syncobj->lock);
>>>    }
>>> +static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
>>> +				      struct drm_syncobj_timeline *syncobj_timeline)
>>> +{
>>> +	spin_lock(&syncobj->lock);
>>> +	syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
>>> +	syncobj_timeline->timeline = 0;
>>> +	syncobj_timeline->signal_point = 0;
>>> +	init_waitqueue_head(&syncobj_timeline->wq);
>>> +
>>> +	INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
>>> +	spin_unlock(&syncobj->lock);
>>> +}
>>> +
>>> +static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
>>> +				      struct drm_syncobj_timeline *syncobj_timeline)
>>> +{
>>> +	struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
>>> +
>>> +	spin_lock(&syncobj->lock);
>>> +	list_for_each_entry_safe(signal_pt, tmp,
>>> +				 &syncobj_timeline->signal_pt_list, list) {
>>> +		list_del(&signal_pt->list);
>>> +		dma_fence_put(&signal_pt->base->base);
>>> +		kfree(signal_pt);
>>> +	}
>>> +	spin_unlock(&syncobj->lock);
>>> +}
>>> +
>>> +static struct dma_fence
>>> +*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
>>> +				      uint64_t point)
>>> +{
>>> +	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
>>> +	struct drm_syncobj_signal_pt *signal_pt;
>>> +
>>> +	if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
>>> +	    (point <= timeline->timeline)) {
>>> +		struct drm_syncobj_stub_fence *fence =
>>> +			kzalloc(sizeof(struct drm_syncobj_stub_fence),
>>> +				GFP_KERNEL);
>>> +
>>> +		if (!fence)
>>> +			return NULL;
>>> +		spin_lock_init(&fence->lock);
>>> +		dma_fence_init(&fence->base,
>>> +			       &drm_syncobj_stub_fence_ops,
>>> +			       &fence->lock,
>>> +			       syncobj->syncobj_timeline.timeline_context,
>>> +			       point);
>>> +
>>> +		dma_fence_signal(&fence->base);
>>> +		return &fence->base;
>>> +	}
>>> +
>>> +	list_for_each_entry(signal_pt, &timeline->signal_pt_list, list) {
>>> +		if (point > signal_pt->value)
>>> +			continue;
>>> +		if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
>>> +		    (point != signal_pt->value))
>>> +			continue;
>>> +		return dma_fence_get(&signal_pt->base->base);
>>> +	}
>>> +	return NULL;
>>> +}
>>> +
>>> +static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
>>> +					struct dma_fence *fence,
>>> +					u64 point)
>>> +{
>>> +	struct drm_syncobj_signal_pt *signal_pt =
>>> +		kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
>>> +	struct drm_syncobj_signal_pt *tail_pt;
>>> +	struct dma_fence **fences;
>>> +	int num_fences = 0;
>>> +	int ret = 0, i;
>>> +
>>> +	if (!signal_pt)
>>> +		return -ENOMEM;
>>> +	if (syncobj->syncobj_timeline.signal_point >= point) {
>>> +		DRM_WARN("A later signal is ready!");
>>> +		goto out;
>>> +	}
>>> +	if (!fence)
>>> +		goto out;
>>> +
>>> +	fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
>>> +	if (!fences)
>>> +		goto out;
>>> +	fences[num_fences++] = dma_fence_get(fence);
>>> +	/* timeline syncobj must take this dependency */
>>> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>> +		spin_lock(&syncobj->lock);
>>> +		if (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
>>> +			tail_pt = list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
>>> +						  struct drm_syncobj_signal_pt, list);
>>> +			fences[num_fences++] = dma_fence_get(&tail_pt->base->base);
>>> +		}
>>> +		spin_unlock(&syncobj->lock);
>>> +	}
>>> +	signal_pt->base = dma_fence_array_create(num_fences, fences,
>>> +						 syncobj->syncobj_timeline.timeline_context,
>>> +						 point, false);
>>> +	if (!signal_pt->base)
>>> +		goto fail;
>>> +
>>> +	spin_lock(&syncobj->lock);
>>> +	signal_pt->value = point;
>>> +	INIT_LIST_HEAD(&signal_pt->list);
>>> +	list_add_tail(&signal_pt->list, &syncobj->syncobj_timeline.signal_pt_list);
>>> +	syncobj->syncobj_timeline.signal_point = point;
>>> +	spin_unlock(&syncobj->lock);
>>> +	wake_up_all(&syncobj->syncobj_timeline.wq);
>>> +
>>> +	return 0;
>>> +fail:
>>> +	for (i = 0; i < num_fences; i++)
>>> +		dma_fence_put(fences[i]);
>>> +	kfree(fences);
>>> +out:
>>> +	kfree(signal_pt);
>>> +	return ret;
>>> +}
>>> +
>>> +static void drm_syncobj_garbage_collection(struct drm_syncobj *syncobj)
>>> +{
>>> +	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
>>> +	struct drm_syncobj_signal_pt *signal_pt, *tmp;
>>> +
>>> +	spin_lock(&syncobj->lock);
>>> +	list_for_each_entry_safe(signal_pt, tmp,
>>> +				 &timeline->signal_pt_list, list) {
>>> +		if (dma_fence_is_signaled(&signal_pt->base->base)) {
>>> +			timeline->timeline = signal_pt->value;
>>> +			list_del(&signal_pt->list);
>>> +			dma_fence_put(&signal_pt->base->base);
>>> +			kfree(signal_pt);
>>> +		} else {
>>> +			/*signal_pt is in order in list, from small to big, so
>>> +			 * the later must not be signal either */
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	spin_unlock(&syncobj->lock);
>>> +}
>>>    /**
>>>     * drm_syncobj_replace_fence - replace fence in a sync object.
>>>     * @syncobj: Sync object to replace fence in
>>> @@ -176,28 +329,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
>>>    			       u64 point,
>>>    			       struct dma_fence *fence)
>>>    {
>>> -	struct dma_fence *old_fence;
>>> -	struct drm_syncobj_cb *cur, *tmp;
>>> -
>>> -	if (fence)
>>> -		dma_fence_get(fence);
>>> -
>>> -	spin_lock(&syncobj->lock);
>>> -
>>> -	old_fence = rcu_dereference_protected(syncobj->fence,
>>> -					      lockdep_is_held(&syncobj->lock));
>>> -	rcu_assign_pointer(syncobj->fence, fence);
>>> +	drm_syncobj_garbage_collection(syncobj);
>>> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>> +		if (fence)
>>> +			drm_syncobj_create_signal_pt(syncobj, fence, point);
>>> +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>>> +		u64 pt_value;
>>> +
>>> +		if (!fence) {
>>> +			drm_syncobj_timeline_fini(syncobj,
>>> +						  &syncobj->syncobj_timeline);
>>> +			drm_syncobj_timeline_init(syncobj,
>>> +						  &syncobj->syncobj_timeline);
>>> +			return;
>>> +		}
>>> +		pt_value = syncobj->syncobj_timeline.signal_point +
>>> +			DRM_SYNCOBJ_NORMAL_POINT;
>>> +		drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
>>> +	} else {
>>> +		DRM_ERROR("the syncobj type isn't support\n");
>>> +		return;
>>> +	}
>>> +	if (fence) {
>>> +		struct drm_syncobj_cb *cur, *tmp;
>>> -	if (fence != old_fence) {
>>> +		spin_lock(&syncobj->lock);
>>>    		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
>>>    			list_del_init(&cur->node);
>>>    			cur->func(syncobj, cur);
>>>    		}
>>> +		spin_unlock(&syncobj->lock);
>>>    	}
>>> -
>>> -	spin_unlock(&syncobj->lock);
>>> -
>>> -	dma_fence_put(old_fence);
>>>    }
>>>    EXPORT_SYMBOL(drm_syncobj_replace_fence);
>>> @@ -220,6 +382,25 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>>>    	return 0;
>>>    }
>>> +static struct dma_fence *
>>> +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 flags)
>>> +{
>>> +	struct dma_fence *fence;
>>> +	int ret = 0;
>>> +
>>> +	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>> +		ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
>>> +					point <= syncobj->syncobj_timeline.signal_point,
>>> +					msecs_to_jiffies(10000)); /* wait 10s */
>>> +		if (ret <= 0)
>>> +			return NULL;
>>> +	}
>>> +	spin_lock(&syncobj->lock);
>>> +	fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
>>> +	spin_unlock(&syncobj->lock);
>>> +	return fence;
>>> +}
>>> +
>>>    /**
>>>     * drm_syncobj_find_fence - lookup and reference the fence in a sync object
>>>     * @file_private: drm file private pointer
>>> @@ -234,20 +415,46 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>>>     * contains a reference to the fence, which must be released by calling
>>>     * dma_fence_put().
>>>     */
>>> -int drm_syncobj_find_fence(struct drm_file *file_private,
>>> -			   u32 handle, u64 point,
>>> -			   struct dma_fence **fence)
>>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
>>> +			     u64 flags, struct dma_fence **fence)
>>>    {
>>> -	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
>>>    	int ret = 0;
>>>    	if (!syncobj)
>>>    		return -ENOENT;
>>> -	*fence = drm_syncobj_fence_get(syncobj);
>>> +	drm_syncobj_garbage_collection(syncobj);
>>> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>>> +		/*NORMAL syncobj always wait on last pt */
>>> +		u64 tail_pt_value = syncobj->syncobj_timeline.signal_point;
>>> +
>>> +		if (tail_pt_value == 0)
>>> +			tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
>>> +		/* NORMAL syncobj doesn't care point value */
>>> +		WARN_ON(point != 0);
>>> +		*fence = drm_syncobj_point_get(syncobj, tail_pt_value,
>>> +							flags);
>>> +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>> +		*fence = drm_syncobj_point_get(syncobj, point,
>>> +							flags);
>>> +	} else {
>>> +		DRM_ERROR("Don't support this type syncobj\n");
>>> +		*fence = NULL;
>>> +	}
>>>    	if (!*fence) {
>>>    		ret = -EINVAL;
>>>    	}
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL(drm_syncobj_search_fence);
>>> +int drm_syncobj_find_fence(struct drm_file *file_private,
>>> +			   u32 handle, u64 point,
>>> +			   struct dma_fence **fence) {
>>> +	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
>>> +
>>> +	int ret = drm_syncobj_search_fence(syncobj, point,
>>> +					DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>> +					fence);
>>>    	drm_syncobj_put(syncobj);
>>>    	return ret;
>>>    }
>>> @@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
>>>    	struct drm_syncobj *syncobj = container_of(kref,
>>>    						   struct drm_syncobj,
>>>    						   refcount);
>>> -	drm_syncobj_replace_fence(syncobj, 0, NULL);
>>> +	drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
>>>    	kfree(syncobj);
>>>    }
>>>    EXPORT_SYMBOL(drm_syncobj_free);
>>> @@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
>>>    	kref_init(&syncobj->refcount);
>>>    	INIT_LIST_HEAD(&syncobj->cb_list);
>>>    	spin_lock_init(&syncobj->lock);
>>> +	if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
>>> +		syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
>>> +	else
>>> +		syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
>>> +	drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
>>>    	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
>>>    		ret = drm_syncobj_assign_null_handle(syncobj);
>>> @@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device *dev, void *data,
>>>    		return -ENODEV;
>>>    	/* no valid flags yet */
>>> -	if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
>>> +	if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
>>> +			    DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
>>>    		return -EINVAL;
>>>    	return drm_syncobj_create_as_handle(file_private,
>>> @@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
>>>    	struct syncobj_wait_entry *wait =
>>>    		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>>> -	/* This happens inside the syncobj lock */
>>> -	wait->fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
>>> -							      lockdep_is_held(&syncobj->lock)));
>>> +	drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
>>> +
>>>    	wake_up_process(wait->task);
>>>    }
>>> @@ -698,7 +910,8 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>>>    	signaled_count = 0;
>>>    	for (i = 0; i < count; ++i) {
>>>    		entries[i].task = current;
>>> -		entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
>>> +		ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
>>> +					       &entries[i].fence);
>>>    		if (!entries[i].fence) {
>>>    			if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>>    				continue;
>>> @@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
>>>    	if (ret < 0)
>>>    		return ret;
>>> -	for (i = 0; i < args->count_handles; i++)
>>> -		drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
>>> -
>>> +	for (i = 0; i < args->count_handles; i++) {
>>> +		if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>> +			DRM_ERROR("timeline syncobj cannot reset!\n");
>>> +			ret = -EINVAL;
>>> +			goto out;
>>> +		}
>>> +		drm_syncobj_timeline_fini(syncobjs[i],
>>> +					  &syncobjs[i]->syncobj_timeline);
>>> +		drm_syncobj_timeline_init(syncobjs[i],
>>> +					  &syncobjs[i]->syncobj_timeline);
>>> +	}
>>> +out:
>>>    	drm_syncobj_array_free(syncobjs, args->count_handles);
>>> -	return 0;
>>> +	return ret;
>>>    }
>>>    int
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> index 0a8d2d64f380..579e91a5858b 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
>>>    		if (!(flags & I915_EXEC_FENCE_WAIT))
>>>    			continue;
>>> -		fence = drm_syncobj_fence_get(syncobj);
>>> +		drm_syncobj_search_fence(syncobj, 0,
>>> +					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>> +					 &fence);
>>>    		if (!fence)
>>>    			return -EINVAL;
>>> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
>>> index 425432b85a87..9535521e6623 100644
>>> --- a/include/drm/drm_syncobj.h
>>> +++ b/include/drm/drm_syncobj.h
>>> @@ -30,6 +30,25 @@
>>>    struct drm_syncobj_cb;
>>> +enum drm_syncobj_type {
>>> +	DRM_SYNCOBJ_TYPE_NORMAL,
>>> +	DRM_SYNCOBJ_TYPE_TIMELINE
>>> +};
>>> +
>>> +struct drm_syncobj_timeline {
>>> +	wait_queue_head_t	wq;
>>> +	u64 timeline_context;
>>> +	/**
>>> +	 * @timeline: syncobj timeline
>>> +	 */
>>> +	u64 timeline;
>>> +	u64 signal_point;
>>> +
>>> +
>>> +	struct rb_root wait_pt_tree;
>>> +	struct list_head signal_pt_list;
>>> +};
>>> +
>>>    /**
>>>     * struct drm_syncobj - sync object.
>>>     *
>>> @@ -41,19 +60,19 @@ struct drm_syncobj {
>>>    	 */
>>>    	struct kref refcount;
>>>    	/**
>>> -	 * @fence:
>>> -	 * NULL or a pointer to the fence bound to this object.
>>> -	 *
>>> -	 * This field should not be used directly. Use drm_syncobj_fence_get()
>>> -	 * and drm_syncobj_replace_fence() instead.
>>> +	 * @type: indicate syncobj type
>>>    	 */
>>> -	struct dma_fence __rcu *fence;
>>> +	enum drm_syncobj_type type;
>>>    	/**
>>> -	 * @cb_list: List of callbacks to call when the &fence gets replaced.
>>> +	 * @syncobj_timeline: timeline
>>>    	 */
>>> +	struct drm_syncobj_timeline syncobj_timeline;
>>> +	/**
>>> +         * @cb_list: List of callbacks to call when the &fence gets replaced.
>>> +         */
>>>    	struct list_head cb_list;
>>>    	/**
>>> -	 * @lock: Protects &cb_list and write-locks &fence.
>>> +	 * @lock: Protects syncobj list and write-locks &fence.
>>>    	 */
>>>    	spinlock_t lock;
>>>    	/**
>>> @@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
>>>    /**
>>>     * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
>>>     * @node: used by drm_syncob_add_callback to append this struct to
>>> - *	  &drm_syncobj.cb_list
>>> + *       &drm_syncobj.cb_list
>>>     * @func: drm_syncobj_func_t to call
>>>     *
>>>     * This struct will be initialized by drm_syncobj_add_callback, additional
>>> @@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
>>>    	kref_put(&obj->refcount, drm_syncobj_free);
>>>    }
>>> -/**
>>> - * drm_syncobj_fence_get - get a reference to a fence in a sync object
>>> - * @syncobj: sync object.
>>> - *
>>> - * This acquires additional reference to &drm_syncobj.fence contained in @obj,
>>> - * if not NULL. It is illegal to call this without already holding a reference.
>>> - * No locks required.
>>> - *
>>> - * Returns:
>>> - * Either the fence of @obj or NULL if there's none.
>>> - */
>>> -static inline struct dma_fence *
>>> -drm_syncobj_fence_get(struct drm_syncobj *syncobj)
>>> -{
>>> -	struct dma_fence *fence;
>>> -
>>> -	rcu_read_lock();
>>> -	fence = dma_fence_get_rcu_safe(&syncobj->fence);
>>> -	rcu_read_unlock();
>>> -
>>> -	return fence;
>>> -}
>>> -
>>>    struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
>>>    				     u32 handle);
>>>    void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64 point,
>>> @@ -142,5 +138,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
>>>    int drm_syncobj_get_handle(struct drm_file *file_private,
>>>    			   struct drm_syncobj *syncobj, u32 *handle);
>>>    int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
>>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
>>> +			     u64 flags, struct dma_fence **fence);
>>>    #endif
>>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
>>> index 300f336633f2..cebdb2541eb7 100644
>>> --- a/include/uapi/drm/drm.h
>>> +++ b/include/uapi/drm/drm.h
>>> @@ -717,6 +717,7 @@ struct drm_prime_handle {
>>>    struct drm_syncobj_create {
>>>    	__u32 handle;
>>>    #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
>>> +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
>>>    	__u32 flags;
>>>    };

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
  2018-09-14 16:43         ` Christian König
@ 2018-09-14 18:24           ` Daniel Vetter
       [not found]             ` <CAKMK7uH9jLO0bJe8ek-JyK3u1jbKY6Jk6xgT9ifgcko7Hi2URA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Vetter @ 2018-09-14 18:24 UTC (permalink / raw)
  To: Christian König; +Cc: Daniel Rakos, amd-gfx list, dri-devel, Dave Airlie

On Fri, Sep 14, 2018 at 6:43 PM, Christian König
<christian.koenig@amd.com> wrote:
> Am 14.09.2018 um 18:10 schrieb Daniel Vetter:
>>
>> On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
>>>
>>> Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
>>>>
>>>> This patch is for VK_KHR_timeline_semaphore extension, semaphore is
>>>> called syncobj in kernel side:
>>>> This extension introduces a new type of syncobj that has an integer
>>>> payload
>>>> identifying a point in a timeline. Such timeline syncobjs support the
>>>> following operations:
>>>>      * CPU query - A host operation that allows querying the payload of
>>>> the
>>>>        timeline syncobj.
>>>>      * CPU wait - A host operation that allows a blocking wait for a
>>>>        timeline syncobj to reach a specified value.
>>>>      * Device wait - A device operation that allows waiting for a
>>>>        timeline syncobj to reach a specified value.
>>>>      * Device signal - A device operation that allows advancing the
>>>>        timeline syncobj to a specified value.
>>>>
>>>> Since it's a timeline, that means the front time point(PT) always is
>>>> signaled before the late PT.
>>>> a. signal PT design:
>>>> Signal PT fence N depends on PT[N-1] fence and signal opertion fence,
>>>> when PT[N] fence is signaled,
>>>> the timeline will increase to value of PT[N].
>>>> b. wait PT design:
>>>> Wait PT fence is signaled by reaching timeline point value, when
>>>> timeline is increasing, will compare
>>>> wait PTs value with new timeline value, if PT value is lower than
>>>> timeline value, then wait PT will be
>>>> signaled, otherwise keep in list. syncobj wait operation can wait on any
>>>> point of timeline,
>>>> so need a RB tree to order them. And wait PT could ahead of signal PT,
>>>> we need a sumission fence to
>>>> perform that.
>>>>
>>>> v2:
>>>> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
>>>> 2. move unexposed denitions to .c file. (Daniel Vetter)
>>>> 3. split up the change to drm_syncobj_find_fence() in a separate patch.
>>>> (Christian)
>>>> 4. split up the change to drm_syncobj_replace_fence() in a separate
>>>> patch.
>>>> 5. drop the submission_fence implementation and instead use wait_event()
>>>> for that. (Christian)
>>>> 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
>>>>
>>>> v3:
>>>> 1. replace normal syncobj with timeline implemenation. (Vetter and
>>>> Christian)
>>>>       a. normal syncobj signal op will create a signal PT to tail of
>>>> signal pt list.
>>>>       b. normal syncobj wait op will create a wait pt with last signal
>>>> point, and this wait PT is only signaled by related signal point PT.
>>>> 2. many bug fix and clean up
>>>> 3. stub fence moving is moved to other patch.
>>>>
>>>> v4:
>>>> 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
>>>> 2. fix syncobj lifecycle. (Christian)
>>>> 3. only enable_signaling when there is wait_pt. (Christian)
>>>> 4. fix timeline path issues.
>>>> 5. write a timeline test in libdrm
>>>>
>>>> v5: (Christian)
>>>> 1. semaphore is called syncobj in kernel side.
>>>> 2. don't need 'timeline' characters in some function name.
>>>> 3. keep syncobj cb
>>>>
>>>> normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
>>>> timeline syncobj is tested by ./amdgpu_test -s 9
>>>>
>>>> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
>>>> Cc: Christian Konig <christian.koenig@amd.com>
>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>> Cc: Daniel Rakos <Daniel.Rakos@amd.com>
>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>
>>> At least on first glance that looks like it should work, going to do a
>>> detailed review on Monday.
>>
>> Just for my understanding, it's all condensed down to 1 patch now? I kinda
>> didn't follow the detailed discussion last few days at all :-/
>
>
> I've already committed all the cleanup/fix prerequisites to drm-misc-next.
>
> The driver specific implementation needs to come on top and maybe a new CPU
> wait IOCTL.
>
> But essentially this patch is just the core of the kernel implementation.

Ah cool, missed that.

>> Also, is there a testcase, igt highly preferred (because then we'll run it
>> in our intel-gfx CI, and a bunch of people outside of intel have already
>> discovered that and are using it).
>
>
> libdrm patches and I think amdgpu based test cases where already published
> as well.
>
> Not sure about igt testcases.

I guess we can write them when the intel implementation shows up. Just
kinda still hoping that we'd have a more unfified test suite. And not
really well-kept secret: We do have an amdgpu in our CI, in the form
of kbl-g :-) But unfortunately it's not running the full test set for
patches (only for drm-tip). But we could perhaps run more of the
amdgpu tests somehow, if there's serious interest.

Cheers, Daniel


> Christian.
>
>
>>
>> Thanks, Daniel
>>
>>> Christian.
>>>
>>>> ---
>>>>    drivers/gpu/drm/drm_syncobj.c              | 294
>>>> ++++++++++++++++++---
>>>>    drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
>>>>    include/drm/drm_syncobj.h                  |  62 +++--
>>>>    include/uapi/drm/drm.h                     |   1 +
>>>>    4 files changed, 292 insertions(+), 69 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>> index e9ce623d049e..e78d076f2703 100644
>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>> @@ -56,6 +56,9 @@either
>>>>    #include "drm_internal.h"
>>>>    #include <drm/drm_syncobj.h>
>>>> +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
>>>> +#define DRM_SYNCOBJ_NORMAL_POINT 1
>>>> +
>>>>    struct drm_syncobj_stub_fence {
>>>>         struct dma_fence base;
>>>>         spinlock_t lock;
>>>> @@ -82,6 +85,11 @@ static const struct dma_fence_ops
>>>> drm_syncobj_stub_fence_ops = {
>>>>         .release = drm_syncobj_stub_fence_release,
>>>>    };
>>>> +struct drm_syncobj_signal_pt {
>>>> +       struct dma_fence_array *base;
>>>> +       u64    value;
>>>> +       struct list_head list;
>>>> +};
>>>>    /**
>>>>     * drm_syncobj_find - lookup and reference a sync object.
>>>> @@ -124,7 +132,7 @@ static int
>>>> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>>>>    {
>>>>         int ret;
>>>> -       *fence = drm_syncobj_fence_get(syncobj);
>>>> +       ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
>>>>         if (*fence)
>>>>                 return 1;
>>>> @@ -133,10 +141,10 @@ static int
>>>> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>>>>          * have the lock, try one more time just to be sure we don't add
>>>> a
>>>>          * callback when a fence has already been set.
>>>>          */
>>>> -       if (syncobj->fence) {
>>>> -               *fence =
>>>> dma_fence_get(rcu_dereference_protected(syncobj->fence,
>>>> -
>>>> lockdep_is_held(&syncobj->lock)));
>>>> -               ret = 1;
>>>> +       if (fence) {
>>>> +               drm_syncobj_search_fence(syncobj, 0, 0, fence);
>>>> +               if (*fence)
>>>> +                       ret = 1;
>>>>         } else {
>>>>                 *fence = NULL;
>>>>                 drm_syncobj_add_callback_locked(syncobj, cb, func);
>>>> @@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct
>>>> drm_syncobj *syncobj,
>>>>         spin_unlock(&syncobj->lock);
>>>>    }
>>>> +static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
>>>> +                                     struct drm_syncobj_timeline
>>>> *syncobj_timeline)
>>>> +{
>>>> +       spin_lock(&syncobj->lock);
>>>> +       syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
>>>> +       syncobj_timeline->timeline = 0;
>>>> +       syncobj_timeline->signal_point = 0;
>>>> +       init_waitqueue_head(&syncobj_timeline->wq);
>>>> +
>>>> +       INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
>>>> +       spin_unlock(&syncobj->lock);
>>>> +}
>>>> +
>>>> +static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
>>>> +                                     struct drm_syncobj_timeline
>>>> *syncobj_timeline)
>>>> +{
>>>> +       struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
>>>> +
>>>> +       spin_lock(&syncobj->lock);
>>>> +       list_for_each_entry_safe(signal_pt, tmp,
>>>> +                                &syncobj_timeline->signal_pt_list,
>>>> list) {
>>>> +               list_del(&signal_pt->list);
>>>> +               dma_fence_put(&signal_pt->base->base);
>>>> +               kfree(signal_pt);
>>>> +       }
>>>> +       spin_unlock(&syncobj->lock);
>>>> +}
>>>> +
>>>> +static struct dma_fence
>>>> +*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
>>>> +                                     uint64_t point)
>>>> +{
>>>> +       struct drm_syncobj_timeline *timeline =
>>>> &syncobj->syncobj_timeline;
>>>> +       struct drm_syncobj_signal_pt *signal_pt;
>>>> +
>>>> +       if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
>>>> +           (point <= timeline->timeline)) {
>>>> +               struct drm_syncobj_stub_fence *fence =
>>>> +                       kzalloc(sizeof(struct drm_syncobj_stub_fence),
>>>> +                               GFP_KERNEL);
>>>> +
>>>> +               if (!fence)
>>>> +                       return NULL;
>>>> +               spin_lock_init(&fence->lock);
>>>> +               dma_fence_init(&fence->base,
>>>> +                              &drm_syncobj_stub_fence_ops,
>>>> +                              &fence->lock,
>>>> +
>>>> syncobj->syncobj_timeline.timeline_context,
>>>> +                              point);
>>>> +
>>>> +               dma_fence_signal(&fence->base);
>>>> +               return &fence->base;
>>>> +       }
>>>> +
>>>> +       list_for_each_entry(signal_pt, &timeline->signal_pt_list, list)
>>>> {
>>>> +               if (point > signal_pt->value)
>>>> +                       continue;
>>>> +               if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
>>>> +                   (point != signal_pt->value))
>>>> +                       continue;
>>>> +               return dma_fence_get(&signal_pt->base->base);
>>>> +       }
>>>> +       return NULL;
>>>> +}
>>>> +
>>>> +static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
>>>> +                                       struct dma_fence *fence,
>>>> +                                       u64 point)
>>>> +{
>>>> +       struct drm_syncobj_signal_pt *signal_pt =
>>>> +               kzalloc(sizeof(struct drm_syncobj_signal_pt),
>>>> GFP_KERNEL);
>>>> +       struct drm_syncobj_signal_pt *tail_pt;
>>>> +       struct dma_fence **fences;
>>>> +       int num_fences = 0;
>>>> +       int ret = 0, i;
>>>> +
>>>> +       if (!signal_pt)
>>>> +               return -ENOMEM;
>>>> +       if (syncobj->syncobj_timeline.signal_point >= point) {
>>>> +               DRM_WARN("A later signal is ready!");
>>>> +               goto out;
>>>> +       }
>>>> +       if (!fence)
>>>> +               goto out;
>>>> +
>>>> +       fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
>>>> +       if (!fences)
>>>> +               goto out;
>>>> +       fences[num_fences++] = dma_fence_get(fence);
>>>> +       /* timeline syncobj must take this dependency */
>>>> +       if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>> +               spin_lock(&syncobj->lock);
>>>> +               if
>>>> (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
>>>> +                       tail_pt =
>>>> list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
>>>> +                                                 struct
>>>> drm_syncobj_signal_pt, list);
>>>> +                       fences[num_fences++] =
>>>> dma_fence_get(&tail_pt->base->base);
>>>> +               }
>>>> +               spin_unlock(&syncobj->lock);
>>>> +       }
>>>> +       signal_pt->base = dma_fence_array_create(num_fences, fences,
>>>> +
>>>> syncobj->syncobj_timeline.timeline_context,
>>>> +                                                point, false);
>>>> +       if (!signal_pt->base)
>>>> +               goto fail;
>>>> +
>>>> +       spin_lock(&syncobj->lock);
>>>> +       signal_pt->value = point;
>>>> +       INIT_LIST_HEAD(&signal_pt->list);
>>>> +       list_add_tail(&signal_pt->list,
>>>> &syncobj->syncobj_timeline.signal_pt_list);
>>>> +       syncobj->syncobj_timeline.signal_point = point;
>>>> +       spin_unlock(&syncobj->lock);
>>>> +       wake_up_all(&syncobj->syncobj_timeline.wq);
>>>> +
>>>> +       return 0;
>>>> +fail:
>>>> +       for (i = 0; i < num_fences; i++)
>>>> +               dma_fence_put(fences[i]);
>>>> +       kfree(fences);
>>>> +out:
>>>> +       kfree(signal_pt);
>>>> +       return ret;
>>>> +}
>>>> +
>>>> +static void drm_syncobj_garbage_collection(struct drm_syncobj *syncobj)
>>>> +{
>>>> +       struct drm_syncobj_timeline *timeline =
>>>> &syncobj->syncobj_timeline;
>>>> +       struct drm_syncobj_signal_pt *signal_pt, *tmp;
>>>> +
>>>> +       spin_lock(&syncobj->lock);
>>>> +       list_for_each_entry_safe(signal_pt, tmp,
>>>> +                                &timeline->signal_pt_list, list) {
>>>> +               if (dma_fence_is_signaled(&signal_pt->base->base)) {
>>>> +                       timeline->timeline = signal_pt->value;
>>>> +                       list_del(&signal_pt->list);
>>>> +                       dma_fence_put(&signal_pt->base->base);
>>>> +                       kfree(signal_pt);
>>>> +               } else {
>>>> +                       /*signal_pt is in order in list, from small to
>>>> big, so
>>>> +                        * the later must not be signal either */
>>>> +                       break;
>>>> +               }
>>>> +       }
>>>> +
>>>> +       spin_unlock(&syncobj->lock);
>>>> +}
>>>>    /**
>>>>     * drm_syncobj_replace_fence - replace fence in a sync object.
>>>>     * @syncobj: Sync object to replace fence in
>>>> @@ -176,28 +329,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj
>>>> *syncobj,
>>>>                                u64 point,
>>>>                                struct dma_fence *fence)
>>>>    {
>>>> -       struct dma_fence *old_fence;
>>>> -       struct drm_syncobj_cb *cur, *tmp;
>>>> -
>>>> -       if (fence)
>>>> -               dma_fence_get(fence);
>>>> -
>>>> -       spin_lock(&syncobj->lock);
>>>> -
>>>> -       old_fence = rcu_dereference_protected(syncobj->fence,
>>>> -
>>>> lockdep_is_held(&syncobj->lock));
>>>> -       rcu_assign_pointer(syncobj->fence, fence);
>>>> +       drm_syncobj_garbage_collection(syncobj);
>>>> +       if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>> +               if (fence)
>>>> +                       drm_syncobj_create_signal_pt(syncobj, fence,
>>>> point);
>>>> +       } else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>>>> +               u64 pt_value;
>>>> +
>>>> +               if (!fence) {
>>>> +                       drm_syncobj_timeline_fini(syncobj,
>>>> +
>>>> &syncobj->syncobj_timeline);
>>>> +                       drm_syncobj_timeline_init(syncobj,
>>>> +
>>>> &syncobj->syncobj_timeline);
>>>> +                       return;
>>>> +               }
>>>> +               pt_value = syncobj->syncobj_timeline.signal_point +
>>>> +                       DRM_SYNCOBJ_NORMAL_POINT;
>>>> +               drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
>>>> +       } else {
>>>> +               DRM_ERROR("the syncobj type isn't support\n");
>>>> +               return;
>>>> +       }
>>>> +       if (fence) {
>>>> +               struct drm_syncobj_cb *cur, *tmp;
>>>> -       if (fence != old_fence) {
>>>> +               spin_lock(&syncobj->lock);
>>>>                 list_for_each_entry_safe(cur, tmp, &syncobj->cb_list,
>>>> node) {
>>>>                         list_del_init(&cur->node);
>>>>                         cur->func(syncobj, cur);
>>>>                 }
>>>> +               spin_unlock(&syncobj->lock);
>>>>         }
>>>> -
>>>> -       spin_unlock(&syncobj->lock);
>>>> -
>>>> -       dma_fence_put(old_fence);
>>>>    }
>>>>    EXPORT_SYMBOL(drm_syncobj_replace_fence);
>>>> @@ -220,6 +382,25 @@ static int drm_syncobj_assign_null_handle(struct
>>>> drm_syncobj *syncobj)
>>>>         return 0;
>>>>    }
>>>> +static struct dma_fence *
>>>> +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64
>>>> flags)
>>>> +{
>>>> +       struct dma_fence *fence;
>>>> +       int ret = 0;
>>>> +
>>>> +       if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>>> +               ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
>>>> +                                       point <=
>>>> syncobj->syncobj_timeline.signal_point,
>>>> +                                       msecs_to_jiffies(10000)); /*
>>>> wait 10s */
>>>> +               if (ret <= 0)
>>>> +                       return NULL;
>>>> +       }
>>>> +       spin_lock(&syncobj->lock);
>>>> +       fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
>>>> +       spin_unlock(&syncobj->lock);
>>>> +       return fence;
>>>> +}
>>>> +
>>>>    /**
>>>>     * drm_syncobj_find_fence - lookup and reference the fence in a sync
>>>> object
>>>>     * @file_private: drm file private pointer
>>>> @@ -234,20 +415,46 @@ static int drm_syncobj_assign_null_handle(struct
>>>> drm_syncobj *syncobj)
>>>>     * contains a reference to the fence, which must be released by
>>>> calling
>>>>     * dma_fence_put().
>>>>     */
>>>> -int drm_syncobj_find_fence(struct drm_file *file_private,
>>>> -                          u32 handle, u64 point,
>>>> -                          struct dma_fence **fence)
>>>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
>>>> +                            u64 flags, struct dma_fence **fence)
>>>>    {
>>>> -       struct drm_syncobj *syncobj = drm_syncobj_find(file_private,
>>>> handle);
>>>>         int ret = 0;
>>>>         if (!syncobj)
>>>>                 return -ENOENT;
>>>> -       *fence = drm_syncobj_fence_get(syncobj);
>>>> +       drm_syncobj_garbage_collection(syncobj);
>>>> +       if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>>>> +               /*NORMAL syncobj always wait on last pt */
>>>> +               u64 tail_pt_value =
>>>> syncobj->syncobj_timeline.signal_point;
>>>> +
>>>> +               if (tail_pt_value == 0)
>>>> +                       tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
>>>> +               /* NORMAL syncobj doesn't care point value */
>>>> +               WARN_ON(point != 0);
>>>> +               *fence = drm_syncobj_point_get(syncobj, tail_pt_value,
>>>> +                                                       flags);
>>>> +       } else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>> +               *fence = drm_syncobj_point_get(syncobj, point,
>>>> +                                                       flags);
>>>> +       } else {
>>>> +               DRM_ERROR("Don't support this type syncobj\n");
>>>> +               *fence = NULL;
>>>> +       }
>>>>         if (!*fence) {
>>>>                 ret = -EINVAL;
>>>>         }
>>>> +       return ret;
>>>> +}
>>>> +EXPORT_SYMBOL(drm_syncobj_search_fence);
>>>> +int drm_syncobj_find_fence(struct drm_file *file_private,
>>>> +                          u32 handle, u64 point,
>>>> +                          struct dma_fence **fence) {
>>>> +       struct drm_syncobj *syncobj = drm_syncobj_find(file_private,
>>>> handle);
>>>> +
>>>> +       int ret = drm_syncobj_search_fence(syncobj, point,
>>>> +
>>>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>>> +                                       fence);
>>>>         drm_syncobj_put(syncobj);
>>>>         return ret;
>>>>    }
>>>> @@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
>>>>         struct drm_syncobj *syncobj = container_of(kref,
>>>>                                                    struct drm_syncobj,
>>>>                                                    refcount);
>>>> -       drm_syncobj_replace_fence(syncobj, 0, NULL);
>>>> +       drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
>>>>         kfree(syncobj);
>>>>    }
>>>>    EXPORT_SYMBOL(drm_syncobj_free);
>>>> @@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj
>>>> **out_syncobj, uint32_t flags,
>>>>         kref_init(&syncobj->refcount);
>>>>         INIT_LIST_HEAD(&syncobj->cb_list);
>>>>         spin_lock_init(&syncobj->lock);
>>>> +       if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
>>>> +               syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
>>>> +       else
>>>> +               syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
>>>> +       drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
>>>>         if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
>>>>                 ret = drm_syncobj_assign_null_handle(syncobj);
>>>> @@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device *dev,
>>>> void *data,
>>>>                 return -ENODEV;
>>>>         /* no valid flags yet */
>>>> -       if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
>>>> +       if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
>>>> +                           DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
>>>>                 return -EINVAL;
>>>>         return drm_syncobj_create_as_handle(file_private,
>>>> @@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct
>>>> drm_syncobj *syncobj,
>>>>         struct syncobj_wait_entry *wait =
>>>>                 container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>>>> -       /* This happens inside the syncobj lock */
>>>> -       wait->fence =
>>>> dma_fence_get(rcu_dereference_protected(syncobj->fence,
>>>> -
>>>> lockdep_is_held(&syncobj->lock)));
>>>> +       drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
>>>> +
>>>>         wake_up_process(wait->task);
>>>>    }
>>>> @@ -698,7 +910,8 @@ static signed long
>>>> drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>>>>         signaled_count = 0;
>>>>         for (i = 0; i < count; ++i) {
>>>>                 entries[i].task = current;
>>>> -               entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
>>>> +               ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
>>>> +                                              &entries[i].fence);
>>>>                 if (!entries[i].fence) {
>>>>                         if (flags &
>>>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>>>                                 continue;
>>>> @@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device *dev,
>>>> void *data,
>>>>         if (ret < 0)
>>>>                 return ret;
>>>> -       for (i = 0; i < args->count_handles; i++)
>>>> -               drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
>>>> -
>>>> +       for (i = 0; i < args->count_handles; i++) {
>>>> +               if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>> +                       DRM_ERROR("timeline syncobj cannot reset!\n");
>>>> +                       ret = -EINVAL;
>>>> +                       goto out;
>>>> +               }
>>>> +               drm_syncobj_timeline_fini(syncobjs[i],
>>>> +
>>>> &syncobjs[i]->syncobj_timeline);
>>>> +               drm_syncobj_timeline_init(syncobjs[i],
>>>> +
>>>> &syncobjs[i]->syncobj_timeline);
>>>> +       }
>>>> +out:
>>>>         drm_syncobj_array_free(syncobjs, args->count_handles);
>>>> -       return 0;
>>>> +       return ret;
>>>>    }
>>>>    int
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>> index 0a8d2d64f380..579e91a5858b 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>> @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
>>>>                 if (!(flags & I915_EXEC_FENCE_WAIT))
>>>>                         continue;
>>>> -               fence = drm_syncobj_fence_get(syncobj);
>>>> +               drm_syncobj_search_fence(syncobj, 0,
>>>> +
>>>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>>> +                                        &fence);
>>>>                 if (!fence)
>>>>                         return -EINVAL;
>>>> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
>>>> index 425432b85a87..9535521e6623 100644
>>>> --- a/include/drm/drm_syncobj.h
>>>> +++ b/include/drm/drm_syncobj.h
>>>> @@ -30,6 +30,25 @@
>>>>    struct drm_syncobj_cb;
>>>> +enum drm_syncobj_type {
>>>> +       DRM_SYNCOBJ_TYPE_NORMAL,
>>>> +       DRM_SYNCOBJ_TYPE_TIMELINE
>>>> +};
>>>> +
>>>> +struct drm_syncobj_timeline {
>>>> +       wait_queue_head_t       wq;
>>>> +       u64 timeline_context;
>>>> +       /**
>>>> +        * @timeline: syncobj timeline
>>>> +        */
>>>> +       u64 timeline;
>>>> +       u64 signal_point;
>>>> +
>>>> +
>>>> +       struct rb_root wait_pt_tree;
>>>> +       struct list_head signal_pt_list;
>>>> +};
>>>> +
>>>>    /**
>>>>     * struct drm_syncobj - sync object.
>>>>     *
>>>> @@ -41,19 +60,19 @@ struct drm_syncobj {
>>>>          */
>>>>         struct kref refcount;
>>>>         /**
>>>> -        * @fence:
>>>> -        * NULL or a pointer to the fence bound to this object.
>>>> -        *
>>>> -        * This field should not be used directly. Use
>>>> drm_syncobj_fence_get()
>>>> -        * and drm_syncobj_replace_fence() instead.
>>>> +        * @type: indicate syncobj type
>>>>          */
>>>> -       struct dma_fence __rcu *fence;
>>>> +       enum drm_syncobj_type type;
>>>>         /**
>>>> -        * @cb_list: List of callbacks to call when the &fence gets
>>>> replaced.
>>>> +        * @syncobj_timeline: timeline
>>>>          */
>>>> +       struct drm_syncobj_timeline syncobj_timeline;
>>>> +       /**
>>>> +         * @cb_list: List of callbacks to call when the &fence gets
>>>> replaced.
>>>> +         */
>>>>         struct list_head cb_list;
>>>>         /**
>>>> -        * @lock: Protects &cb_list and write-locks &fence.
>>>> +        * @lock: Protects syncobj list and write-locks &fence.
>>>>          */
>>>>         spinlock_t lock;
>>>>         /**
>>>> @@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct drm_syncobj
>>>> *syncobj,
>>>>    /**
>>>>     * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
>>>>     * @node: used by drm_syncob_add_callback to append this struct to
>>>> - *       &drm_syncobj.cb_list
>>>> + *       &drm_syncobj.cb_list
>>>>     * @func: drm_syncobj_func_t to call
>>>>     *
>>>>     * This struct will be initialized by drm_syncobj_add_callback,
>>>> additional
>>>> @@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
>>>>         kref_put(&obj->refcount, drm_syncobj_free);
>>>>    }
>>>> -/**
>>>> - * drm_syncobj_fence_get - get a reference to a fence in a sync object
>>>> - * @syncobj: sync object.
>>>> - *
>>>> - * This acquires additional reference to &drm_syncobj.fence contained
>>>> in @obj,
>>>> - * if not NULL. It is illegal to call this without already holding a
>>>> reference.
>>>> - * No locks required.
>>>> - *
>>>> - * Returns:
>>>> - * Either the fence of @obj or NULL if there's none.
>>>> - */
>>>> -static inline struct dma_fence *
>>>> -drm_syncobj_fence_get(struct drm_syncobj *syncobj)
>>>> -{
>>>> -       struct dma_fence *fence;
>>>> -
>>>> -       rcu_read_lock();
>>>> -       fence = dma_fence_get_rcu_safe(&syncobj->fence);
>>>> -       rcu_read_unlock();
>>>> -
>>>> -       return fence;
>>>> -}
>>>> -
>>>>    struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
>>>>                                      u32 handle);
>>>>    void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64
>>>> point,
>>>> @@ -142,5 +138,7 @@ int drm_syncobj_create(struct drm_syncobj
>>>> **out_syncobj, uint32_t flags,
>>>>    int drm_syncobj_get_handle(struct drm_file *file_private,
>>>>                            struct drm_syncobj *syncobj, u32 *handle);
>>>>    int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
>>>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
>>>> +                            u64 flags, struct dma_fence **fence);
>>>>    #endif
>>>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
>>>> index 300f336633f2..cebdb2541eb7 100644
>>>> --- a/include/uapi/drm/drm.h
>>>> +++ b/include/uapi/drm/drm.h
>>>> @@ -717,6 +717,7 @@ struct drm_prime_handle {
>>>>    struct drm_syncobj_create {
>>>>         __u32 handle;
>>>>    #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
>>>> +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
>>>>         __u32 flags;
>>>>    };
>
>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
       [not found]             ` <CAKMK7uH9jLO0bJe8ek-JyK3u1jbKY6Jk6xgT9ifgcko7Hi2URA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2018-09-14 18:27               ` Christian König
  0 siblings, 0 replies; 12+ messages in thread
From: Christian König @ 2018-09-14 18:27 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Chunming Zhou, Daniel Rakos, amd-gfx list, dri-devel, Dave Airlie

Am 14.09.2018 um 20:24 schrieb Daniel Vetter:
> On Fri, Sep 14, 2018 at 6:43 PM, Christian König
> <christian.koenig@amd.com> wrote:
>> Am 14.09.2018 um 18:10 schrieb Daniel Vetter:
>>> On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
>>>> Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
>>>>> This patch is for VK_KHR_timeline_semaphore extension, semaphore is
>>>>> called syncobj in kernel side:
>>>>> This extension introduces a new type of syncobj that has an integer
>>>>> payload
>>>>> identifying a point in a timeline. Such timeline syncobjs support the
>>>>> following operations:
>>>>>       * CPU query - A host operation that allows querying the payload of
>>>>> the
>>>>>         timeline syncobj.
>>>>>       * CPU wait - A host operation that allows a blocking wait for a
>>>>>         timeline syncobj to reach a specified value.
>>>>>       * Device wait - A device operation that allows waiting for a
>>>>>         timeline syncobj to reach a specified value.
>>>>>       * Device signal - A device operation that allows advancing the
>>>>>         timeline syncobj to a specified value.
>>>>>
>>>>> Since it's a timeline, that means the front time point(PT) always is
>>>>> signaled before the late PT.
>>>>> a. signal PT design:
>>>>> Signal PT fence N depends on PT[N-1] fence and signal opertion fence,
>>>>> when PT[N] fence is signaled,
>>>>> the timeline will increase to value of PT[N].
>>>>> b. wait PT design:
>>>>> Wait PT fence is signaled by reaching timeline point value, when
>>>>> timeline is increasing, will compare
>>>>> wait PTs value with new timeline value, if PT value is lower than
>>>>> timeline value, then wait PT will be
>>>>> signaled, otherwise keep in list. syncobj wait operation can wait on any
>>>>> point of timeline,
>>>>> so need a RB tree to order them. And wait PT could ahead of signal PT,
>>>>> we need a sumission fence to
>>>>> perform that.
>>>>>
>>>>> v2:
>>>>> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
>>>>> 2. move unexposed denitions to .c file. (Daniel Vetter)
>>>>> 3. split up the change to drm_syncobj_find_fence() in a separate patch.
>>>>> (Christian)
>>>>> 4. split up the change to drm_syncobj_replace_fence() in a separate
>>>>> patch.
>>>>> 5. drop the submission_fence implementation and instead use wait_event()
>>>>> for that. (Christian)
>>>>> 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
>>>>>
>>>>> v3:
>>>>> 1. replace normal syncobj with timeline implemenation. (Vetter and
>>>>> Christian)
>>>>>        a. normal syncobj signal op will create a signal PT to tail of
>>>>> signal pt list.
>>>>>        b. normal syncobj wait op will create a wait pt with last signal
>>>>> point, and this wait PT is only signaled by related signal point PT.
>>>>> 2. many bug fix and clean up
>>>>> 3. stub fence moving is moved to other patch.
>>>>>
>>>>> v4:
>>>>> 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
>>>>> 2. fix syncobj lifecycle. (Christian)
>>>>> 3. only enable_signaling when there is wait_pt. (Christian)
>>>>> 4. fix timeline path issues.
>>>>> 5. write a timeline test in libdrm
>>>>>
>>>>> v5: (Christian)
>>>>> 1. semaphore is called syncobj in kernel side.
>>>>> 2. don't need 'timeline' characters in some function name.
>>>>> 3. keep syncobj cb
>>>>>
>>>>> normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
>>>>> timeline syncobj is tested by ./amdgpu_test -s 9
>>>>>
>>>>> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
>>>>> Cc: Christian Konig <christian.koenig@amd.com>
>>>>> Cc: Dave Airlie <airlied@redhat.com>
>>>>> Cc: Daniel Rakos <Daniel.Rakos@amd.com>
>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>> At least on first glance that looks like it should work, going to do a
>>>> detailed review on Monday.
>>> Just for my understanding, it's all condensed down to 1 patch now? I kinda
>>> didn't follow the detailed discussion last few days at all :-/
>>
>> I've already committed all the cleanup/fix prerequisites to drm-misc-next.
>>
>> The driver specific implementation needs to come on top and maybe a new CPU
>> wait IOCTL.
>>
>> But essentially this patch is just the core of the kernel implementation.
> Ah cool, missed that.
>
>>> Also, is there a testcase, igt highly preferred (because then we'll run it
>>> in our intel-gfx CI, and a bunch of people outside of intel have already
>>> discovered that and are using it).
>>
>> libdrm patches and I think amdgpu based test cases where already published
>> as well.
>>
>> Not sure about igt testcases.
> I guess we can write them when the intel implementation shows up. Just
> kinda still hoping that we'd have a more unfified test suite. And not
> really well-kept secret: We do have an amdgpu in our CI, in the form
> of kbl-g :-) But unfortunately it's not running the full test set for
> patches (only for drm-tip). But we could perhaps run more of the
> amdgpu tests somehow, if there's serious interest.

Well I wouldn't mind if we sooner or later get rid of the amdgpu unit 
tests in libdrm.

They are more or less just a really bloody mess.

Christian.

>
> Cheers, Daniel
>
>
>> Christian.
>>
>>
>>> Thanks, Daniel
>>>
>>>> Christian.
>>>>
>>>>> ---
>>>>>     drivers/gpu/drm/drm_syncobj.c              | 294
>>>>> ++++++++++++++++++---
>>>>>     drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
>>>>>     include/drm/drm_syncobj.h                  |  62 +++--
>>>>>     include/uapi/drm/drm.h                     |   1 +
>>>>>     4 files changed, 292 insertions(+), 69 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>>> index e9ce623d049e..e78d076f2703 100644
>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>>> @@ -56,6 +56,9 @@either
>>>>>     #include "drm_internal.h"
>>>>>     #include <drm/drm_syncobj.h>
>>>>> +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
>>>>> +#define DRM_SYNCOBJ_NORMAL_POINT 1
>>>>> +
>>>>>     struct drm_syncobj_stub_fence {
>>>>>          struct dma_fence base;
>>>>>          spinlock_t lock;
>>>>> @@ -82,6 +85,11 @@ static const struct dma_fence_ops
>>>>> drm_syncobj_stub_fence_ops = {
>>>>>          .release = drm_syncobj_stub_fence_release,
>>>>>     };
>>>>> +struct drm_syncobj_signal_pt {
>>>>> +       struct dma_fence_array *base;
>>>>> +       u64    value;
>>>>> +       struct list_head list;
>>>>> +};
>>>>>     /**
>>>>>      * drm_syncobj_find - lookup and reference a sync object.
>>>>> @@ -124,7 +132,7 @@ static int
>>>>> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>>>>>     {
>>>>>          int ret;
>>>>> -       *fence = drm_syncobj_fence_get(syncobj);
>>>>> +       ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
>>>>>          if (*fence)
>>>>>                  return 1;
>>>>> @@ -133,10 +141,10 @@ static int
>>>>> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>>>>>           * have the lock, try one more time just to be sure we don't add
>>>>> a
>>>>>           * callback when a fence has already been set.
>>>>>           */
>>>>> -       if (syncobj->fence) {
>>>>> -               *fence =
>>>>> dma_fence_get(rcu_dereference_protected(syncobj->fence,
>>>>> -
>>>>> lockdep_is_held(&syncobj->lock)));
>>>>> -               ret = 1;
>>>>> +       if (fence) {
>>>>> +               drm_syncobj_search_fence(syncobj, 0, 0, fence);
>>>>> +               if (*fence)
>>>>> +                       ret = 1;
>>>>>          } else {
>>>>>                  *fence = NULL;
>>>>>                  drm_syncobj_add_callback_locked(syncobj, cb, func);
>>>>> @@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct
>>>>> drm_syncobj *syncobj,
>>>>>          spin_unlock(&syncobj->lock);
>>>>>     }
>>>>> +static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
>>>>> +                                     struct drm_syncobj_timeline
>>>>> *syncobj_timeline)
>>>>> +{
>>>>> +       spin_lock(&syncobj->lock);
>>>>> +       syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
>>>>> +       syncobj_timeline->timeline = 0;
>>>>> +       syncobj_timeline->signal_point = 0;
>>>>> +       init_waitqueue_head(&syncobj_timeline->wq);
>>>>> +
>>>>> +       INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
>>>>> +       spin_unlock(&syncobj->lock);
>>>>> +}
>>>>> +
>>>>> +static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
>>>>> +                                     struct drm_syncobj_timeline
>>>>> *syncobj_timeline)
>>>>> +{
>>>>> +       struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
>>>>> +
>>>>> +       spin_lock(&syncobj->lock);
>>>>> +       list_for_each_entry_safe(signal_pt, tmp,
>>>>> +                                &syncobj_timeline->signal_pt_list,
>>>>> list) {
>>>>> +               list_del(&signal_pt->list);
>>>>> +               dma_fence_put(&signal_pt->base->base);
>>>>> +               kfree(signal_pt);
>>>>> +       }
>>>>> +       spin_unlock(&syncobj->lock);
>>>>> +}
>>>>> +
>>>>> +static struct dma_fence
>>>>> +*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
>>>>> +                                     uint64_t point)
>>>>> +{
>>>>> +       struct drm_syncobj_timeline *timeline =
>>>>> &syncobj->syncobj_timeline;
>>>>> +       struct drm_syncobj_signal_pt *signal_pt;
>>>>> +
>>>>> +       if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
>>>>> +           (point <= timeline->timeline)) {
>>>>> +               struct drm_syncobj_stub_fence *fence =
>>>>> +                       kzalloc(sizeof(struct drm_syncobj_stub_fence),
>>>>> +                               GFP_KERNEL);
>>>>> +
>>>>> +               if (!fence)
>>>>> +                       return NULL;
>>>>> +               spin_lock_init(&fence->lock);
>>>>> +               dma_fence_init(&fence->base,
>>>>> +                              &drm_syncobj_stub_fence_ops,
>>>>> +                              &fence->lock,
>>>>> +
>>>>> syncobj->syncobj_timeline.timeline_context,
>>>>> +                              point);
>>>>> +
>>>>> +               dma_fence_signal(&fence->base);
>>>>> +               return &fence->base;
>>>>> +       }
>>>>> +
>>>>> +       list_for_each_entry(signal_pt, &timeline->signal_pt_list, list)
>>>>> {
>>>>> +               if (point > signal_pt->value)
>>>>> +                       continue;
>>>>> +               if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
>>>>> +                   (point != signal_pt->value))
>>>>> +                       continue;
>>>>> +               return dma_fence_get(&signal_pt->base->base);
>>>>> +       }
>>>>> +       return NULL;
>>>>> +}
>>>>> +
>>>>> +static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
>>>>> +                                       struct dma_fence *fence,
>>>>> +                                       u64 point)
>>>>> +{
>>>>> +       struct drm_syncobj_signal_pt *signal_pt =
>>>>> +               kzalloc(sizeof(struct drm_syncobj_signal_pt),
>>>>> GFP_KERNEL);
>>>>> +       struct drm_syncobj_signal_pt *tail_pt;
>>>>> +       struct dma_fence **fences;
>>>>> +       int num_fences = 0;
>>>>> +       int ret = 0, i;
>>>>> +
>>>>> +       if (!signal_pt)
>>>>> +               return -ENOMEM;
>>>>> +       if (syncobj->syncobj_timeline.signal_point >= point) {
>>>>> +               DRM_WARN("A later signal is ready!");
>>>>> +               goto out;
>>>>> +       }
>>>>> +       if (!fence)
>>>>> +               goto out;
>>>>> +
>>>>> +       fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
>>>>> +       if (!fences)
>>>>> +               goto out;
>>>>> +       fences[num_fences++] = dma_fence_get(fence);
>>>>> +       /* timeline syncobj must take this dependency */
>>>>> +       if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>> +               spin_lock(&syncobj->lock);
>>>>> +               if
>>>>> (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
>>>>> +                       tail_pt =
>>>>> list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
>>>>> +                                                 struct
>>>>> drm_syncobj_signal_pt, list);
>>>>> +                       fences[num_fences++] =
>>>>> dma_fence_get(&tail_pt->base->base);
>>>>> +               }
>>>>> +               spin_unlock(&syncobj->lock);
>>>>> +       }
>>>>> +       signal_pt->base = dma_fence_array_create(num_fences, fences,
>>>>> +
>>>>> syncobj->syncobj_timeline.timeline_context,
>>>>> +                                                point, false);
>>>>> +       if (!signal_pt->base)
>>>>> +               goto fail;
>>>>> +
>>>>> +       spin_lock(&syncobj->lock);
>>>>> +       signal_pt->value = point;
>>>>> +       INIT_LIST_HEAD(&signal_pt->list);
>>>>> +       list_add_tail(&signal_pt->list,
>>>>> &syncobj->syncobj_timeline.signal_pt_list);
>>>>> +       syncobj->syncobj_timeline.signal_point = point;
>>>>> +       spin_unlock(&syncobj->lock);
>>>>> +       wake_up_all(&syncobj->syncobj_timeline.wq);
>>>>> +
>>>>> +       return 0;
>>>>> +fail:
>>>>> +       for (i = 0; i < num_fences; i++)
>>>>> +               dma_fence_put(fences[i]);
>>>>> +       kfree(fences);
>>>>> +out:
>>>>> +       kfree(signal_pt);
>>>>> +       return ret;
>>>>> +}
>>>>> +
>>>>> +static void drm_syncobj_garbage_collection(struct drm_syncobj *syncobj)
>>>>> +{
>>>>> +       struct drm_syncobj_timeline *timeline =
>>>>> &syncobj->syncobj_timeline;
>>>>> +       struct drm_syncobj_signal_pt *signal_pt, *tmp;
>>>>> +
>>>>> +       spin_lock(&syncobj->lock);
>>>>> +       list_for_each_entry_safe(signal_pt, tmp,
>>>>> +                                &timeline->signal_pt_list, list) {
>>>>> +               if (dma_fence_is_signaled(&signal_pt->base->base)) {
>>>>> +                       timeline->timeline = signal_pt->value;
>>>>> +                       list_del(&signal_pt->list);
>>>>> +                       dma_fence_put(&signal_pt->base->base);
>>>>> +                       kfree(signal_pt);
>>>>> +               } else {
>>>>> +                       /*signal_pt is in order in list, from small to
>>>>> big, so
>>>>> +                        * the later must not be signal either */
>>>>> +                       break;
>>>>> +               }
>>>>> +       }
>>>>> +
>>>>> +       spin_unlock(&syncobj->lock);
>>>>> +}
>>>>>     /**
>>>>>      * drm_syncobj_replace_fence - replace fence in a sync object.
>>>>>      * @syncobj: Sync object to replace fence in
>>>>> @@ -176,28 +329,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj
>>>>> *syncobj,
>>>>>                                 u64 point,
>>>>>                                 struct dma_fence *fence)
>>>>>     {
>>>>> -       struct dma_fence *old_fence;
>>>>> -       struct drm_syncobj_cb *cur, *tmp;
>>>>> -
>>>>> -       if (fence)
>>>>> -               dma_fence_get(fence);
>>>>> -
>>>>> -       spin_lock(&syncobj->lock);
>>>>> -
>>>>> -       old_fence = rcu_dereference_protected(syncobj->fence,
>>>>> -
>>>>> lockdep_is_held(&syncobj->lock));
>>>>> -       rcu_assign_pointer(syncobj->fence, fence);
>>>>> +       drm_syncobj_garbage_collection(syncobj);
>>>>> +       if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>> +               if (fence)
>>>>> +                       drm_syncobj_create_signal_pt(syncobj, fence,
>>>>> point);
>>>>> +       } else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>>>>> +               u64 pt_value;
>>>>> +
>>>>> +               if (!fence) {
>>>>> +                       drm_syncobj_timeline_fini(syncobj,
>>>>> +
>>>>> &syncobj->syncobj_timeline);
>>>>> +                       drm_syncobj_timeline_init(syncobj,
>>>>> +
>>>>> &syncobj->syncobj_timeline);
>>>>> +                       return;
>>>>> +               }
>>>>> +               pt_value = syncobj->syncobj_timeline.signal_point +
>>>>> +                       DRM_SYNCOBJ_NORMAL_POINT;
>>>>> +               drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
>>>>> +       } else {
>>>>> +               DRM_ERROR("the syncobj type isn't support\n");
>>>>> +               return;
>>>>> +       }
>>>>> +       if (fence) {
>>>>> +               struct drm_syncobj_cb *cur, *tmp;
>>>>> -       if (fence != old_fence) {
>>>>> +               spin_lock(&syncobj->lock);
>>>>>                  list_for_each_entry_safe(cur, tmp, &syncobj->cb_list,
>>>>> node) {
>>>>>                          list_del_init(&cur->node);
>>>>>                          cur->func(syncobj, cur);
>>>>>                  }
>>>>> +               spin_unlock(&syncobj->lock);
>>>>>          }
>>>>> -
>>>>> -       spin_unlock(&syncobj->lock);
>>>>> -
>>>>> -       dma_fence_put(old_fence);
>>>>>     }
>>>>>     EXPORT_SYMBOL(drm_syncobj_replace_fence);
>>>>> @@ -220,6 +382,25 @@ static int drm_syncobj_assign_null_handle(struct
>>>>> drm_syncobj *syncobj)
>>>>>          return 0;
>>>>>     }
>>>>> +static struct dma_fence *
>>>>> +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64
>>>>> flags)
>>>>> +{
>>>>> +       struct dma_fence *fence;
>>>>> +       int ret = 0;
>>>>> +
>>>>> +       if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>>>> +               ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
>>>>> +                                       point <=
>>>>> syncobj->syncobj_timeline.signal_point,
>>>>> +                                       msecs_to_jiffies(10000)); /*
>>>>> wait 10s */
>>>>> +               if (ret <= 0)
>>>>> +                       return NULL;
>>>>> +       }
>>>>> +       spin_lock(&syncobj->lock);
>>>>> +       fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
>>>>> +       spin_unlock(&syncobj->lock);
>>>>> +       return fence;
>>>>> +}
>>>>> +
>>>>>     /**
>>>>>      * drm_syncobj_find_fence - lookup and reference the fence in a sync
>>>>> object
>>>>>      * @file_private: drm file private pointer
>>>>> @@ -234,20 +415,46 @@ static int drm_syncobj_assign_null_handle(struct
>>>>> drm_syncobj *syncobj)
>>>>>      * contains a reference to the fence, which must be released by
>>>>> calling
>>>>>      * dma_fence_put().
>>>>>      */
>>>>> -int drm_syncobj_find_fence(struct drm_file *file_private,
>>>>> -                          u32 handle, u64 point,
>>>>> -                          struct dma_fence **fence)
>>>>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
>>>>> +                            u64 flags, struct dma_fence **fence)
>>>>>     {
>>>>> -       struct drm_syncobj *syncobj = drm_syncobj_find(file_private,
>>>>> handle);
>>>>>          int ret = 0;
>>>>>          if (!syncobj)
>>>>>                  return -ENOENT;
>>>>> -       *fence = drm_syncobj_fence_get(syncobj);
>>>>> +       drm_syncobj_garbage_collection(syncobj);
>>>>> +       if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>>>>> +               /*NORMAL syncobj always wait on last pt */
>>>>> +               u64 tail_pt_value =
>>>>> syncobj->syncobj_timeline.signal_point;
>>>>> +
>>>>> +               if (tail_pt_value == 0)
>>>>> +                       tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
>>>>> +               /* NORMAL syncobj doesn't care point value */
>>>>> +               WARN_ON(point != 0);
>>>>> +               *fence = drm_syncobj_point_get(syncobj, tail_pt_value,
>>>>> +                                                       flags);
>>>>> +       } else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>> +               *fence = drm_syncobj_point_get(syncobj, point,
>>>>> +                                                       flags);
>>>>> +       } else {
>>>>> +               DRM_ERROR("Don't support this type syncobj\n");
>>>>> +               *fence = NULL;
>>>>> +       }
>>>>>          if (!*fence) {
>>>>>                  ret = -EINVAL;
>>>>>          }
>>>>> +       return ret;
>>>>> +}
>>>>> +EXPORT_SYMBOL(drm_syncobj_search_fence);
>>>>> +int drm_syncobj_find_fence(struct drm_file *file_private,
>>>>> +                          u32 handle, u64 point,
>>>>> +                          struct dma_fence **fence) {
>>>>> +       struct drm_syncobj *syncobj = drm_syncobj_find(file_private,
>>>>> handle);
>>>>> +
>>>>> +       int ret = drm_syncobj_search_fence(syncobj, point,
>>>>> +
>>>>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>>>> +                                       fence);
>>>>>          drm_syncobj_put(syncobj);
>>>>>          return ret;
>>>>>     }
>>>>> @@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
>>>>>          struct drm_syncobj *syncobj = container_of(kref,
>>>>>                                                     struct drm_syncobj,
>>>>>                                                     refcount);
>>>>> -       drm_syncobj_replace_fence(syncobj, 0, NULL);
>>>>> +       drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
>>>>>          kfree(syncobj);
>>>>>     }
>>>>>     EXPORT_SYMBOL(drm_syncobj_free);
>>>>> @@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj
>>>>> **out_syncobj, uint32_t flags,
>>>>>          kref_init(&syncobj->refcount);
>>>>>          INIT_LIST_HEAD(&syncobj->cb_list);
>>>>>          spin_lock_init(&syncobj->lock);
>>>>> +       if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
>>>>> +               syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
>>>>> +       else
>>>>> +               syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
>>>>> +       drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
>>>>>          if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
>>>>>                  ret = drm_syncobj_assign_null_handle(syncobj);
>>>>> @@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device *dev,
>>>>> void *data,
>>>>>                  return -ENODEV;
>>>>>          /* no valid flags yet */
>>>>> -       if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
>>>>> +       if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
>>>>> +                           DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
>>>>>                  return -EINVAL;
>>>>>          return drm_syncobj_create_as_handle(file_private,
>>>>> @@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct
>>>>> drm_syncobj *syncobj,
>>>>>          struct syncobj_wait_entry *wait =
>>>>>                  container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>>>>> -       /* This happens inside the syncobj lock */
>>>>> -       wait->fence =
>>>>> dma_fence_get(rcu_dereference_protected(syncobj->fence,
>>>>> -
>>>>> lockdep_is_held(&syncobj->lock)));
>>>>> +       drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
>>>>> +
>>>>>          wake_up_process(wait->task);
>>>>>     }
>>>>> @@ -698,7 +910,8 @@ static signed long
>>>>> drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>>>>>          signaled_count = 0;
>>>>>          for (i = 0; i < count; ++i) {
>>>>>                  entries[i].task = current;
>>>>> -               entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
>>>>> +               ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
>>>>> +                                              &entries[i].fence);
>>>>>                  if (!entries[i].fence) {
>>>>>                          if (flags &
>>>>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>>>>                                  continue;
>>>>> @@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device *dev,
>>>>> void *data,
>>>>>          if (ret < 0)
>>>>>                  return ret;
>>>>> -       for (i = 0; i < args->count_handles; i++)
>>>>> -               drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
>>>>> -
>>>>> +       for (i = 0; i < args->count_handles; i++) {
>>>>> +               if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>> +                       DRM_ERROR("timeline syncobj cannot reset!\n");
>>>>> +                       ret = -EINVAL;
>>>>> +                       goto out;
>>>>> +               }
>>>>> +               drm_syncobj_timeline_fini(syncobjs[i],
>>>>> +
>>>>> &syncobjs[i]->syncobj_timeline);
>>>>> +               drm_syncobj_timeline_init(syncobjs[i],
>>>>> +
>>>>> &syncobjs[i]->syncobj_timeline);
>>>>> +       }
>>>>> +out:
>>>>>          drm_syncobj_array_free(syncobjs, args->count_handles);
>>>>> -       return 0;
>>>>> +       return ret;
>>>>>     }
>>>>>     int
>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>>> index 0a8d2d64f380..579e91a5858b 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>>> @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
>>>>>                  if (!(flags & I915_EXEC_FENCE_WAIT))
>>>>>                          continue;
>>>>> -               fence = drm_syncobj_fence_get(syncobj);
>>>>> +               drm_syncobj_search_fence(syncobj, 0,
>>>>> +
>>>>> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>>>> +                                        &fence);
>>>>>                  if (!fence)
>>>>>                          return -EINVAL;
>>>>> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
>>>>> index 425432b85a87..9535521e6623 100644
>>>>> --- a/include/drm/drm_syncobj.h
>>>>> +++ b/include/drm/drm_syncobj.h
>>>>> @@ -30,6 +30,25 @@
>>>>>     struct drm_syncobj_cb;
>>>>> +enum drm_syncobj_type {
>>>>> +       DRM_SYNCOBJ_TYPE_NORMAL,
>>>>> +       DRM_SYNCOBJ_TYPE_TIMELINE
>>>>> +};
>>>>> +
>>>>> +struct drm_syncobj_timeline {
>>>>> +       wait_queue_head_t       wq;
>>>>> +       u64 timeline_context;
>>>>> +       /**
>>>>> +        * @timeline: syncobj timeline
>>>>> +        */
>>>>> +       u64 timeline;
>>>>> +       u64 signal_point;
>>>>> +
>>>>> +
>>>>> +       struct rb_root wait_pt_tree;
>>>>> +       struct list_head signal_pt_list;
>>>>> +};
>>>>> +
>>>>>     /**
>>>>>      * struct drm_syncobj - sync object.
>>>>>      *
>>>>> @@ -41,19 +60,19 @@ struct drm_syncobj {
>>>>>           */
>>>>>          struct kref refcount;
>>>>>          /**
>>>>> -        * @fence:
>>>>> -        * NULL or a pointer to the fence bound to this object.
>>>>> -        *
>>>>> -        * This field should not be used directly. Use
>>>>> drm_syncobj_fence_get()
>>>>> -        * and drm_syncobj_replace_fence() instead.
>>>>> +        * @type: indicate syncobj type
>>>>>           */
>>>>> -       struct dma_fence __rcu *fence;
>>>>> +       enum drm_syncobj_type type;
>>>>>          /**
>>>>> -        * @cb_list: List of callbacks to call when the &fence gets
>>>>> replaced.
>>>>> +        * @syncobj_timeline: timeline
>>>>>           */
>>>>> +       struct drm_syncobj_timeline syncobj_timeline;
>>>>> +       /**
>>>>> +         * @cb_list: List of callbacks to call when the &fence gets
>>>>> replaced.
>>>>> +         */
>>>>>          struct list_head cb_list;
>>>>>          /**
>>>>> -        * @lock: Protects &cb_list and write-locks &fence.
>>>>> +        * @lock: Protects syncobj list and write-locks &fence.
>>>>>           */
>>>>>          spinlock_t lock;
>>>>>          /**
>>>>> @@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct drm_syncobj
>>>>> *syncobj,
>>>>>     /**
>>>>>      * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
>>>>>      * @node: used by drm_syncob_add_callback to append this struct to
>>>>> - *       &drm_syncobj.cb_list
>>>>> + *       &drm_syncobj.cb_list
>>>>>      * @func: drm_syncobj_func_t to call
>>>>>      *
>>>>>      * This struct will be initialized by drm_syncobj_add_callback,
>>>>> additional
>>>>> @@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
>>>>>          kref_put(&obj->refcount, drm_syncobj_free);
>>>>>     }
>>>>> -/**
>>>>> - * drm_syncobj_fence_get - get a reference to a fence in a sync object
>>>>> - * @syncobj: sync object.
>>>>> - *
>>>>> - * This acquires additional reference to &drm_syncobj.fence contained
>>>>> in @obj,
>>>>> - * if not NULL. It is illegal to call this without already holding a
>>>>> reference.
>>>>> - * No locks required.
>>>>> - *
>>>>> - * Returns:
>>>>> - * Either the fence of @obj or NULL if there's none.
>>>>> - */
>>>>> -static inline struct dma_fence *
>>>>> -drm_syncobj_fence_get(struct drm_syncobj *syncobj)
>>>>> -{
>>>>> -       struct dma_fence *fence;
>>>>> -
>>>>> -       rcu_read_lock();
>>>>> -       fence = dma_fence_get_rcu_safe(&syncobj->fence);
>>>>> -       rcu_read_unlock();
>>>>> -
>>>>> -       return fence;
>>>>> -}
>>>>> -
>>>>>     struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
>>>>>                                       u32 handle);
>>>>>     void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64
>>>>> point,
>>>>> @@ -142,5 +138,7 @@ int drm_syncobj_create(struct drm_syncobj
>>>>> **out_syncobj, uint32_t flags,
>>>>>     int drm_syncobj_get_handle(struct drm_file *file_private,
>>>>>                             struct drm_syncobj *syncobj, u32 *handle);
>>>>>     int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
>>>>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
>>>>> +                            u64 flags, struct dma_fence **fence);
>>>>>     #endif
>>>>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
>>>>> index 300f336633f2..cebdb2541eb7 100644
>>>>> --- a/include/uapi/drm/drm.h
>>>>> +++ b/include/uapi/drm/drm.h
>>>>> @@ -717,6 +717,7 @@ struct drm_prime_handle {
>>>>>     struct drm_syncobj_create {
>>>>>          __u32 handle;
>>>>>     #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
>>>>> +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
>>>>>          __u32 flags;
>>>>>     };
>>
>
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [PATCH] [RFC]drm: add syncobj timeline support v5
       [not found]       ` <20180914161059.GR11082-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
  2018-09-14 16:43         ` Christian König
@ 2018-09-17  2:23         ` Zhou, David(ChunMing)
  1 sibling, 0 replies; 12+ messages in thread
From: Zhou, David(ChunMing) @ 2018-09-17  2:23 UTC (permalink / raw)
  To: Daniel Vetter, Koenig, Christian
  Cc: Dave Airlie, Rakos, Daniel,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



> -----Original Message-----
> From: Daniel Vetter <daniel.vetter@ffwll.ch> On Behalf Of Daniel Vetter
> Sent: Saturday, September 15, 2018 12:11 AM
> To: Koenig, Christian <Christian.Koenig@amd.com>
> Cc: Zhou, David(ChunMing) <David1.Zhou@amd.com>; dri-
> devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org; Dave Airlie
> <airlied@redhat.com>; Rakos, Daniel <Daniel.Rakos@amd.com>; Daniel
> Vetter <daniel@ffwll.ch>
> Subject: Re: [PATCH] [RFC]drm: add syncobj timeline support v5
> 
> On Fri, Sep 14, 2018 at 12:49:45PM +0200, Christian König wrote:
> > Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
> > > This patch is for VK_KHR_timeline_semaphore extension, semaphore is
> called syncobj in kernel side:
> > > This extension introduces a new type of syncobj that has an integer
> > > payload identifying a point in a timeline. Such timeline syncobjs
> > > support the following operations:
> > >     * CPU query - A host operation that allows querying the payload of the
> > >       timeline syncobj.
> > >     * CPU wait - A host operation that allows a blocking wait for a
> > >       timeline syncobj to reach a specified value.
> > >     * Device wait - A device operation that allows waiting for a
> > >       timeline syncobj to reach a specified value.
> > >     * Device signal - A device operation that allows advancing the
> > >       timeline syncobj to a specified value.
> > >
> > > Since it's a timeline, that means the front time point(PT) always is
> signaled before the late PT.
> > > a. signal PT design:
> > > Signal PT fence N depends on PT[N-1] fence and signal opertion
> > > fence, when PT[N] fence is signaled, the timeline will increase to value of
> PT[N].
> > > b. wait PT design:
> > > Wait PT fence is signaled by reaching timeline point value, when
> > > timeline is increasing, will compare wait PTs value with new
> > > timeline value, if PT value is lower than timeline value, then wait
> > > PT will be signaled, otherwise keep in list. syncobj wait operation
> > > can wait on any point of timeline, so need a RB tree to order them. And
> wait PT could ahead of signal PT, we need a sumission fence to perform that.
> > >
> > > v2:
> > > 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian) 2.
> move
> > > unexposed denitions to .c file. (Daniel Vetter) 3. split up the
> > > change to drm_syncobj_find_fence() in a separate patch. (Christian)
> > > 4. split up the change to drm_syncobj_replace_fence() in a separate
> patch.
> > > 5. drop the submission_fence implementation and instead use
> > > wait_event() for that. (Christian) 6. WARN_ON(point != 0) for NORMAL
> > > type syncobj case. (Daniel Vetter)
> > >
> > > v3:
> > > 1. replace normal syncobj with timeline implemenation. (Vetter and
> Christian)
> > >      a. normal syncobj signal op will create a signal PT to tail of signal pt list.
> > >      b. normal syncobj wait op will create a wait pt with last signal point, and
> this wait PT is only signaled by related signal point PT.
> > > 2. many bug fix and clean up
> > > 3. stub fence moving is moved to other patch.
> > >
> > > v4:
> > > 1. fix RB tree loop with while(node=rb_first(...)). (Christian) 2.
> > > fix syncobj lifecycle. (Christian) 3. only enable_signaling when
> > > there is wait_pt. (Christian) 4. fix timeline path issues.
> > > 5. write a timeline test in libdrm
> > >
> > > v5: (Christian)
> > > 1. semaphore is called syncobj in kernel side.
> > > 2. don't need 'timeline' characters in some function name.
> > > 3. keep syncobj cb
> > >
> > > normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore* timeline
> > > syncobj is tested by ./amdgpu_test -s 9
> > >
> > > Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
> > > Cc: Christian Konig <christian.koenig@amd.com>
> > > Cc: Dave Airlie <airlied@redhat.com>
> > > Cc: Daniel Rakos <Daniel.Rakos@amd.com>
> > > Cc: Daniel Vetter <daniel@ffwll.ch>
> >
> > At least on first glance that looks like it should work, going to do a
> > detailed review on Monday.
> 
> Just for my understanding, it's all condensed down to 1 patch now?

Yes, Christian suggest that.

 >I kinda
> didn't follow the detailed discussion last few days at all :-/
> 
> Also, is there a testcase, igt highly preferred (because then we'll run it in our
> intel-gfx CI, and a bunch of people outside of intel have already discovered
> that and are using it).


I already wrote the test in libdrm unit test, since I'm not familiar with IGT stuff.

Thanks,
David Zhou
> 
> Thanks, Daniel
> 
> >
> > Christian.
> >
> > > ---
> > >   drivers/gpu/drm/drm_syncobj.c              | 294 ++++++++++++++++++---
> > >   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
> > >   include/drm/drm_syncobj.h                  |  62 +++--
> > >   include/uapi/drm/drm.h                     |   1 +
> > >   4 files changed, 292 insertions(+), 69 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/drm_syncobj.c
> > > b/drivers/gpu/drm/drm_syncobj.c index e9ce623d049e..e78d076f2703
> > > 100644
> > > --- a/drivers/gpu/drm/drm_syncobj.c
> > > +++ b/drivers/gpu/drm/drm_syncobj.c
> > > @@ -56,6 +56,9 @@either
> > >   #include "drm_internal.h"
> > >   #include <drm/drm_syncobj.h>
> > > +/* merge normal syncobj to timeline syncobj, the point interval is
> > > +1 */ #define DRM_SYNCOBJ_NORMAL_POINT 1
> > > +
> > >   struct drm_syncobj_stub_fence {
> > >   	struct dma_fence base;
> > >   	spinlock_t lock;
> > > @@ -82,6 +85,11 @@ static const struct dma_fence_ops
> drm_syncobj_stub_fence_ops = {
> > >   	.release = drm_syncobj_stub_fence_release,
> > >   };
> > > +struct drm_syncobj_signal_pt {
> > > +	struct dma_fence_array *base;
> > > +	u64    value;
> > > +	struct list_head list;
> > > +};
> > >   /**
> > >    * drm_syncobj_find - lookup and reference a sync object.
> > > @@ -124,7 +132,7 @@ static int
> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
> > >   {
> > >   	int ret;
> > > -	*fence = drm_syncobj_fence_get(syncobj);
> > > +	ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
> > >   	if (*fence)
> > >   		return 1;
> > > @@ -133,10 +141,10 @@ static int
> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
> > >   	 * have the lock, try one more time just to be sure we don't add a
> > >   	 * callback when a fence has already been set.
> > >   	 */
> > > -	if (syncobj->fence) {
> > > -		*fence =
> dma_fence_get(rcu_dereference_protected(syncobj->fence,
> > > -
> lockdep_is_held(&syncobj->lock)));
> > > -		ret = 1;
> > > +	if (fence) {
> > > +		drm_syncobj_search_fence(syncobj, 0, 0, fence);
> > > +		if (*fence)
> > > +			ret = 1;
> > >   	} else {
> > >   		*fence = NULL;
> > >   		drm_syncobj_add_callback_locked(syncobj, cb, func); @@ -
> 164,6
> > > +172,151 @@ void drm_syncobj_remove_callback(struct drm_syncobj
> *syncobj,
> > >   	spin_unlock(&syncobj->lock);
> > >   }
> > > +static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
> > > +				      struct drm_syncobj_timeline
> *syncobj_timeline) {
> > > +	spin_lock(&syncobj->lock);
> > > +	syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
> > > +	syncobj_timeline->timeline = 0;
> > > +	syncobj_timeline->signal_point = 0;
> > > +	init_waitqueue_head(&syncobj_timeline->wq);
> > > +
> > > +	INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
> > > +	spin_unlock(&syncobj->lock);
> > > +}
> > > +
> > > +static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
> > > +				      struct drm_syncobj_timeline
> *syncobj_timeline) {
> > > +	struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
> > > +
> > > +	spin_lock(&syncobj->lock);
> > > +	list_for_each_entry_safe(signal_pt, tmp,
> > > +				 &syncobj_timeline->signal_pt_list, list) {
> > > +		list_del(&signal_pt->list);
> > > +		dma_fence_put(&signal_pt->base->base);
> > > +		kfree(signal_pt);
> > > +	}
> > > +	spin_unlock(&syncobj->lock);
> > > +}
> > > +
> > > +static struct dma_fence
> > > +*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
> > > +				      uint64_t point)
> > > +{
> > > +	struct drm_syncobj_timeline *timeline = &syncobj-
> >syncobj_timeline;
> > > +	struct drm_syncobj_signal_pt *signal_pt;
> > > +
> > > +	if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
> > > +	    (point <= timeline->timeline)) {
> > > +		struct drm_syncobj_stub_fence *fence =
> > > +			kzalloc(sizeof(struct drm_syncobj_stub_fence),
> > > +				GFP_KERNEL);
> > > +
> > > +		if (!fence)
> > > +			return NULL;
> > > +		spin_lock_init(&fence->lock);
> > > +		dma_fence_init(&fence->base,
> > > +			       &drm_syncobj_stub_fence_ops,
> > > +			       &fence->lock,
> > > +			       syncobj->syncobj_timeline.timeline_context,
> > > +			       point);
> > > +
> > > +		dma_fence_signal(&fence->base);
> > > +		return &fence->base;
> > > +	}
> > > +
> > > +	list_for_each_entry(signal_pt, &timeline->signal_pt_list, list) {
> > > +		if (point > signal_pt->value)
> > > +			continue;
> > > +		if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
> > > +		    (point != signal_pt->value))
> > > +			continue;
> > > +		return dma_fence_get(&signal_pt->base->base);
> > > +	}
> > > +	return NULL;
> > > +}
> > > +
> > > +static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
> > > +					struct dma_fence *fence,
> > > +					u64 point)
> > > +{
> > > +	struct drm_syncobj_signal_pt *signal_pt =
> > > +		kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
> > > +	struct drm_syncobj_signal_pt *tail_pt;
> > > +	struct dma_fence **fences;
> > > +	int num_fences = 0;
> > > +	int ret = 0, i;
> > > +
> > > +	if (!signal_pt)
> > > +		return -ENOMEM;
> > > +	if (syncobj->syncobj_timeline.signal_point >= point) {
> > > +		DRM_WARN("A later signal is ready!");
> > > +		goto out;
> > > +	}
> > > +	if (!fence)
> > > +		goto out;
> > > +
> > > +	fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
> > > +	if (!fences)
> > > +		goto out;
> > > +	fences[num_fences++] = dma_fence_get(fence);
> > > +	/* timeline syncobj must take this dependency */
> > > +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> > > +		spin_lock(&syncobj->lock);
> > > +		if (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
> > > +			tail_pt = list_last_entry(&syncobj-
> >syncobj_timeline.signal_pt_list,
> > > +						  struct
> drm_syncobj_signal_pt, list);
> > > +			fences[num_fences++] = dma_fence_get(&tail_pt-
> >base->base);
> > > +		}
> > > +		spin_unlock(&syncobj->lock);
> > > +	}
> > > +	signal_pt->base = dma_fence_array_create(num_fences, fences,
> > > +						 syncobj-
> >syncobj_timeline.timeline_context,
> > > +						 point, false);
> > > +	if (!signal_pt->base)
> > > +		goto fail;
> > > +
> > > +	spin_lock(&syncobj->lock);
> > > +	signal_pt->value = point;
> > > +	INIT_LIST_HEAD(&signal_pt->list);
> > > +	list_add_tail(&signal_pt->list, &syncobj-
> >syncobj_timeline.signal_pt_list);
> > > +	syncobj->syncobj_timeline.signal_point = point;
> > > +	spin_unlock(&syncobj->lock);
> > > +	wake_up_all(&syncobj->syncobj_timeline.wq);
> > > +
> > > +	return 0;
> > > +fail:
> > > +	for (i = 0; i < num_fences; i++)
> > > +		dma_fence_put(fences[i]);
> > > +	kfree(fences);
> > > +out:
> > > +	kfree(signal_pt);
> > > +	return ret;
> > > +}
> > > +
> > > +static void drm_syncobj_garbage_collection(struct drm_syncobj
> > > +*syncobj) {
> > > +	struct drm_syncobj_timeline *timeline = &syncobj-
> >syncobj_timeline;
> > > +	struct drm_syncobj_signal_pt *signal_pt, *tmp;
> > > +
> > > +	spin_lock(&syncobj->lock);
> > > +	list_for_each_entry_safe(signal_pt, tmp,
> > > +				 &timeline->signal_pt_list, list) {
> > > +		if (dma_fence_is_signaled(&signal_pt->base->base)) {
> > > +			timeline->timeline = signal_pt->value;
> > > +			list_del(&signal_pt->list);
> > > +			dma_fence_put(&signal_pt->base->base);
> > > +			kfree(signal_pt);
> > > +		} else {
> > > +			/*signal_pt is in order in list, from small to big, so
> > > +			 * the later must not be signal either */
> > > +			break;
> > > +		}
> > > +	}
> > > +
> > > +	spin_unlock(&syncobj->lock);
> > > +}
> > >   /**
> > >    * drm_syncobj_replace_fence - replace fence in a sync object.
> > >    * @syncobj: Sync object to replace fence in @@ -176,28 +329,37 @@
> > > void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
> > >   			       u64 point,
> > >   			       struct dma_fence *fence)
> > >   {
> > > -	struct dma_fence *old_fence;
> > > -	struct drm_syncobj_cb *cur, *tmp;
> > > -
> > > -	if (fence)
> > > -		dma_fence_get(fence);
> > > -
> > > -	spin_lock(&syncobj->lock);
> > > -
> > > -	old_fence = rcu_dereference_protected(syncobj->fence,
> > > -					      lockdep_is_held(&syncobj->lock));
> > > -	rcu_assign_pointer(syncobj->fence, fence);
> > > +	drm_syncobj_garbage_collection(syncobj);
> > > +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> > > +		if (fence)
> > > +			drm_syncobj_create_signal_pt(syncobj, fence,
> point);
> > > +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
> > > +		u64 pt_value;
> > > +
> > > +		if (!fence) {
> > > +			drm_syncobj_timeline_fini(syncobj,
> > > +						  &syncobj-
> >syncobj_timeline);
> > > +			drm_syncobj_timeline_init(syncobj,
> > > +						  &syncobj-
> >syncobj_timeline);
> > > +			return;
> > > +		}
> > > +		pt_value = syncobj->syncobj_timeline.signal_point +
> > > +			DRM_SYNCOBJ_NORMAL_POINT;
> > > +		drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
> > > +	} else {
> > > +		DRM_ERROR("the syncobj type isn't support\n");
> > > +		return;
> > > +	}
> > > +	if (fence) {
> > > +		struct drm_syncobj_cb *cur, *tmp;
> > > -	if (fence != old_fence) {
> > > +		spin_lock(&syncobj->lock);
> > >   		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node)
> {
> > >   			list_del_init(&cur->node);
> > >   			cur->func(syncobj, cur);
> > >   		}
> > > +		spin_unlock(&syncobj->lock);
> > >   	}
> > > -
> > > -	spin_unlock(&syncobj->lock);
> > > -
> > > -	dma_fence_put(old_fence);
> > >   }
> > >   EXPORT_SYMBOL(drm_syncobj_replace_fence);
> > > @@ -220,6 +382,25 @@ static int drm_syncobj_assign_null_handle(struct
> drm_syncobj *syncobj)
> > >   	return 0;
> > >   }
> > > +static struct dma_fence *
> > > +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64
> > > +flags) {
> > > +	struct dma_fence *fence;
> > > +	int ret = 0;
> > > +
> > > +	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
> > > +		ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
> > > +					point <= syncobj-
> >syncobj_timeline.signal_point,
> > > +					msecs_to_jiffies(10000)); /* wait 10s
> */
> > > +		if (ret <= 0)
> > > +			return NULL;
> > > +	}
> > > +	spin_lock(&syncobj->lock);
> > > +	fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
> > > +	spin_unlock(&syncobj->lock);
> > > +	return fence;
> > > +}
> > > +
> > >   /**
> > >    * drm_syncobj_find_fence - lookup and reference the fence in a sync
> object
> > >    * @file_private: drm file private pointer @@ -234,20 +415,46 @@
> > > static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
> > >    * contains a reference to the fence, which must be released by calling
> > >    * dma_fence_put().
> > >    */
> > > -int drm_syncobj_find_fence(struct drm_file *file_private,
> > > -			   u32 handle, u64 point,
> > > -			   struct dma_fence **fence)
> > > +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
> > > +			     u64 flags, struct dma_fence **fence)
> > >   {
> > > -	struct drm_syncobj *syncobj = drm_syncobj_find(file_private,
> handle);
> > >   	int ret = 0;
> > >   	if (!syncobj)
> > >   		return -ENOENT;
> > > -	*fence = drm_syncobj_fence_get(syncobj);
> > > +	drm_syncobj_garbage_collection(syncobj);
> > > +	if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
> > > +		/*NORMAL syncobj always wait on last pt */
> > > +		u64 tail_pt_value = syncobj->syncobj_timeline.signal_point;
> > > +
> > > +		if (tail_pt_value == 0)
> > > +			tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
> > > +		/* NORMAL syncobj doesn't care point value */
> > > +		WARN_ON(point != 0);
> > > +		*fence = drm_syncobj_point_get(syncobj, tail_pt_value,
> > > +							flags);
> > > +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> > > +		*fence = drm_syncobj_point_get(syncobj, point,
> > > +							flags);
> > > +	} else {
> > > +		DRM_ERROR("Don't support this type syncobj\n");
> > > +		*fence = NULL;
> > > +	}
> > >   	if (!*fence) {
> > >   		ret = -EINVAL;
> > >   	}
> > > +	return ret;
> > > +}
> > > +EXPORT_SYMBOL(drm_syncobj_search_fence);
> > > +int drm_syncobj_find_fence(struct drm_file *file_private,
> > > +			   u32 handle, u64 point,
> > > +			   struct dma_fence **fence) {
> > > +	struct drm_syncobj *syncobj = drm_syncobj_find(file_private,
> > > +handle);
> > > +
> > > +	int ret = drm_syncobj_search_fence(syncobj, point,
> > > +
> 	DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
> > > +					fence);
> > >   	drm_syncobj_put(syncobj);
> > >   	return ret;
> > >   }
> > > @@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
> > >   	struct drm_syncobj *syncobj = container_of(kref,
> > >   						   struct drm_syncobj,
> > >   						   refcount);
> > > -	drm_syncobj_replace_fence(syncobj, 0, NULL);
> > > +	drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
> > >   	kfree(syncobj);
> > >   }
> > >   EXPORT_SYMBOL(drm_syncobj_free);
> > > @@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj
> **out_syncobj, uint32_t flags,
> > >   	kref_init(&syncobj->refcount);
> > >   	INIT_LIST_HEAD(&syncobj->cb_list);
> > >   	spin_lock_init(&syncobj->lock);
> > > +	if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
> > > +		syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
> > > +	else
> > > +		syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
> > > +	drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
> > >   	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
> > >   		ret = drm_syncobj_assign_null_handle(syncobj);
> > > @@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device
> *dev, void *data,
> > >   		return -ENODEV;
> > >   	/* no valid flags yet */
> > > -	if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
> > > +	if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
> > > +			    DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
> > >   		return -EINVAL;
> > >   	return drm_syncobj_create_as_handle(file_private,
> > > @@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct
> drm_syncobj *syncobj,
> > >   	struct syncobj_wait_entry *wait =
> > >   		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
> > > -	/* This happens inside the syncobj lock */
> > > -	wait->fence = dma_fence_get(rcu_dereference_protected(syncobj-
> >fence,
> > > -
> lockdep_is_held(&syncobj->lock)));
> > > +	drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
> > > +
> > >   	wake_up_process(wait->task);
> > >   }
> > > @@ -698,7 +910,8 @@ static signed long
> drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
> > >   	signaled_count = 0;
> > >   	for (i = 0; i < count; ++i) {
> > >   		entries[i].task = current;
> > > -		entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
> > > +		ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
> > > +					       &entries[i].fence);
> > >   		if (!entries[i].fence) {
> > >   			if (flags &
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
> > >   				continue;
> > > @@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device
> *dev, void *data,
> > >   	if (ret < 0)
> > >   		return ret;
> > > -	for (i = 0; i < args->count_handles; i++)
> > > -		drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
> > > -
> > > +	for (i = 0; i < args->count_handles; i++) {
> > > +		if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> > > +			DRM_ERROR("timeline syncobj cannot reset!\n");
> > > +			ret = -EINVAL;
> > > +			goto out;
> > > +		}
> > > +		drm_syncobj_timeline_fini(syncobjs[i],
> > > +					  &syncobjs[i]->syncobj_timeline);
> > > +		drm_syncobj_timeline_init(syncobjs[i],
> > > +					  &syncobjs[i]->syncobj_timeline);
> > > +	}
> > > +out:
> > >   	drm_syncobj_array_free(syncobjs, args->count_handles);
> > > -	return 0;
> > > +	return ret;
> > >   }
> > >   int
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > index 0a8d2d64f380..579e91a5858b 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
> > >   		if (!(flags & I915_EXEC_FENCE_WAIT))
> > >   			continue;
> > > -		fence = drm_syncobj_fence_get(syncobj);
> > > +		drm_syncobj_search_fence(syncobj, 0,
> > > +
> DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
> > > +					 &fence);
> > >   		if (!fence)
> > >   			return -EINVAL;
> > > diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
> > > index 425432b85a87..9535521e6623 100644
> > > --- a/include/drm/drm_syncobj.h
> > > +++ b/include/drm/drm_syncobj.h
> > > @@ -30,6 +30,25 @@
> > >   struct drm_syncobj_cb;
> > > +enum drm_syncobj_type {
> > > +	DRM_SYNCOBJ_TYPE_NORMAL,
> > > +	DRM_SYNCOBJ_TYPE_TIMELINE
> > > +};
> > > +
> > > +struct drm_syncobj_timeline {
> > > +	wait_queue_head_t	wq;
> > > +	u64 timeline_context;
> > > +	/**
> > > +	 * @timeline: syncobj timeline
> > > +	 */
> > > +	u64 timeline;
> > > +	u64 signal_point;
> > > +
> > > +
> > > +	struct rb_root wait_pt_tree;
> > > +	struct list_head signal_pt_list;
> > > +};
> > > +
> > >   /**
> > >    * struct drm_syncobj - sync object.
> > >    *
> > > @@ -41,19 +60,19 @@ struct drm_syncobj {
> > >   	 */
> > >   	struct kref refcount;
> > >   	/**
> > > -	 * @fence:
> > > -	 * NULL or a pointer to the fence bound to this object.
> > > -	 *
> > > -	 * This field should not be used directly. Use
> drm_syncobj_fence_get()
> > > -	 * and drm_syncobj_replace_fence() instead.
> > > +	 * @type: indicate syncobj type
> > >   	 */
> > > -	struct dma_fence __rcu *fence;
> > > +	enum drm_syncobj_type type;
> > >   	/**
> > > -	 * @cb_list: List of callbacks to call when the &fence gets replaced.
> > > +	 * @syncobj_timeline: timeline
> > >   	 */
> > > +	struct drm_syncobj_timeline syncobj_timeline;
> > > +	/**
> > > +         * @cb_list: List of callbacks to call when the &fence gets replaced.
> > > +         */
> > >   	struct list_head cb_list;
> > >   	/**
> > > -	 * @lock: Protects &cb_list and write-locks &fence.
> > > +	 * @lock: Protects syncobj list and write-locks &fence.
> > >   	 */
> > >   	spinlock_t lock;
> > >   	/**
> > > @@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct
> drm_syncobj *syncobj,
> > >   /**
> > >    * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
> > >    * @node: used by drm_syncob_add_callback to append this struct to
> > > - *	  &drm_syncobj.cb_list
> > > + *       &drm_syncobj.cb_list
> > >    * @func: drm_syncobj_func_t to call
> > >    *
> > >    * This struct will be initialized by drm_syncobj_add_callback,
> > > additional @@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj
> *obj)
> > >   	kref_put(&obj->refcount, drm_syncobj_free);
> > >   }
> > > -/**
> > > - * drm_syncobj_fence_get - get a reference to a fence in a sync
> > > object
> > > - * @syncobj: sync object.
> > > - *
> > > - * This acquires additional reference to &drm_syncobj.fence
> > > contained in @obj,
> > > - * if not NULL. It is illegal to call this without already holding a reference.
> > > - * No locks required.
> > > - *
> > > - * Returns:
> > > - * Either the fence of @obj or NULL if there's none.
> > > - */
> > > -static inline struct dma_fence *
> > > -drm_syncobj_fence_get(struct drm_syncobj *syncobj) -{
> > > -	struct dma_fence *fence;
> > > -
> > > -	rcu_read_lock();
> > > -	fence = dma_fence_get_rcu_safe(&syncobj->fence);
> > > -	rcu_read_unlock();
> > > -
> > > -	return fence;
> > > -}
> > > -
> > >   struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
> > >   				     u32 handle);
> > >   void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64
> > > point, @@ -142,5 +138,7 @@ int drm_syncobj_create(struct
> drm_syncobj **out_syncobj, uint32_t flags,
> > >   int drm_syncobj_get_handle(struct drm_file *file_private,
> > >   			   struct drm_syncobj *syncobj, u32 *handle);
> > >   int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
> > > +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
> > > +			     u64 flags, struct dma_fence **fence);
> > >   #endif
> > > diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index
> > > 300f336633f2..cebdb2541eb7 100644
> > > --- a/include/uapi/drm/drm.h
> > > +++ b/include/uapi/drm/drm.h
> > > @@ -717,6 +717,7 @@ struct drm_prime_handle {
> > >   struct drm_syncobj_create {
> > >   	__u32 handle;
> > >   #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
> > > +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
> > >   	__u32 flags;
> > >   };
> >
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
  2018-09-14 10:37 [PATCH] [RFC]drm: add syncobj timeline support v5 Chunming Zhou
  2018-09-14 10:49 ` Christian König
@ 2018-09-17  8:37 ` Christian König
       [not found]   ` <a5d2d299-4360-4267-046e-40b550239524-5C7GfCeVMHo@public.gmane.org>
  1 sibling, 1 reply; 12+ messages in thread
From: Christian König @ 2018-09-17  8:37 UTC (permalink / raw)
  To: Chunming Zhou, dri-devel; +Cc: Dave Airlie, Daniel Rakos, amd-gfx

Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
> This patch is for VK_KHR_timeline_semaphore extension, semaphore is called syncobj in kernel side:
> This extension introduces a new type of syncobj that has an integer payload
> identifying a point in a timeline. Such timeline syncobjs support the
> following operations:
>     * CPU query - A host operation that allows querying the payload of the
>       timeline syncobj.
>     * CPU wait - A host operation that allows a blocking wait for a
>       timeline syncobj to reach a specified value.
>     * Device wait - A device operation that allows waiting for a
>       timeline syncobj to reach a specified value.
>     * Device signal - A device operation that allows advancing the
>       timeline syncobj to a specified value.
>
> Since it's a timeline, that means the front time point(PT) always is signaled before the late PT.
> a. signal PT design:
> Signal PT fence N depends on PT[N-1] fence and signal opertion fence, when PT[N] fence is signaled,
> the timeline will increase to value of PT[N].
> b. wait PT design:
> Wait PT fence is signaled by reaching timeline point value, when timeline is increasing, will compare
> wait PTs value with new timeline value, if PT value is lower than timeline value, then wait PT will be
> signaled, otherwise keep in list. syncobj wait operation can wait on any point of timeline,
> so need a RB tree to order them. And wait PT could ahead of signal PT, we need a sumission fence to
> perform that.
>
> v2:
> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
> 2. move unexposed denitions to .c file. (Daniel Vetter)
> 3. split up the change to drm_syncobj_find_fence() in a separate patch. (Christian)
> 4. split up the change to drm_syncobj_replace_fence() in a separate patch.
> 5. drop the submission_fence implementation and instead use wait_event() for that. (Christian)
> 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
>
> v3:
> 1. replace normal syncobj with timeline implemenation. (Vetter and Christian)
>      a. normal syncobj signal op will create a signal PT to tail of signal pt list.
>      b. normal syncobj wait op will create a wait pt with last signal point, and this wait PT is only signaled by related signal point PT.
> 2. many bug fix and clean up
> 3. stub fence moving is moved to other patch.
>
> v4:
> 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
> 2. fix syncobj lifecycle. (Christian)
> 3. only enable_signaling when there is wait_pt. (Christian)
> 4. fix timeline path issues.
> 5. write a timeline test in libdrm
>
> v5: (Christian)
> 1. semaphore is called syncobj in kernel side.
> 2. don't need 'timeline' characters in some function name.
> 3. keep syncobj cb
>
> normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
> timeline syncobj is tested by ./amdgpu_test -s 9
>
> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
> Cc: Christian Konig <christian.koenig@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Daniel Rakos <Daniel.Rakos@amd.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>   drivers/gpu/drm/drm_syncobj.c              | 294 ++++++++++++++++++---
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
>   include/drm/drm_syncobj.h                  |  62 +++--
>   include/uapi/drm/drm.h                     |   1 +
>   4 files changed, 292 insertions(+), 69 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index e9ce623d049e..e78d076f2703 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -56,6 +56,9 @@
>   #include "drm_internal.h"
>   #include <drm/drm_syncobj.h>
>   
> +/* merge normal syncobj to timeline syncobj, the point interval is 1 */
> +#define DRM_SYNCOBJ_NORMAL_POINT 1
> +
>   struct drm_syncobj_stub_fence {
>   	struct dma_fence base;
>   	spinlock_t lock;
> @@ -82,6 +85,11 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
>   	.release = drm_syncobj_stub_fence_release,
>   };
>   
> +struct drm_syncobj_signal_pt {
> +	struct dma_fence_array *base;
> +	u64    value;
> +	struct list_head list;
> +};
>   
>   /**
>    * drm_syncobj_find - lookup and reference a sync object.
> @@ -124,7 +132,7 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>   {
>   	int ret;
>   
> -	*fence = drm_syncobj_fence_get(syncobj);
> +	ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
>   	if (*fence)

Don't we need to check ret here instead?

>   		return 1;
>   
> @@ -133,10 +141,10 @@ static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>   	 * have the lock, try one more time just to be sure we don't add a
>   	 * callback when a fence has already been set.
>   	 */
> -	if (syncobj->fence) {
> -		*fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
> -								 lockdep_is_held(&syncobj->lock)));
> -		ret = 1;
> +	if (fence) {
> +		drm_syncobj_search_fence(syncobj, 0, 0, fence);
> +		if (*fence)
> +			ret = 1;

That doesn't looks correct to me, drm_syncobj_search_fence() would try 
to grab the lock once more.

That should probably be something like checking the signal_pt list and 
if it is not empty any more drop the lock and retry.

>   	} else {
>   		*fence = NULL;
>   		drm_syncobj_add_callback_locked(syncobj, cb, func);
> @@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
>   	spin_unlock(&syncobj->lock);
>   }
>   
> +static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
> +				      struct drm_syncobj_timeline *syncobj_timeline)
> +{
> +	spin_lock(&syncobj->lock);
> +	syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
> +	syncobj_timeline->timeline = 0;
> +	syncobj_timeline->signal_point = 0;
> +	init_waitqueue_head(&syncobj_timeline->wq);
> +
> +	INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
> +	spin_unlock(&syncobj->lock);
> +}
> +
> +static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
> +				      struct drm_syncobj_timeline *syncobj_timeline)
> +{
> +	struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
> +
> +	spin_lock(&syncobj->lock);
> +	list_for_each_entry_safe(signal_pt, tmp,
> +				 &syncobj_timeline->signal_pt_list, list) {
> +		list_del(&signal_pt->list);
> +		dma_fence_put(&signal_pt->base->base);
> +		kfree(signal_pt);
> +	}
> +	spin_unlock(&syncobj->lock);
> +}
> +
> +static struct dma_fence
> +*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
> +				      uint64_t point)
> +{
> +	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
> +	struct drm_syncobj_signal_pt *signal_pt;
> +
> +	if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
> +	    (point <= timeline->timeline)) {
> +		struct drm_syncobj_stub_fence *fence =
> +			kzalloc(sizeof(struct drm_syncobj_stub_fence),
> +				GFP_KERNEL);
> +
> +		if (!fence)
> +			return NULL;
> +		spin_lock_init(&fence->lock);
> +		dma_fence_init(&fence->base,
> +			       &drm_syncobj_stub_fence_ops,
> +			       &fence->lock,
> +			       syncobj->syncobj_timeline.timeline_context,
> +			       point);
> +
> +		dma_fence_signal(&fence->base);
> +		return &fence->base;
> +	}
> +
> +	list_for_each_entry(signal_pt, &timeline->signal_pt_list, list) {
> +		if (point > signal_pt->value)
> +			continue;
> +		if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
> +		    (point != signal_pt->value))
> +			continue;
> +		return dma_fence_get(&signal_pt->base->base);
> +	}
> +	return NULL;
> +}
> +
> +static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
> +					struct dma_fence *fence,
> +					u64 point)
> +{
> +	struct drm_syncobj_signal_pt *signal_pt =
> +		kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
> +	struct drm_syncobj_signal_pt *tail_pt;
> +	struct dma_fence **fences;
> +	int num_fences = 0;
> +	int ret = 0, i;
> +
> +	if (!signal_pt)
> +		return -ENOMEM;
> +	if (syncobj->syncobj_timeline.signal_point >= point) {
> +		DRM_WARN("A later signal is ready!");
> +		goto out;
> +	}
> +	if (!fence)
> +		goto out;
> +
> +	fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
> +	if (!fences)
> +		goto out;
> +	fences[num_fences++] = dma_fence_get(fence);
> +	/* timeline syncobj must take this dependency */
> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> +		spin_lock(&syncobj->lock);
> +		if (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
> +			tail_pt = list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
> +						  struct drm_syncobj_signal_pt, list);
> +			fences[num_fences++] = dma_fence_get(&tail_pt->base->base);
> +		}
> +		spin_unlock(&syncobj->lock);
> +	}
> +	signal_pt->base = dma_fence_array_create(num_fences, fences,
> +						 syncobj->syncobj_timeline.timeline_context,
> +						 point, false);
> +	if (!signal_pt->base)
> +		goto fail;
> +
> +	spin_lock(&syncobj->lock);
> +	signal_pt->value = point;
> +	INIT_LIST_HEAD(&signal_pt->list);
> +	list_add_tail(&signal_pt->list, &syncobj->syncobj_timeline.signal_pt_list);
> +	syncobj->syncobj_timeline.signal_point = point;
> +	spin_unlock(&syncobj->lock);
> +	wake_up_all(&syncobj->syncobj_timeline.wq);
> +
> +	return 0;
> +fail:
> +	for (i = 0; i < num_fences; i++)
> +		dma_fence_put(fences[i]);
> +	kfree(fences);
> +out:
> +	kfree(signal_pt);
> +	return ret;
> +}
> +
> +static void drm_syncobj_garbage_collection(struct drm_syncobj *syncobj)
> +{
> +	struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
> +	struct drm_syncobj_signal_pt *signal_pt, *tmp;
> +
> +	spin_lock(&syncobj->lock);
> +	list_for_each_entry_safe(signal_pt, tmp,
> +				 &timeline->signal_pt_list, list) {
> +		if (dma_fence_is_signaled(&signal_pt->base->base)) {
> +			timeline->timeline = signal_pt->value;
> +			list_del(&signal_pt->list);
> +			dma_fence_put(&signal_pt->base->base);
> +			kfree(signal_pt);
> +		} else {
> +			/*signal_pt is in order in list, from small to big, so
> +			 * the later must not be signal either */
> +			break;
> +		}
> +	}
> +
> +	spin_unlock(&syncobj->lock);
> +}
>   /**
>    * drm_syncobj_replace_fence - replace fence in a sync object.
>    * @syncobj: Sync object to replace fence in
> @@ -176,28 +329,37 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
>   			       u64 point,
>   			       struct dma_fence *fence)
>   {
> -	struct dma_fence *old_fence;
> -	struct drm_syncobj_cb *cur, *tmp;
> -
> -	if (fence)
> -		dma_fence_get(fence);
> -
> -	spin_lock(&syncobj->lock);
> -
> -	old_fence = rcu_dereference_protected(syncobj->fence,
> -					      lockdep_is_held(&syncobj->lock));
> -	rcu_assign_pointer(syncobj->fence, fence);
> +	drm_syncobj_garbage_collection(syncobj);
> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> +		if (fence)
> +			drm_syncobj_create_signal_pt(syncobj, fence, point);
> +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
> +		u64 pt_value;
> +
> +		if (!fence) {
> +			drm_syncobj_timeline_fini(syncobj,
> +						  &syncobj->syncobj_timeline);
> +			drm_syncobj_timeline_init(syncobj,
> +						  &syncobj->syncobj_timeline);
> +			return;
> +		}
> +		pt_value = syncobj->syncobj_timeline.signal_point +
> +			DRM_SYNCOBJ_NORMAL_POINT;

Looks like this could be simplified. If we don't have a timeline syncobj 
we just finish and reinitialize it.

Adding the fence then becomes common for both code paths.

> +		drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
> +	} else {
> +		DRM_ERROR("the syncobj type isn't support\n");
> +		return;
> +	}
> +	if (fence) {
> +		struct drm_syncobj_cb *cur, *tmp;
>   
> -	if (fence != old_fence) {
> +		spin_lock(&syncobj->lock);
>   		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
>   			list_del_init(&cur->node);
>   			cur->func(syncobj, cur);
>   		}
> +		spin_unlock(&syncobj->lock);
>   	}
> -
> -	spin_unlock(&syncobj->lock);
> -
> -	dma_fence_put(old_fence);
>   }
>   EXPORT_SYMBOL(drm_syncobj_replace_fence);
>   
> @@ -220,6 +382,25 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>   	return 0;
>   }
>   
> +static struct dma_fence *
> +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 flags)
> +{
> +	struct dma_fence *fence;
> +	int ret = 0;
> +
> +	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
> +		ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
> +					point <= syncobj->syncobj_timeline.signal_point,
> +					msecs_to_jiffies(10000)); /* wait 10s */

Either don't use a timeout or provide the timeout explicitely, but don't 
hard code it here.

Additional to that we probably need an incorruptible wait.

> +		if (ret <= 0)
> +			return NULL;
> +	}
> +	spin_lock(&syncobj->lock);
> +	fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
> +	spin_unlock(&syncobj->lock);
> +	return fence;
> +}
> +
>   /**
>    * drm_syncobj_find_fence - lookup and reference the fence in a sync object
>    * @file_private: drm file private pointer
> @@ -234,20 +415,46 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>    * contains a reference to the fence, which must be released by calling
>    * dma_fence_put().
>    */
> -int drm_syncobj_find_fence(struct drm_file *file_private,
> -			   u32 handle, u64 point,
> -			   struct dma_fence **fence)
> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
> +			     u64 flags, struct dma_fence **fence)
>   {
> -	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
>   	int ret = 0;
>   
>   	if (!syncobj)
>   		return -ENOENT;
>   
> -	*fence = drm_syncobj_fence_get(syncobj);
> +	drm_syncobj_garbage_collection(syncobj);
> +	if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
> +		/*NORMAL syncobj always wait on last pt */
> +		u64 tail_pt_value = syncobj->syncobj_timeline.signal_point;
> +
> +		if (tail_pt_value == 0)
> +			tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
> +		/* NORMAL syncobj doesn't care point value */
> +		WARN_ON(point != 0);
> +		*fence = drm_syncobj_point_get(syncobj, tail_pt_value,
> +							flags);
> +	} else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> +		*fence = drm_syncobj_point_get(syncobj, point,
> +							flags);

Finding the fence is common for normal and timeline fence, only figuring 
out the sequence number is different.

So that can probably be simplified further here.

> +	} else {
> +		DRM_ERROR("Don't support this type syncobj\n");
> +		*fence = NULL;
> +	}
>   	if (!*fence) {
>   		ret = -EINVAL;
>   	}
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_syncobj_search_fence);
> +int drm_syncobj_find_fence(struct drm_file *file_private,
> +			   u32 handle, u64 point,
> +			   struct dma_fence **fence) {
> +	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
> +
> +	int ret = drm_syncobj_search_fence(syncobj, point,
> +					DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
> +					fence);
>   	drm_syncobj_put(syncobj);
>   	return ret;
>   }
> @@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
>   	struct drm_syncobj *syncobj = container_of(kref,
>   						   struct drm_syncobj,
>   						   refcount);
> -	drm_syncobj_replace_fence(syncobj, 0, NULL);
> +	drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
>   	kfree(syncobj);
>   }
>   EXPORT_SYMBOL(drm_syncobj_free);
> @@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
>   	kref_init(&syncobj->refcount);
>   	INIT_LIST_HEAD(&syncobj->cb_list);
>   	spin_lock_init(&syncobj->lock);
> +	if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
> +		syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
> +	else
> +		syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
> +	drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
>   
>   	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
>   		ret = drm_syncobj_assign_null_handle(syncobj);
> @@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device *dev, void *data,
>   		return -ENODEV;
>   
>   	/* no valid flags yet */
> -	if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
> +	if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
> +			    DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
>   		return -EINVAL;
>   
>   	return drm_syncobj_create_as_handle(file_private,
> @@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
>   	struct syncobj_wait_entry *wait =
>   		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>   
> -	/* This happens inside the syncobj lock */
> -	wait->fence = dma_fence_get(rcu_dereference_protected(syncobj->fence,
> -							      lockdep_is_held(&syncobj->lock)));
> +	drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
> +
>   	wake_up_process(wait->task);
>   }
>   
> @@ -698,7 +910,8 @@ static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>   	signaled_count = 0;
>   	for (i = 0; i < count; ++i) {
>   		entries[i].task = current;
> -		entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
> +		ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
> +					       &entries[i].fence);
>   		if (!entries[i].fence) {
>   			if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>   				continue;
> @@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
>   	if (ret < 0)
>   		return ret;
>   
> -	for (i = 0; i < args->count_handles; i++)
> -		drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
> -
> +	for (i = 0; i < args->count_handles; i++) {
> +		if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
> +			DRM_ERROR("timeline syncobj cannot reset!\n");

Why not? I mean that should still work or do I miss anything?

> +			ret = -EINVAL;
> +			goto out;
> +		}
> +		drm_syncobj_timeline_fini(syncobjs[i],
> +					  &syncobjs[i]->syncobj_timeline);
> +		drm_syncobj_timeline_init(syncobjs[i],
> +					  &syncobjs[i]->syncobj_timeline);
> +	}
> +out:
>   	drm_syncobj_array_free(syncobjs, args->count_handles);
>   
> -	return 0;
> +	return ret;
>   }
>   
>   int
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 0a8d2d64f380..579e91a5858b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
>   		if (!(flags & I915_EXEC_FENCE_WAIT))
>   			continue;
>   
> -		fence = drm_syncobj_fence_get(syncobj);
> +		drm_syncobj_search_fence(syncobj, 0,
> +					 DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,

Not sure if we can make DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT the 
default here.

Maybe better if drivers explicitly need to ask for it?

> +					 &fence);
>   		if (!fence)
>   			return -EINVAL;
>   
> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
> index 425432b85a87..9535521e6623 100644
> --- a/include/drm/drm_syncobj.h
> +++ b/include/drm/drm_syncobj.h
> @@ -30,6 +30,25 @@
>   
>   struct drm_syncobj_cb;
>   
> +enum drm_syncobj_type {
> +	DRM_SYNCOBJ_TYPE_NORMAL,

Does anybody have a better suggestion for _TYPE_NORMAL? That sounds like 
timeline would be special.

> +	DRM_SYNCOBJ_TYPE_TIMELINE
> +};
> +
> +struct drm_syncobj_timeline {
> +	wait_queue_head_t	wq;
> +	u64 timeline_context;
> +	/**
> +	 * @timeline: syncobj timeline
> +	 */
> +	u64 timeline;
> +	u64 signal_point;
> +
> +
> +	struct rb_root wait_pt_tree;

Not used any more?

> +	struct list_head signal_pt_list;
> +};
> +
>   /**
>    * struct drm_syncobj - sync object.
>    *
> @@ -41,19 +60,19 @@ struct drm_syncobj {
>   	 */
>   	struct kref refcount;
>   	/**
> -	 * @fence:
> -	 * NULL or a pointer to the fence bound to this object.
> -	 *
> -	 * This field should not be used directly. Use drm_syncobj_fence_get()
> -	 * and drm_syncobj_replace_fence() instead.
> +	 * @type: indicate syncobj type
>   	 */
> -	struct dma_fence __rcu *fence;
> +	enum drm_syncobj_type type;
>   	/**
> -	 * @cb_list: List of callbacks to call when the &fence gets replaced.
> +	 * @syncobj_timeline: timeline
>   	 */
> +	struct drm_syncobj_timeline syncobj_timeline;

Looks like that can now be embedded into drm_syncobj.

Regards,
Christian.

> +	/**
> +         * @cb_list: List of callbacks to call when the &fence gets replaced.
> +         */
>   	struct list_head cb_list;
>   	/**
> -	 * @lock: Protects &cb_list and write-locks &fence.
> +	 * @lock: Protects syncobj list and write-locks &fence.
>   	 */
>   	spinlock_t lock;
>   	/**
> @@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
>   /**
>    * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
>    * @node: used by drm_syncob_add_callback to append this struct to
> - *	  &drm_syncobj.cb_list
> + *       &drm_syncobj.cb_list
>    * @func: drm_syncobj_func_t to call
>    *
>    * This struct will be initialized by drm_syncobj_add_callback, additional
> @@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
>   	kref_put(&obj->refcount, drm_syncobj_free);
>   }
>   
> -/**
> - * drm_syncobj_fence_get - get a reference to a fence in a sync object
> - * @syncobj: sync object.
> - *
> - * This acquires additional reference to &drm_syncobj.fence contained in @obj,
> - * if not NULL. It is illegal to call this without already holding a reference.
> - * No locks required.
> - *
> - * Returns:
> - * Either the fence of @obj or NULL if there's none.
> - */
> -static inline struct dma_fence *
> -drm_syncobj_fence_get(struct drm_syncobj *syncobj)
> -{
> -	struct dma_fence *fence;
> -
> -	rcu_read_lock();
> -	fence = dma_fence_get_rcu_safe(&syncobj->fence);
> -	rcu_read_unlock();
> -
> -	return fence;
> -}
> -
>   struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
>   				     u32 handle);
>   void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64 point,
> @@ -142,5 +138,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
>   int drm_syncobj_get_handle(struct drm_file *file_private,
>   			   struct drm_syncobj *syncobj, u32 *handle);
>   int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
> +			     u64 flags, struct dma_fence **fence);
>   
>   #endif
> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
> index 300f336633f2..cebdb2541eb7 100644
> --- a/include/uapi/drm/drm.h
> +++ b/include/uapi/drm/drm.h
> @@ -717,6 +717,7 @@ struct drm_prime_handle {
>   struct drm_syncobj_create {
>   	__u32 handle;
>   #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
> +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
>   	__u32 flags;
>   };
>   

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
       [not found]   ` <a5d2d299-4360-4267-046e-40b550239524-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-17  9:33     ` zhoucm1
  2018-09-18  8:32       ` Christian König
  0 siblings, 1 reply; 12+ messages in thread
From: zhoucm1 @ 2018-09-17  9:33 UTC (permalink / raw)
  To: Christian König, Chunming Zhou,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Dave Airlie, Daniel Rakos, Daniel Vetter,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW



On 2018年09月17日 16:37, Christian König wrote:
> Am 14.09.2018 um 12:37 schrieb Chunming Zhou:
>> This patch is for VK_KHR_timeline_semaphore extension, semaphore is 
>> called syncobj in kernel side:
>> This extension introduces a new type of syncobj that has an integer 
>> payload
>> identifying a point in a timeline. Such timeline syncobjs support the
>> following operations:
>>     * CPU query - A host operation that allows querying the payload 
>> of the
>>       timeline syncobj.
>>     * CPU wait - A host operation that allows a blocking wait for a
>>       timeline syncobj to reach a specified value.
>>     * Device wait - A device operation that allows waiting for a
>>       timeline syncobj to reach a specified value.
>>     * Device signal - A device operation that allows advancing the
>>       timeline syncobj to a specified value.
>>
>> Since it's a timeline, that means the front time point(PT) always is 
>> signaled before the late PT.
>> a. signal PT design:
>> Signal PT fence N depends on PT[N-1] fence and signal opertion fence, 
>> when PT[N] fence is signaled,
>> the timeline will increase to value of PT[N].
>> b. wait PT design:
>> Wait PT fence is signaled by reaching timeline point value, when 
>> timeline is increasing, will compare
>> wait PTs value with new timeline value, if PT value is lower than 
>> timeline value, then wait PT will be
>> signaled, otherwise keep in list. syncobj wait operation can wait on 
>> any point of timeline,
>> so need a RB tree to order them. And wait PT could ahead of signal 
>> PT, we need a sumission fence to
>> perform that.
>>
>> v2:
>> 1. remove unused DRM_SYNCOBJ_CREATE_TYPE_NORMAL. (Christian)
>> 2. move unexposed denitions to .c file. (Daniel Vetter)
>> 3. split up the change to drm_syncobj_find_fence() in a separate 
>> patch. (Christian)
>> 4. split up the change to drm_syncobj_replace_fence() in a separate 
>> patch.
>> 5. drop the submission_fence implementation and instead use 
>> wait_event() for that. (Christian)
>> 6. WARN_ON(point != 0) for NORMAL type syncobj case. (Daniel Vetter)
>>
>> v3:
>> 1. replace normal syncobj with timeline implemenation. (Vetter and 
>> Christian)
>>      a. normal syncobj signal op will create a signal PT to tail of 
>> signal pt list.
>>      b. normal syncobj wait op will create a wait pt with last signal 
>> point, and this wait PT is only signaled by related signal point PT.
>> 2. many bug fix and clean up
>> 3. stub fence moving is moved to other patch.
>>
>> v4:
>> 1. fix RB tree loop with while(node=rb_first(...)). (Christian)
>> 2. fix syncobj lifecycle. (Christian)
>> 3. only enable_signaling when there is wait_pt. (Christian)
>> 4. fix timeline path issues.
>> 5. write a timeline test in libdrm
>>
>> v5: (Christian)
>> 1. semaphore is called syncobj in kernel side.
>> 2. don't need 'timeline' characters in some function name.
>> 3. keep syncobj cb
>>
>> normal syncobj is tested by ./deqp-vk -n dEQP-VK*semaphore*
>> timeline syncobj is tested by ./amdgpu_test -s 9
>>
>> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
>> Cc: Christian Konig <christian.koenig@amd.com>
>> Cc: Dave Airlie <airlied@redhat.com>
>> Cc: Daniel Rakos <Daniel.Rakos@amd.com>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> ---
>>   drivers/gpu/drm/drm_syncobj.c              | 294 ++++++++++++++++++---
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |   4 +-
>>   include/drm/drm_syncobj.h                  |  62 +++--
>>   include/uapi/drm/drm.h                     |   1 +
>>   4 files changed, 292 insertions(+), 69 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_syncobj.c 
>> b/drivers/gpu/drm/drm_syncobj.c
>> index e9ce623d049e..e78d076f2703 100644
>> --- a/drivers/gpu/drm/drm_syncobj.c
>> +++ b/drivers/gpu/drm/drm_syncobj.c
>> @@ -56,6 +56,9 @@
>>   #include "drm_internal.h"
>>   #include <drm/drm_syncobj.h>
>>   +/* merge normal syncobj to timeline syncobj, the point interval is 
>> 1 */
>> +#define DRM_SYNCOBJ_NORMAL_POINT 1
>> +
>>   struct drm_syncobj_stub_fence {
>>       struct dma_fence base;
>>       spinlock_t lock;
>> @@ -82,6 +85,11 @@ static const struct dma_fence_ops 
>> drm_syncobj_stub_fence_ops = {
>>       .release = drm_syncobj_stub_fence_release,
>>   };
>>   +struct drm_syncobj_signal_pt {
>> +    struct dma_fence_array *base;
>> +    u64    value;
>> +    struct list_head list;
>> +};
>>     /**
>>    * drm_syncobj_find - lookup and reference a sync object.
>> @@ -124,7 +132,7 @@ static int 
>> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>>   {
>>       int ret;
>>   -    *fence = drm_syncobj_fence_get(syncobj);
>> +    ret = drm_syncobj_search_fence(syncobj, 0, 0, fence);
>>       if (*fence)
>
> Don't we need to check ret here instead?
both ok, if you like check ret, will update it in v6.

>
>>           return 1;
>>   @@ -133,10 +141,10 @@ static int 
>> drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
>>        * have the lock, try one more time just to be sure we don't add a
>>        * callback when a fence has already been set.
>>        */
>> -    if (syncobj->fence) {
>> -        *fence = 
>> dma_fence_get(rcu_dereference_protected(syncobj->fence,
>> - lockdep_is_held(&syncobj->lock)));
>> -        ret = 1;
>> +    if (fence) {
>> +        drm_syncobj_search_fence(syncobj, 0, 0, fence);
>> +        if (*fence)
>> +            ret = 1;
>
> That doesn't looks correct to me, drm_syncobj_search_fence() would try 
> to grab the lock once more.
>
> That should probably be something like checking the signal_pt list and 
> if it is not empty any more drop the lock and retry.
Good catch, will change that.

>
>>       } else {
>>           *fence = NULL;
>>           drm_syncobj_add_callback_locked(syncobj, cb, func);
>> @@ -164,6 +172,151 @@ void drm_syncobj_remove_callback(struct 
>> drm_syncobj *syncobj,
>>       spin_unlock(&syncobj->lock);
>>   }
>>   +static void drm_syncobj_timeline_init(struct drm_syncobj *syncobj,
>> +                      struct drm_syncobj_timeline *syncobj_timeline)
>> +{
>> +    spin_lock(&syncobj->lock);
>> +    syncobj_timeline->timeline_context = dma_fence_context_alloc(1);
>> +    syncobj_timeline->timeline = 0;
>> +    syncobj_timeline->signal_point = 0;
>> +    init_waitqueue_head(&syncobj_timeline->wq);
>> +
>> +    INIT_LIST_HEAD(&syncobj_timeline->signal_pt_list);
>> +    spin_unlock(&syncobj->lock);
>> +}
>> +
>> +static void drm_syncobj_timeline_fini(struct drm_syncobj *syncobj,
>> +                      struct drm_syncobj_timeline *syncobj_timeline)
>> +{
>> +    struct drm_syncobj_signal_pt *signal_pt = NULL, *tmp;
>> +
>> +    spin_lock(&syncobj->lock);
>> +    list_for_each_entry_safe(signal_pt, tmp,
>> +                 &syncobj_timeline->signal_pt_list, list) {
>> +        list_del(&signal_pt->list);
>> +        dma_fence_put(&signal_pt->base->base);
>> +        kfree(signal_pt);
>> +    }
>> +    spin_unlock(&syncobj->lock);
>> +}
>> +
>> +static struct dma_fence
>> +*drm_syncobj_find_signal_pt_for_point(struct drm_syncobj *syncobj,
>> +                      uint64_t point)
>> +{
>> +    struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
>> +    struct drm_syncobj_signal_pt *signal_pt;
>> +
>> +    if ((syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) &&
>> +        (point <= timeline->timeline)) {
>> +        struct drm_syncobj_stub_fence *fence =
>> +            kzalloc(sizeof(struct drm_syncobj_stub_fence),
>> +                GFP_KERNEL);
>> +
>> +        if (!fence)
>> +            return NULL;
>> +        spin_lock_init(&fence->lock);
>> +        dma_fence_init(&fence->base,
>> +                   &drm_syncobj_stub_fence_ops,
>> +                   &fence->lock,
>> + syncobj->syncobj_timeline.timeline_context,
>> +                   point);
>> +
>> +        dma_fence_signal(&fence->base);
>> +        return &fence->base;
>> +    }
>> +
>> +    list_for_each_entry(signal_pt, &timeline->signal_pt_list, list) {
>> +        if (point > signal_pt->value)
>> +            continue;
>> +        if ((syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) &&
>> +            (point != signal_pt->value))
>> +            continue;
>> +        return dma_fence_get(&signal_pt->base->base);
>> +    }
>> +    return NULL;
>> +}
>> +
>> +static int drm_syncobj_create_signal_pt(struct drm_syncobj *syncobj,
>> +                    struct dma_fence *fence,
>> +                    u64 point)
>> +{
>> +    struct drm_syncobj_signal_pt *signal_pt =
>> +        kzalloc(sizeof(struct drm_syncobj_signal_pt), GFP_KERNEL);
>> +    struct drm_syncobj_signal_pt *tail_pt;
>> +    struct dma_fence **fences;
>> +    int num_fences = 0;
>> +    int ret = 0, i;
>> +
>> +    if (!signal_pt)
>> +        return -ENOMEM;
>> +    if (syncobj->syncobj_timeline.signal_point >= point) {
>> +        DRM_WARN("A later signal is ready!");
>> +        goto out;
>> +    }
>> +    if (!fence)
>> +        goto out;
>> +
>> +    fences = kmalloc_array(sizeof(void *), 2, GFP_KERNEL);
>> +    if (!fences)
>> +        goto out;
>> +    fences[num_fences++] = dma_fence_get(fence);
>> +    /* timeline syncobj must take this dependency */
>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>> +        spin_lock(&syncobj->lock);
>> +        if (!list_empty(&syncobj->syncobj_timeline.signal_pt_list)) {
>> +            tail_pt = 
>> list_last_entry(&syncobj->syncobj_timeline.signal_pt_list,
>> +                          struct drm_syncobj_signal_pt, list);
>> +            fences[num_fences++] = dma_fence_get(&tail_pt->base->base);
>> +        }
>> +        spin_unlock(&syncobj->lock);
>> +    }
>> +    signal_pt->base = dma_fence_array_create(num_fences, fences,
>> + syncobj->syncobj_timeline.timeline_context,
>> +                         point, false);
>> +    if (!signal_pt->base)
>> +        goto fail;
>> +
>> +    spin_lock(&syncobj->lock);
>> +    signal_pt->value = point;
>> +    INIT_LIST_HEAD(&signal_pt->list);
>> +    list_add_tail(&signal_pt->list, 
>> &syncobj->syncobj_timeline.signal_pt_list);
>> +    syncobj->syncobj_timeline.signal_point = point;
>> +    spin_unlock(&syncobj->lock);
>> +    wake_up_all(&syncobj->syncobj_timeline.wq);
>> +
>> +    return 0;
>> +fail:
>> +    for (i = 0; i < num_fences; i++)
>> +        dma_fence_put(fences[i]);
>> +    kfree(fences);
>> +out:
>> +    kfree(signal_pt);
>> +    return ret;
>> +}
>> +
>> +static void drm_syncobj_garbage_collection(struct drm_syncobj *syncobj)
>> +{
>> +    struct drm_syncobj_timeline *timeline = &syncobj->syncobj_timeline;
>> +    struct drm_syncobj_signal_pt *signal_pt, *tmp;
>> +
>> +    spin_lock(&syncobj->lock);
>> +    list_for_each_entry_safe(signal_pt, tmp,
>> +                 &timeline->signal_pt_list, list) {
>> +        if (dma_fence_is_signaled(&signal_pt->base->base)) {
>> +            timeline->timeline = signal_pt->value;
>> +            list_del(&signal_pt->list);
>> +            dma_fence_put(&signal_pt->base->base);
>> +            kfree(signal_pt);
>> +        } else {
>> +            /*signal_pt is in order in list, from small to big, so
>> +             * the later must not be signal either */
>> +            break;
>> +        }
>> +    }
>> +
>> +    spin_unlock(&syncobj->lock);
>> +}
>>   /**
>>    * drm_syncobj_replace_fence - replace fence in a sync object.
>>    * @syncobj: Sync object to replace fence in
>> @@ -176,28 +329,37 @@ void drm_syncobj_replace_fence(struct 
>> drm_syncobj *syncobj,
>>                      u64 point,
>>                      struct dma_fence *fence)
>>   {
>> -    struct dma_fence *old_fence;
>> -    struct drm_syncobj_cb *cur, *tmp;
>> -
>> -    if (fence)
>> -        dma_fence_get(fence);
>> -
>> -    spin_lock(&syncobj->lock);
>> -
>> -    old_fence = rcu_dereference_protected(syncobj->fence,
>> - lockdep_is_held(&syncobj->lock));
>> -    rcu_assign_pointer(syncobj->fence, fence);
>> +    drm_syncobj_garbage_collection(syncobj);
>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>> +        if (fence)
>> +            drm_syncobj_create_signal_pt(syncobj, fence, point);
>> +    } else if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>> +        u64 pt_value;
>> +
>> +        if (!fence) {
>> +            drm_syncobj_timeline_fini(syncobj,
>> +                          &syncobj->syncobj_timeline);
>> +            drm_syncobj_timeline_init(syncobj,
>> +                          &syncobj->syncobj_timeline);
>> +            return;
>> +        }
>> +        pt_value = syncobj->syncobj_timeline.signal_point +
>> +            DRM_SYNCOBJ_NORMAL_POINT;
>
> Looks like this could be simplified. If we don't have a timeline 
> syncobj we just finish and reinitialize it.
>
> Adding the fence then becomes common for both code paths.
>
>> + drm_syncobj_create_signal_pt(syncobj, fence, pt_value);
>> +    } else {
>> +        DRM_ERROR("the syncobj type isn't support\n");
>> +        return;
>> +    }
>> +    if (fence) {
>> +        struct drm_syncobj_cb *cur, *tmp;
>>   -    if (fence != old_fence) {
>> +        spin_lock(&syncobj->lock);
>>           list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
>>               list_del_init(&cur->node);
>>               cur->func(syncobj, cur);
>>           }
>> +        spin_unlock(&syncobj->lock);
>>       }
>> -
>> -    spin_unlock(&syncobj->lock);
>> -
>> -    dma_fence_put(old_fence);
>>   }
>>   EXPORT_SYMBOL(drm_syncobj_replace_fence);
>>   @@ -220,6 +382,25 @@ static int 
>> drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>>       return 0;
>>   }
>>   +static struct dma_fence *
>> +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 
>> flags)
>> +{
>> +    struct dma_fence *fence;
>> +    int ret = 0;
>> +
>> +    if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>> +        ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
>> +                    point <= syncobj->syncobj_timeline.signal_point,
>> +                    msecs_to_jiffies(10000)); /* wait 10s */
>
> Either don't use a timeout or provide the timeout explicitely, but 
> don't hard code it here.
>
> Additional to that we probably need an incorruptible wait.
Initially, I used interruptible wait, I found it sometimes often return 
-ERESTARTSYS without waiting any second when I wrote unit test.
Any ideas for that? still need to interruptible wait?
if no interruptible, we have to use a timeout.


>
>> +        if (ret <= 0)
>> +            return NULL;
>> +    }
>> +    spin_lock(&syncobj->lock);
>> +    fence = drm_syncobj_find_signal_pt_for_point(syncobj, point);
>> +    spin_unlock(&syncobj->lock);
>> +    return fence;
>> +}
>> +
>>   /**
>>    * drm_syncobj_find_fence - lookup and reference the fence in a 
>> sync object
>>    * @file_private: drm file private pointer
>> @@ -234,20 +415,46 @@ static int 
>> drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>>    * contains a reference to the fence, which must be released by 
>> calling
>>    * dma_fence_put().
>>    */
>> -int drm_syncobj_find_fence(struct drm_file *file_private,
>> -               u32 handle, u64 point,
>> -               struct dma_fence **fence)
>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
>> +                 u64 flags, struct dma_fence **fence)
>>   {
>> -    struct drm_syncobj *syncobj = drm_syncobj_find(file_private, 
>> handle);
>>       int ret = 0;
>>         if (!syncobj)
>>           return -ENOENT;
>>   -    *fence = drm_syncobj_fence_get(syncobj);
>> +    drm_syncobj_garbage_collection(syncobj);
>> +    if (syncobj->type == DRM_SYNCOBJ_TYPE_NORMAL) {
>> +        /*NORMAL syncobj always wait on last pt */
>> +        u64 tail_pt_value = syncobj->syncobj_timeline.signal_point;
>> +
>> +        if (tail_pt_value == 0)
>> +            tail_pt_value += DRM_SYNCOBJ_NORMAL_POINT;
>> +        /* NORMAL syncobj doesn't care point value */
>> +        WARN_ON(point != 0);
>> +        *fence = drm_syncobj_point_get(syncobj, tail_pt_value,
>> +                            flags);
>> +    } else if (syncobj->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>> +        *fence = drm_syncobj_point_get(syncobj, point,
>> +                            flags);
>
> Finding the fence is common for normal and timeline fence, only 
> figuring out the sequence number is different.
>
> So that can probably be simplified further here.
will do.

>
>> +    } else {
>> +        DRM_ERROR("Don't support this type syncobj\n");
>> +        *fence = NULL;
>> +    }
>>       if (!*fence) {
>>           ret = -EINVAL;
>>       }
>> +    return ret;
>> +}
>> +EXPORT_SYMBOL(drm_syncobj_search_fence);
>> +int drm_syncobj_find_fence(struct drm_file *file_private,
>> +               u32 handle, u64 point,
>> +               struct dma_fence **fence) {
>> +    struct drm_syncobj *syncobj = drm_syncobj_find(file_private, 
>> handle);
>> +
>> +    int ret = drm_syncobj_search_fence(syncobj, point,
>> +                    DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>> +                    fence);
>>       drm_syncobj_put(syncobj);
>>       return ret;
>>   }
>> @@ -264,7 +471,7 @@ void drm_syncobj_free(struct kref *kref)
>>       struct drm_syncobj *syncobj = container_of(kref,
>>                              struct drm_syncobj,
>>                              refcount);
>> -    drm_syncobj_replace_fence(syncobj, 0, NULL);
>> +    drm_syncobj_timeline_fini(syncobj, &syncobj->syncobj_timeline);
>>       kfree(syncobj);
>>   }
>>   EXPORT_SYMBOL(drm_syncobj_free);
>> @@ -294,6 +501,11 @@ int drm_syncobj_create(struct drm_syncobj 
>> **out_syncobj, uint32_t flags,
>>       kref_init(&syncobj->refcount);
>>       INIT_LIST_HEAD(&syncobj->cb_list);
>>       spin_lock_init(&syncobj->lock);
>> +    if (flags & DRM_SYNCOBJ_CREATE_TYPE_TIMELINE)
>> +        syncobj->type = DRM_SYNCOBJ_TYPE_TIMELINE;
>> +    else
>> +        syncobj->type = DRM_SYNCOBJ_TYPE_NORMAL;
>> +    drm_syncobj_timeline_init(syncobj, &syncobj->syncobj_timeline);
>>         if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
>>           ret = drm_syncobj_assign_null_handle(syncobj);
>> @@ -576,7 +788,8 @@ drm_syncobj_create_ioctl(struct drm_device *dev, 
>> void *data,
>>           return -ENODEV;
>>         /* no valid flags yet */
>> -    if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
>> +    if (args->flags & ~(DRM_SYNCOBJ_CREATE_SIGNALED |
>> +                DRM_SYNCOBJ_CREATE_TYPE_TIMELINE))
>>           return -EINVAL;
>>         return drm_syncobj_create_as_handle(file_private,
>> @@ -669,9 +882,8 @@ static void syncobj_wait_syncobj_func(struct 
>> drm_syncobj *syncobj,
>>       struct syncobj_wait_entry *wait =
>>           container_of(cb, struct syncobj_wait_entry, syncobj_cb);
>>   -    /* This happens inside the syncobj lock */
>> -    wait->fence = 
>> dma_fence_get(rcu_dereference_protected(syncobj->fence,
>> - lockdep_is_held(&syncobj->lock)));
>> +    drm_syncobj_search_fence(syncobj, 0, 0, &wait->fence);
>> +
>>       wake_up_process(wait->task);
>>   }
>>   @@ -698,7 +910,8 @@ static signed long 
>> drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
>>       signaled_count = 0;
>>       for (i = 0; i < count; ++i) {
>>           entries[i].task = current;
>> -        entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
>> +        ret = drm_syncobj_search_fence(syncobjs[i], 0, 0,
>> +                           &entries[i].fence);
>>           if (!entries[i].fence) {
>>               if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>                   continue;
>> @@ -970,12 +1183,21 @@ drm_syncobj_reset_ioctl(struct drm_device 
>> *dev, void *data,
>>       if (ret < 0)
>>           return ret;
>>   -    for (i = 0; i < args->count_handles; i++)
>> -        drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
>> -
>> +    for (i = 0; i < args->count_handles; i++) {
>> +        if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>> +            DRM_ERROR("timeline syncobj cannot reset!\n");
>
> Why not? I mean that should still work or do I miss anything?
timeline semaphore spec doesn't require reset interface, it says the 
timeline value only can be changed by signal operations.

>
>> +            ret = -EINVAL;
>> +            goto out;
>> +        }
>> +        drm_syncobj_timeline_fini(syncobjs[i],
>> +                      &syncobjs[i]->syncobj_timeline);
>> +        drm_syncobj_timeline_init(syncobjs[i],
>> +                      &syncobjs[i]->syncobj_timeline);
>> +    }
>> +out:
>>       drm_syncobj_array_free(syncobjs, args->count_handles);
>>   -    return 0;
>> +    return ret;
>>   }
>>     int
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 0a8d2d64f380..579e91a5858b 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
>>           if (!(flags & I915_EXEC_FENCE_WAIT))
>>               continue;
>>   -        fence = drm_syncobj_fence_get(syncobj);
>> +        drm_syncobj_search_fence(syncobj, 0,
>> +                     DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>
> Not sure if we can make DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT the 
> default here.
>
> Maybe better if drivers explicitly need to ask for it?
if you don't like the flag used in specific driver, I can wrap it to a 
new function.


>
>> +                     &fence);
>>           if (!fence)
>>               return -EINVAL;
>>   diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
>> index 425432b85a87..9535521e6623 100644
>> --- a/include/drm/drm_syncobj.h
>> +++ b/include/drm/drm_syncobj.h
>> @@ -30,6 +30,25 @@
>>     struct drm_syncobj_cb;
>>   +enum drm_syncobj_type {
>> +    DRM_SYNCOBJ_TYPE_NORMAL,
>
> Does anybody have a better suggestion for _TYPE_NORMAL? That sounds 
> like timeline would be special.
How about _TYPE_INDIVIDUAL?

>
>> +    DRM_SYNCOBJ_TYPE_TIMELINE
>> +};
>> +
>> +struct drm_syncobj_timeline {
>> +    wait_queue_head_t    wq;
>> +    u64 timeline_context;
>> +    /**
>> +     * @timeline: syncobj timeline
>> +     */
>> +    u64 timeline;
>> +    u64 signal_point;
>> +
>> +
>> +    struct rb_root wait_pt_tree;
>
> Not used any more?
will clean up.

>
>> +    struct list_head signal_pt_list;
>> +};
>> +
>>   /**
>>    * struct drm_syncobj - sync object.
>>    *
>> @@ -41,19 +60,19 @@ struct drm_syncobj {
>>        */
>>       struct kref refcount;
>>       /**
>> -     * @fence:
>> -     * NULL or a pointer to the fence bound to this object.
>> -     *
>> -     * This field should not be used directly. Use 
>> drm_syncobj_fence_get()
>> -     * and drm_syncobj_replace_fence() instead.
>> +     * @type: indicate syncobj type
>>        */
>> -    struct dma_fence __rcu *fence;
>> +    enum drm_syncobj_type type;
>>       /**
>> -     * @cb_list: List of callbacks to call when the &fence gets 
>> replaced.
>> +     * @syncobj_timeline: timeline
>>        */
>> +    struct drm_syncobj_timeline syncobj_timeline;
>
> Looks like that can now be embedded into drm_syncobj.
will try.

btw: I encounter a problem after removing advanced wait pt, I debug it 
two days, but still don't find reason, from log, it's timeount when 
wait_event_timeout, that means it don't receive signal operation.

"./deqp-vk -n dEQP-VK*semaphore*" will fail on 
'dEQP-VK.synchronization.basic.semaphore.chain' case, but can pass if 
only run that one case like "./deqp-vk -n 
dEQP-VK.synchronization.basic.semaphore.chain".

Both old and my previous implementation can pass and have no this problem.


Thanks,
David Zhou
>
> Regards,
> Christian.
>
>> +    /**
>> +         * @cb_list: List of callbacks to call when the &fence gets 
>> replaced.
>> +         */
>>       struct list_head cb_list;
>>       /**
>> -     * @lock: Protects &cb_list and write-locks &fence.
>> +     * @lock: Protects syncobj list and write-locks &fence.
>>        */
>>       spinlock_t lock;
>>       /**
>> @@ -68,7 +87,7 @@ typedef void (*drm_syncobj_func_t)(struct 
>> drm_syncobj *syncobj,
>>   /**
>>    * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
>>    * @node: used by drm_syncob_add_callback to append this struct to
>> - *      &drm_syncobj.cb_list
>> + *       &drm_syncobj.cb_list
>>    * @func: drm_syncobj_func_t to call
>>    *
>>    * This struct will be initialized by drm_syncobj_add_callback, 
>> additional
>> @@ -106,29 +125,6 @@ drm_syncobj_put(struct drm_syncobj *obj)
>>       kref_put(&obj->refcount, drm_syncobj_free);
>>   }
>>   -/**
>> - * drm_syncobj_fence_get - get a reference to a fence in a sync object
>> - * @syncobj: sync object.
>> - *
>> - * This acquires additional reference to &drm_syncobj.fence 
>> contained in @obj,
>> - * if not NULL. It is illegal to call this without already holding a 
>> reference.
>> - * No locks required.
>> - *
>> - * Returns:
>> - * Either the fence of @obj or NULL if there's none.
>> - */
>> -static inline struct dma_fence *
>> -drm_syncobj_fence_get(struct drm_syncobj *syncobj)
>> -{
>> -    struct dma_fence *fence;
>> -
>> -    rcu_read_lock();
>> -    fence = dma_fence_get_rcu_safe(&syncobj->fence);
>> -    rcu_read_unlock();
>> -
>> -    return fence;
>> -}
>> -
>>   struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
>>                        u32 handle);
>>   void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64 point,
>> @@ -142,5 +138,7 @@ int drm_syncobj_create(struct drm_syncobj 
>> **out_syncobj, uint32_t flags,
>>   int drm_syncobj_get_handle(struct drm_file *file_private,
>>                  struct drm_syncobj *syncobj, u32 *handle);
>>   int drm_syncobj_get_fd(struct drm_syncobj *syncobj, int *p_fd);
>> +int drm_syncobj_search_fence(struct drm_syncobj *syncobj, u64 point,
>> +                 u64 flags, struct dma_fence **fence);
>>     #endif
>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
>> index 300f336633f2..cebdb2541eb7 100644
>> --- a/include/uapi/drm/drm.h
>> +++ b/include/uapi/drm/drm.h
>> @@ -717,6 +717,7 @@ struct drm_prime_handle {
>>   struct drm_syncobj_create {
>>       __u32 handle;
>>   #define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
>> +#define DRM_SYNCOBJ_CREATE_TYPE_TIMELINE (1 << 1)
>>       __u32 flags;
>>   };
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
  2018-09-17  9:33     ` zhoucm1
@ 2018-09-18  8:32       ` Christian König
  2018-09-19  3:20         ` zhoucm1
  0 siblings, 1 reply; 12+ messages in thread
From: Christian König @ 2018-09-18  8:32 UTC (permalink / raw)
  To: zhoucm1, Christian König, Chunming Zhou, dri-devel
  Cc: Dave Airlie, Daniel Rakos, amd-gfx

Am 17.09.2018 um 11:33 schrieb zhoucm1:
> [SNIP]
>>>   +static struct dma_fence *
>>> +drm_syncobj_point_get(struct drm_syncobj *syncobj, u64 point, u64 
>>> flags)
>>> +{
>>> +    struct dma_fence *fence;
>>> +    int ret = 0;
>>> +
>>> +    if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
>>> +        ret = wait_event_timeout(syncobj->syncobj_timeline.wq,
>>> +                    point <= syncobj->syncobj_timeline.signal_point,
>>> +                    msecs_to_jiffies(10000)); /* wait 10s */
>>
>> Either don't use a timeout or provide the timeout explicitely, but 
>> don't hard code it here.
>>
>> Additional to that we probably need an incorruptible wait.
> Initially, I used interruptible wait, I found it sometimes often 
> return -ERESTARTSYS without waiting any second when I wrote unit test.
> Any ideas for that?

Yeah, that is perfectly normal. The application just received a signal 
and we need to abort the IOCTL ASAP.

> still need to interruptible wait?

Yes, that should certainly be an interruptible wait.

Otherwise an application could block on this forever when somebody 
forgets to supply a signal point.

>>>   -    for (i = 0; i < args->count_handles; i++)
>>> -        drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
>>> -
>>> +    for (i = 0; i < args->count_handles; i++) {
>>> +        if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>> +            DRM_ERROR("timeline syncobj cannot reset!\n");
>>
>> Why not? I mean that should still work or do I miss anything?
> timeline semaphore spec doesn't require reset interface, it says the 
> timeline value only can be changed by signal operations.

Yeah, but we don't care about the timeline spec in the kernel.

Question is rather if that still makes sense to support that and as far 
as I can see it should be trivial to reinitialize the object.

>>> +            ret = -EINVAL;
>>> +            goto out;
>>> +        }
>>> +        drm_syncobj_timeline_fini(syncobjs[i],
>>> +                      &syncobjs[i]->syncobj_timeline);
>>> +        drm_syncobj_timeline_init(syncobjs[i],
>>> +                      &syncobjs[i]->syncobj_timeline);
>>> +    }
>>> +out:
>>>       drm_syncobj_array_free(syncobjs, args->count_handles);
>>>   -    return 0;
>>> +    return ret;
>>>   }
>>>     int
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c 
>>> b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> index 0a8d2d64f380..579e91a5858b 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> @@ -2137,7 +2137,9 @@ await_fence_array(struct i915_execbuffer *eb,
>>>           if (!(flags & I915_EXEC_FENCE_WAIT))
>>>               continue;
>>>   -        fence = drm_syncobj_fence_get(syncobj);
>>> +        drm_syncobj_search_fence(syncobj, 0,
>>> +                     DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
>>
>> Not sure if we can make DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT the 
>> default here.
>>
>> Maybe better if drivers explicitly need to ask for it?
> if you don't like the flag used in specific driver, I can wrap it to a 
> new function.

Yeah, that is probably a good idea.

>>
>>> +                     &fence);
>>>           if (!fence)
>>>               return -EINVAL;
>>>   diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
>>> index 425432b85a87..9535521e6623 100644
>>> --- a/include/drm/drm_syncobj.h
>>> +++ b/include/drm/drm_syncobj.h
>>> @@ -30,6 +30,25 @@
>>>     struct drm_syncobj_cb;
>>>   +enum drm_syncobj_type {
>>> +    DRM_SYNCOBJ_TYPE_NORMAL,
>>
>> Does anybody have a better suggestion for _TYPE_NORMAL? That sounds 
>> like timeline would be special.
> How about _TYPE_INDIVIDUAL?

Oh, really good idea.

> btw: I encounter a problem after removing advanced wait pt, I debug it 
> two days, but still don't find reason, from log, it's timeount when 
> wait_event_timeout, that means it don't receive signal operation.

Good question. Maybe a race condition? Have you tried to trace both 
threads to the trace buffer, e.g. use trace_printk?

> "./deqp-vk -n dEQP-VK*semaphore*" will fail on 
> 'dEQP-VK.synchronization.basic.semaphore.chain' case, but can pass if 
> only run that one case like "./deqp-vk -n 
> dEQP-VK.synchronization.basic.semaphore.chain".
>
> Both old and my previous implementation can pass and have no this 
> problem.

Can you reproduce that as well with the unit tests in libdrm?

Christian.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
  2018-09-18  8:32       ` Christian König
@ 2018-09-19  3:20         ` zhoucm1
       [not found]           ` <0231250d-0ed0-e99b-84c6-166b47a0b9cc-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: zhoucm1 @ 2018-09-19  3:20 UTC (permalink / raw)
  To: christian.koenig, Chunming Zhou, dri-devel, Daniel Rakos
  Cc: Dave Airlie, amd-gfx


[-- Attachment #1.1: Type: text/plain, Size: 888 bytes --]



On 2018年09月18日 16:32, Christian König wrote:
>>>> +    for (i = 0; i < args->count_handles; i++) {
>>>> +        if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>> +            DRM_ERROR("timeline syncobj cannot reset!\n");
>>>
>>> Why not? I mean that should still work or do I miss anything?
>> timeline semaphore spec doesn't require reset interface, it says the 
>> timeline value only can be changed by signal operations.
>
> Yeah, but we don't care about the timeline spec in the kernel.
>
> Question is rather if that still makes sense to support that and as 
> far as I can see it should be trivial to reinitialize the object. 
Hi Daniel Rakos,

Could you give a comment on this question? Is it necessary to support 
timeline reset interface?  I only see the timeline value can be changed 
by signal operations in Spec.


Thanks,
David Zhou

[-- Attachment #1.2: Type: text/html, Size: 1729 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] [RFC]drm: add syncobj timeline support v5
       [not found]           ` <0231250d-0ed0-e99b-84c6-166b47a0b9cc-5C7GfCeVMHo@public.gmane.org>
@ 2018-09-19  7:21             ` Christian König
  0 siblings, 0 replies; 12+ messages in thread
From: Christian König @ 2018-09-19  7:21 UTC (permalink / raw)
  To: zhoucm1, christian.koenig-5C7GfCeVMHo, Chunming Zhou,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Daniel Rakos
  Cc: Dave Airlie, Daniel Vetter, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1: Type: text/plain, Size: 1398 bytes --]

Am 19.09.2018 um 05:20 schrieb zhoucm1:
>
>
>
> On 2018年09月18日 16:32, Christian König wrote:
>>>>> +    for (i = 0; i < args->count_handles; i++) {
>>>>> +        if (syncobjs[i]->type == DRM_SYNCOBJ_TYPE_TIMELINE) {
>>>>> +            DRM_ERROR("timeline syncobj cannot reset!\n");
>>>>
>>>> Why not? I mean that should still work or do I miss anything?
>>> timeline semaphore spec doesn't require reset interface, it says the 
>>> timeline value only can be changed by signal operations.
>>
>> Yeah, but we don't care about the timeline spec in the kernel.
>>
>> Question is rather if that still makes sense to support that and as 
>> far as I can see it should be trivial to reinitialize the object. 
> Hi Daniel Rakos,
>
> Could you give a comment on this question? Is it necessary to support 
> timeline reset interface?  I only see the timeline value can be 
> changed by signal operations in Spec.

I think the Vulkan spec is rather irrelevant here. We added the reset 
operation because of drivers which wanted to re-use an existing syncobj.

That is still a valid use case even when Vulkan doesn't have an 
interface for it.

Christian.

>
>
> Thanks,
> David Zhou
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


[-- Attachment #1.2: Type: text/html, Size: 3036 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-09-19  7:21 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-14 10:37 [PATCH] [RFC]drm: add syncobj timeline support v5 Chunming Zhou
2018-09-14 10:49 ` Christian König
     [not found]   ` <9a07a371-8de1-2ebf-66e7-23b93c1bc1c3-5C7GfCeVMHo@public.gmane.org>
2018-09-14 16:10     ` Daniel Vetter
     [not found]       ` <20180914161059.GR11082-dv86pmgwkMBes7Z6vYuT8azUEOm+Xw19@public.gmane.org>
2018-09-14 16:43         ` Christian König
2018-09-14 18:24           ` Daniel Vetter
     [not found]             ` <CAKMK7uH9jLO0bJe8ek-JyK3u1jbKY6Jk6xgT9ifgcko7Hi2URA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2018-09-14 18:27               ` Christian König
2018-09-17  2:23         ` Zhou, David(ChunMing)
2018-09-17  8:37 ` Christian König
     [not found]   ` <a5d2d299-4360-4267-046e-40b550239524-5C7GfCeVMHo@public.gmane.org>
2018-09-17  9:33     ` zhoucm1
2018-09-18  8:32       ` Christian König
2018-09-19  3:20         ` zhoucm1
     [not found]           ` <0231250d-0ed0-e99b-84c6-166b47a0b9cc-5C7GfCeVMHo@public.gmane.org>
2018-09-19  7:21             ` Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.