All of lore.kernel.org
 help / color / mirror / Atom feed
* restart syncobj timeline changes
@ 2018-11-15 11:12 Christian König
  2018-11-15 11:12 ` [PATCH 1/7] dma-buf: make fence sequence numbers 64 bit Christian König
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Christian König @ 2018-11-15 11:12 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.vetter

Ok that turned out to be more problematic than I ever thought it would be.

Taking a closer look I've found that the previous timeline implementation not only had a number of implementation bugs, but also some fundamental design problems.

Especially we can't enforce that fences are added in the correct order, e.g. when a new timeline fence is added it can in theory be that another one was added in the meantime and we can't guarantee the correct order any more.

So this new set of patches does a conservative approach on handling those kinds of errors. When userspace does something broken it can keep the pieces, but the kernel will always make sure that it does the right thing (e.g. not leak memory, crash, corrupt data, etc....).

What I've did to the implementation is to split up the patch into smaller pieces and move out most of the logic into a new dma-fence-chain container.

This way we can make sure that we don't run into resource management problems (the container is garbage collected) and at the same time minimize the changes to the existing drm_syncobj implementation to make sure not to regress any existing users.

The whole set is only barely smoke tested, cause I wanted to get feedback as early as possible on this. But as far as I can see at least this approach should work in general.

Please comment and/or review,
Christian.

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/7] dma-buf: make fence sequence numbers 64 bit
  2018-11-15 11:12 restart syncobj timeline changes Christian König
@ 2018-11-15 11:12 ` Christian König
  2018-11-16  5:26   ` Zhou, David(ChunMing)
  2018-11-15 11:12 ` [PATCH 2/7] dma-buf: add new dma_fence_chain container Christian König
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-15 11:12 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.vetter

For a lot of use cases we need 64bit sequence numbers. Currently drivers
overload the dma_fence structure to store the additional bits.

Stop doing that and make the sequence number in the dma_fence always
64bit.

For compatibility with hardware which can do only 32bit sequences the
comparisons in __dma_fence_is_later still only takes the lower 32bits as
significant.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-fence.c            |  2 +-
 drivers/dma-buf/sw_sync.c              |  2 +-
 drivers/dma-buf/sync_file.c            |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c |  2 +-
 drivers/gpu/drm/i915/i915_sw_fence.c   |  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c |  2 +-
 drivers/gpu/drm/vgem/vgem_fence.c      |  4 ++--
 include/linux/dma-fence.h              | 14 +++++++-------
 8 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 1551ca7df394..37e24b69e94b 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -615,7 +615,7 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
  */
 void
 dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
-	       spinlock_t *lock, u64 context, unsigned seqno)
+	       spinlock_t *lock, u64 context, u64 seqno)
 {
 	BUG_ON(!lock);
 	BUG_ON(!ops || !ops->get_driver_name || !ops->get_timeline_name);
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 53c1d6d36a64..32dcf7b4c935 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -172,7 +172,7 @@ static bool timeline_fence_enable_signaling(struct dma_fence *fence)
 static void timeline_fence_value_str(struct dma_fence *fence,
 				    char *str, int size)
 {
-	snprintf(str, size, "%d", fence->seqno);
+	snprintf(str, size, "%lld", fence->seqno);
 }
 
 static void timeline_fence_timeline_value_str(struct dma_fence *fence,
diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c
index 35dd06479867..4f6305ca52c8 100644
--- a/drivers/dma-buf/sync_file.c
+++ b/drivers/dma-buf/sync_file.c
@@ -144,7 +144,7 @@ char *sync_file_get_name(struct sync_file *sync_file, char *buf, int len)
 	} else {
 		struct dma_fence *fence = sync_file->fence;
 
-		snprintf(buf, len, "%s-%s%llu-%d",
+		snprintf(buf, len, "%s-%s%llu-%lld",
 			 fence->ops->get_driver_name(fence),
 			 fence->ops->get_timeline_name(fence),
 			 fence->context,
@@ -258,7 +258,7 @@ static struct sync_file *sync_file_merge(const char *name, struct sync_file *a,
 
 			i_b++;
 		} else {
-			if (pt_a->seqno - pt_b->seqno <= INT_MAX)
+			if (__dma_fence_is_later(pt_a->seqno, pt_b->seqno))
 				add_fence(fences, &i, pt_a);
 			else
 				add_fence(fences, &i, pt_b);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
index 12f2bf97611f..bfaf5c6323be 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
@@ -388,7 +388,7 @@ void amdgpu_sa_bo_dump_debug_info(struct amdgpu_sa_manager *sa_manager,
 			   soffset, eoffset, eoffset - soffset);
 
 		if (i->fence)
-			seq_printf(m, " protected by 0x%08x on context %llu",
+			seq_printf(m, " protected by 0x%016llx on context %llu",
 				   i->fence->seqno, i->fence->context);
 
 		seq_printf(m, "\n");
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 6dbeed079ae5..11bcdabd5177 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -393,7 +393,7 @@ static void timer_i915_sw_fence_wake(struct timer_list *t)
 	if (!fence)
 		return;
 
-	pr_notice("Asynchronous wait on fence %s:%s:%x timed out (hint:%pS)\n",
+	pr_notice("Asynchronous wait on fence %s:%s:%llx timed out (hint:%pS)\n",
 		  cb->dma->ops->get_driver_name(cb->dma),
 		  cb->dma->ops->get_timeline_name(cb->dma),
 		  cb->dma->seqno,
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 217ed3ee1cab..f28a66c67d34 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1236,7 +1236,7 @@ static void print_request(struct drm_printer *m,
 
 	x = print_sched_attr(rq->i915, &rq->sched.attr, buf, x, sizeof(buf));
 
-	drm_printf(m, "%s%x%s [%llx:%x]%s @ %dms: %s\n",
+	drm_printf(m, "%s%x%s [%llx:%llx]%s @ %dms: %s\n",
 		   prefix,
 		   rq->global_seqno,
 		   i915_request_completed(rq) ? "!" : "",
diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
index e6ee71323a66..1fdb09fe71d3 100644
--- a/drivers/gpu/drm/vgem/vgem_fence.c
+++ b/drivers/gpu/drm/vgem/vgem_fence.c
@@ -53,13 +53,13 @@ static void vgem_fence_release(struct dma_fence *base)
 
 static void vgem_fence_value_str(struct dma_fence *fence, char *str, int size)
 {
-	snprintf(str, size, "%u", fence->seqno);
+	snprintf(str, size, "%llu", fence->seqno);
 }
 
 static void vgem_fence_timeline_value_str(struct dma_fence *fence, char *str,
 					  int size)
 {
-	snprintf(str, size, "%u",
+	snprintf(str, size, "%llu",
 		 dma_fence_is_signaled(fence) ? fence->seqno : 0);
 }
 
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 02dba8cd033d..1393529c0c1f 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -77,7 +77,7 @@ struct dma_fence {
 	struct list_head cb_list;
 	spinlock_t *lock;
 	u64 context;
-	unsigned seqno;
+	u64 seqno;
 	unsigned long flags;
 	ktime_t timestamp;
 	int error;
@@ -244,7 +244,7 @@ struct dma_fence_ops {
 };
 
 void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
-		    spinlock_t *lock, u64 context, unsigned seqno);
+		    spinlock_t *lock, u64 context, u64 seqno);
 
 void dma_fence_release(struct kref *kref);
 void dma_fence_free(struct dma_fence *fence);
@@ -414,9 +414,9 @@ dma_fence_is_signaled(struct dma_fence *fence)
  * Returns true if f1 is chronologically later than f2. Both fences must be
  * from the same context, since a seqno is not common across contexts.
  */
-static inline bool __dma_fence_is_later(u32 f1, u32 f2)
+static inline bool __dma_fence_is_later(u64 f1, u64 f2)
 {
-	return (int)(f1 - f2) > 0;
+	return (int)(lower_32_bits(f1) - lower_32_bits(f2)) > 0;
 }
 
 /**
@@ -547,21 +547,21 @@ u64 dma_fence_context_alloc(unsigned num);
 	do {								\
 		struct dma_fence *__ff = (f);				\
 		if (IS_ENABLED(CONFIG_DMA_FENCE_TRACE))			\
-			pr_info("f %llu#%u: " fmt,			\
+			pr_info("f %llu#%llu: " fmt,			\
 				__ff->context, __ff->seqno, ##args);	\
 	} while (0)
 
 #define DMA_FENCE_WARN(f, fmt, args...) \
 	do {								\
 		struct dma_fence *__ff = (f);				\
-		pr_warn("f %llu#%u: " fmt, __ff->context, __ff->seqno,	\
+		pr_warn("f %llu#%llu: " fmt, __ff->context, __ff->seqno,\
 			 ##args);					\
 	} while (0)
 
 #define DMA_FENCE_ERR(f, fmt, args...) \
 	do {								\
 		struct dma_fence *__ff = (f);				\
-		pr_err("f %llu#%u: " fmt, __ff->context, __ff->seqno,	\
+		pr_err("f %llu#%llu: " fmt, __ff->context, __ff->seqno,	\
 			##args);					\
 	} while (0)
 
-- 
2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/7] dma-buf: add new dma_fence_chain container
  2018-11-15 11:12 restart syncobj timeline changes Christian König
  2018-11-15 11:12 ` [PATCH 1/7] dma-buf: make fence sequence numbers 64 bit Christian König
@ 2018-11-15 11:12 ` Christian König
  2018-11-16  5:51   ` Zhou, David(ChunMing)
  2018-11-15 11:12 ` [PATCH 3/7] drm: revert "expand replace_fence to support timeline point v2" Christian König
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-15 11:12 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.vetter

Lockless container implementation similar to a dma_fence_array, but with
only two elements per node and automatic garbage collection.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/Makefile          |   3 +-
 drivers/dma-buf/dma-fence-chain.c | 186 ++++++++++++++++++++++++++++++++++++++
 include/linux/dma-fence-chain.h   |  69 ++++++++++++++
 3 files changed, 257 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma-buf/dma-fence-chain.c
 create mode 100644 include/linux/dma-fence-chain.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 0913a6ccab5a..1f006e083eb9 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,4 +1,5 @@
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-fence.o
+obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
+	 reservation.o seqno-fence.o
 obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
 obj-$(CONFIG_SW_SYNC)		+= sw_sync.o sync_debug.o
 obj-$(CONFIG_UDMABUF)		+= udmabuf.o
diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
new file mode 100644
index 000000000000..ac830b886589
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -0,0 +1,186 @@
+/*
+ * fence-chain: chain fences together in a timeline
+ *
+ * Copyright (C) 2018 Advanced Micro Devices, Inc.
+ * Authors:
+ *	Christian König <christian.koenig@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/dma-fence-chain.h>
+
+static bool dma_fence_chain_enable_signaling(struct dma_fence *fence);
+
+/**
+ * dma_fence_chain_get_prev - use RCU to get a reference to the previous fence
+ * @chain: chain node to get the previous node from
+ *
+ * Use dma_fence_get_rcu_safe to get a reference to the previous fence of the
+ * chain node.
+ */
+static struct dma_fence *dma_fence_chain_get_prev(struct dma_fence_chain *chain)
+{
+	struct dma_fence *prev;
+
+	rcu_read_lock();
+	prev = dma_fence_get_rcu_safe(&chain->prev);
+	rcu_read_unlock();
+	return prev;
+}
+
+/**
+ * dma_fence_chain_walk - chain walking function
+ * @fence: current chain node
+ *
+ * Walk the chain to the next node. Returns the next fence or NULL if we are at
+ * the end of the chain. Garbage collects chain nodes which are already
+ * signaled.
+ */
+struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence)
+{
+	struct dma_fence_chain *chain, *prev_chain;
+	struct dma_fence *prev, *prev_prev, *tmp;
+
+	chain = to_dma_fence_chain(fence);
+	if (!chain) {
+		dma_fence_put(fence);
+		return NULL;
+	}
+
+	while ((prev = dma_fence_chain_get_prev(chain))) {
+
+		prev_chain = to_dma_fence_chain(prev);
+		if (!prev_chain || !dma_fence_is_signaled(prev_chain->fence))
+			break;
+
+		prev_prev = dma_fence_chain_get_prev(prev_chain);
+		tmp = cmpxchg(&chain->prev, prev, prev_prev);
+		if (tmp == prev)
+			dma_fence_put(tmp);
+		else
+			dma_fence_put(prev_prev);
+		dma_fence_put(prev);
+	}
+
+	dma_fence_put(fence);
+	return prev;
+}
+EXPORT_SYMBOL(dma_fence_chain_walk);
+
+static const char *dma_fence_chain_get_driver_name(struct dma_fence *fence)
+{
+        return "dma_fence_chain";
+}
+
+static const char *dma_fence_chain_get_timeline_name(struct dma_fence *fence)
+{
+        return "unbound";
+}
+
+static void dma_fence_chain_irq_work(struct irq_work *work)
+{
+	struct dma_fence_chain *chain;
+
+	chain = container_of(work, typeof(*chain), work);
+
+	/* Try to rearm the callback */
+	if (!dma_fence_chain_enable_signaling(&chain->base))
+		/* Ok, we are done. No more unsignaled fences left */
+		dma_fence_signal(&chain->base);
+}
+
+static void dma_fence_chain_cb(struct dma_fence *f, struct dma_fence_cb *cb)
+{
+	struct dma_fence_chain *chain;
+
+	chain = container_of(cb, typeof(*chain), cb);
+	irq_work_queue(&chain->work);
+	dma_fence_put(f);
+}
+
+static bool dma_fence_chain_enable_signaling(struct dma_fence *fence)
+{
+	struct dma_fence_chain *head = to_dma_fence_chain(fence);
+
+	dma_fence_chain_for_each(fence) {
+		struct dma_fence_chain *chain = to_dma_fence_chain(fence);
+		struct dma_fence *f = chain ? chain->fence : fence;
+
+		dma_fence_get(f);
+		if (!dma_fence_add_callback(f, &head->cb, dma_fence_chain_cb))
+			return true;
+		dma_fence_put(f);
+	}
+	return false;
+}
+
+static bool dma_fence_chain_signaled(struct dma_fence *fence)
+{
+	dma_fence_chain_for_each(fence) {
+		struct dma_fence_chain *chain = to_dma_fence_chain(fence);
+		struct dma_fence *f = chain ? chain->fence : fence;
+
+		if (!dma_fence_is_signaled(f))
+			return false;
+	}
+
+	return true;
+}
+
+static void dma_fence_chain_release(struct dma_fence *fence)
+{
+	struct dma_fence_chain *chain = to_dma_fence_chain(fence);
+
+	dma_fence_put(chain->prev);
+	dma_fence_put(chain->fence);
+	dma_fence_free(fence);
+}
+
+const struct dma_fence_ops dma_fence_chain_ops = {
+	.get_driver_name = dma_fence_chain_get_driver_name,
+	.get_timeline_name = dma_fence_chain_get_timeline_name,
+	.enable_signaling = dma_fence_chain_enable_signaling,
+	.signaled = dma_fence_chain_signaled,
+	.release = dma_fence_chain_release,
+};
+EXPORT_SYMBOL(dma_fence_chain_ops);
+
+/**
+ * dma_fence_chain_init - initialize a fence chain
+ * @chain: the chain node to initialize
+ * @prev: the previous fence
+ * @fence: the current fence
+ *
+ * Initialize a new chain node and either start a new chain or add the node to
+ * the existing chain of the previous fence.
+ */
+void dma_fence_chain_init(struct dma_fence_chain *chain,
+			  struct dma_fence *prev,
+			  struct dma_fence *fence,
+			  uint64_t seqno)
+{
+	uint64_t context;
+
+	/* Try to reuse the context of the previous chain node. */
+	if (to_dma_fence_chain(prev) &&
+	    __dma_fence_is_later(seqno, prev->seqno))
+		context = prev->context;
+	else
+		context = dma_fence_context_alloc(1);
+
+	spin_lock_init(&chain->lock);
+	dma_fence_init(&chain->base, &dma_fence_chain_ops,
+		       &chain->lock, context, seqno);
+	chain->prev = prev;
+	chain->fence = fence;
+	init_irq_work(&chain->work, dma_fence_chain_irq_work);
+}
+EXPORT_SYMBOL(dma_fence_chain_init);
diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma-fence-chain.h
new file mode 100644
index 000000000000..0e295698fcf8
--- /dev/null
+++ b/include/linux/dma-fence-chain.h
@@ -0,0 +1,69 @@
+/*
+ * fence-chain: chain fences together in a timeline
+ *
+ * Copyright (C) 2018 Advanced Micro Devices, Inc.
+ * Authors:
+ *	Christian König <christian.koenig@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef __LINUX_DMA_FENCE_CHAIN_H
+#define __LINUX_DMA_FENCE_CHAIN_H
+
+#include <linux/dma-fence.h>
+#include <linux/irq_work.h>
+
+/**
+ * struct dma_fence_chain - fence to represent an node of a fence chain
+ * @base: fence base class
+ * @lock: spinlock for fence handling
+ * @prev: previous fence of the chain
+ * @fence: encapsulated fence
+ * @cb: callback structure for signaling
+ * @work: irq work item for signaling
+ */
+struct dma_fence_chain {
+	struct dma_fence base;
+	spinlock_t lock;
+	struct dma_fence *prev;
+	struct dma_fence *fence;
+	struct dma_fence_cb cb;
+	struct irq_work work;
+};
+
+extern const struct dma_fence_ops dma_fence_chain_ops;
+
+/**
+ * to_dma_fence_chain - cast a fence to a dma_fence_chain
+ * @fence: fence to cast to a dma_fence_array
+ *
+ * Returns NULL if the fence is not a dma_fence_chain,
+ * or the dma_fence_chain otherwise.
+ */
+static inline struct dma_fence_chain *
+to_dma_fence_chain(struct dma_fence *fence)
+{
+	if (fence->ops != &dma_fence_chain_ops)
+		return NULL;
+
+	return container_of(fence, struct dma_fence_chain, base);
+}
+
+#define dma_fence_chain_for_each(fence)	\
+	for (dma_fence_get(fence);fence;fence=dma_fence_chain_walk(fence))
+
+struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence);
+void dma_fence_chain_init(struct dma_fence_chain *chain,
+			  struct dma_fence *prev,
+			  struct dma_fence *fence,
+			  uint64_t seqno);
+
+#endif /* __LINUX_DMA_FENCE_CHAIN_H */
-- 
2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/7] drm: revert "expand replace_fence to support timeline point v2"
  2018-11-15 11:12 restart syncobj timeline changes Christian König
  2018-11-15 11:12 ` [PATCH 1/7] dma-buf: make fence sequence numbers 64 bit Christian König
  2018-11-15 11:12 ` [PATCH 2/7] dma-buf: add new dma_fence_chain container Christian König
@ 2018-11-15 11:12 ` Christian König
  2018-11-15 11:12 ` [PATCH 4/7] drm/syncobj: use only a single stub fence Christian König
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Christian König @ 2018-11-15 11:12 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.vetter

This reverts commit 9a09a42369a4a37a959c051d8e1a1f948c1529a4.

The whole interface isn't thought through. Since this function can't
fail we actually can't allocate an object to store the sync point.

Sorry, I should have taken the lead on this from the very beginning and
reviewed it more thoughtfully. Going to propose a new interface as a
follow up change.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c     |  2 +-
 drivers/gpu/drm/drm_syncobj.c              | 14 ++++++--------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 drivers/gpu/drm/v3d/v3d_gem.c              |  2 +-
 drivers/gpu/drm/vc4/vc4_gem.c              |  2 +-
 include/drm/drm_syncobj.h                  |  2 +-
 6 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index fd12e9162f3c..a950b219b60e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1193,7 +1193,7 @@ static void amdgpu_cs_post_dependencies(struct amdgpu_cs_parser *p)
 	int i;
 
 	for (i = 0; i < p->num_post_dep_syncobjs; ++i)
-		drm_syncobj_replace_fence(p->post_dep_syncobjs[i], 0, p->fence);
+		drm_syncobj_replace_fence(p->post_dep_syncobjs[i], p->fence);
 }
 
 static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 5c2091dbd230..f190414511ae 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -158,13 +158,11 @@ void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @syncobj: Sync object to replace fence in
- * @point: timeline point
  * @fence: fence to install in sync file.
  *
- * This replaces the fence on a sync object, or a timeline point fence.
+ * This replaces the fence on a sync object.
  */
 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
-			       u64 point,
 			       struct dma_fence *fence)
 {
 	struct dma_fence *old_fence;
@@ -204,7 +202,7 @@ static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 		       &fence->lock, 0, 0);
 	dma_fence_signal(&fence->base);
 
-	drm_syncobj_replace_fence(syncobj, 0, &fence->base);
+	drm_syncobj_replace_fence(syncobj, &fence->base);
 
 	dma_fence_put(&fence->base);
 
@@ -255,7 +253,7 @@ void drm_syncobj_free(struct kref *kref)
 	struct drm_syncobj *syncobj = container_of(kref,
 						   struct drm_syncobj,
 						   refcount);
-	drm_syncobj_replace_fence(syncobj, 0, NULL);
+	drm_syncobj_replace_fence(syncobj, NULL);
 	kfree(syncobj);
 }
 EXPORT_SYMBOL(drm_syncobj_free);
@@ -295,7 +293,7 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
 	}
 
 	if (fence)
-		drm_syncobj_replace_fence(syncobj, 0, fence);
+		drm_syncobj_replace_fence(syncobj, fence);
 
 	*out_syncobj = syncobj;
 	return 0;
@@ -480,7 +478,7 @@ static int drm_syncobj_import_sync_file_fence(struct drm_file *file_private,
 		return -ENOENT;
 	}
 
-	drm_syncobj_replace_fence(syncobj, 0, fence);
+	drm_syncobj_replace_fence(syncobj, fence);
 	dma_fence_put(fence);
 	drm_syncobj_put(syncobj);
 	return 0;
@@ -954,7 +952,7 @@ drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
 		return ret;
 
 	for (i = 0; i < args->count_handles; i++)
-		drm_syncobj_replace_fence(syncobjs[i], 0, NULL);
+		drm_syncobj_replace_fence(syncobjs[i], NULL);
 
 	drm_syncobj_array_free(syncobjs, args->count_handles);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 09187286d346..bb0980793b48 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2186,7 +2186,7 @@ signal_fence_array(struct i915_execbuffer *eb,
 		if (!(flags & I915_EXEC_FENCE_SIGNAL))
 			continue;
 
-		drm_syncobj_replace_fence(syncobj, 0, fence);
+		drm_syncobj_replace_fence(syncobj, fence);
 	}
 }
 
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 70c54774400b..9b9ab34fb461 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -584,7 +584,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 	/* Update the return sync object for the */
 	sync_out = drm_syncobj_find(file_priv, args->out_sync);
 	if (sync_out) {
-		drm_syncobj_replace_fence(sync_out, 0,
+		drm_syncobj_replace_fence(sync_out,
 					  &exec->render.base.s_fence->finished);
 		drm_syncobj_put(sync_out);
 	}
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index 5b22e996af6c..928718b467bd 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -681,7 +681,7 @@ vc4_queue_submit(struct drm_device *dev, struct vc4_exec_info *exec,
 	exec->fence = &fence->base;
 
 	if (out_sync)
-		drm_syncobj_replace_fence(out_sync, 0, exec->fence);
+		drm_syncobj_replace_fence(out_sync, exec->fence);
 
 	vc4_update_bo_seqnos(exec, seqno);
 
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 425432b85a87..ab9055f943c7 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -131,7 +131,7 @@ drm_syncobj_fence_get(struct drm_syncobj *syncobj)
 
 struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
-void drm_syncobj_replace_fence(struct drm_syncobj *syncobj, u64 point,
+void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
 int drm_syncobj_find_fence(struct drm_file *file_private,
 			   u32 handle, u64 point,
-- 
2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/7] drm/syncobj: use only a single stub fence
  2018-11-15 11:12 restart syncobj timeline changes Christian König
                   ` (2 preceding siblings ...)
  2018-11-15 11:12 ` [PATCH 3/7] drm: revert "expand replace_fence to support timeline point v2" Christian König
@ 2018-11-15 11:12 ` Christian König
  2018-11-16  5:54   ` Zhou, David(ChunMing)
  2018-11-15 11:12 ` [PATCH 5/7] drm/syncobj: move drm_syncobj_cb into drm_syncobj.c Christian König
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-15 11:12 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.vetter

Extract of useful code from the timeline work. Let's use just a single
stub fence instance instead of allocating a new one all the time.

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_syncobj.c | 67 ++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index f190414511ae..4c45acb326b9 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,10 +56,8 @@
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
 
-struct drm_syncobj_stub_fence {
-	struct dma_fence base;
-	spinlock_t lock;
-};
+static DEFINE_SPINLOCK(stub_fence_lock);
+static struct dma_fence stub_fence;
 
 static const char *drm_syncobj_stub_fence_get_name(struct dma_fence *fence)
 {
@@ -71,6 +69,25 @@ static const struct dma_fence_ops drm_syncobj_stub_fence_ops = {
 	.get_timeline_name = drm_syncobj_stub_fence_get_name,
 };
 
+/**
+ * drm_syncobj_get_stub_fence - return a signaled fence
+ *
+ * Return a stub fence which is already signaled.
+ */
+static struct dma_fence *drm_syncobj_get_stub_fence(void)
+{
+	spin_lock(&stub_fence_lock);
+	if (!stub_fence.ops) {
+		dma_fence_init(&stub_fence,
+			       &drm_syncobj_stub_fence_ops,
+			       &stub_fence_lock,
+			       0, 0);
+		dma_fence_signal_locked(&stub_fence);
+	}
+	spin_unlock(&stub_fence_lock);
+
+	return dma_fence_get(&stub_fence);
+}
 
 /**
  * drm_syncobj_find - lookup and reference a sync object.
@@ -190,23 +207,18 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 }
 EXPORT_SYMBOL(drm_syncobj_replace_fence);
 
-static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+/**
+ * drm_syncobj_assign_null_handle - assign a stub fence to the sync object
+ * @syncobj: sync object to assign the fence on
+ *
+ * Assign a already signaled stub fence to the sync object.
+ */
+static void drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
 {
-	struct drm_syncobj_stub_fence *fence;
-	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
-	if (fence == NULL)
-		return -ENOMEM;
+	struct dma_fence *fence = drm_syncobj_get_stub_fence();
 
-	spin_lock_init(&fence->lock);
-	dma_fence_init(&fence->base, &drm_syncobj_stub_fence_ops,
-		       &fence->lock, 0, 0);
-	dma_fence_signal(&fence->base);
-
-	drm_syncobj_replace_fence(syncobj, &fence->base);
-
-	dma_fence_put(&fence->base);
-
-	return 0;
+	drm_syncobj_replace_fence(syncobj, fence);
+	dma_fence_put(fence);
 }
 
 /**
@@ -273,7 +285,6 @@ EXPORT_SYMBOL(drm_syncobj_free);
 int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
 		       struct dma_fence *fence)
 {
-	int ret;
 	struct drm_syncobj *syncobj;
 
 	syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL);
@@ -284,13 +295,8 @@ int drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
 	INIT_LIST_HEAD(&syncobj->cb_list);
 	spin_lock_init(&syncobj->lock);
 
-	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
-		ret = drm_syncobj_assign_null_handle(syncobj);
-		if (ret < 0) {
-			drm_syncobj_put(syncobj);
-			return ret;
-		}
-	}
+	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED)
+		drm_syncobj_assign_null_handle(syncobj);
 
 	if (fence)
 		drm_syncobj_replace_fence(syncobj, fence);
@@ -984,11 +990,8 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
 	if (ret < 0)
 		return ret;
 
-	for (i = 0; i < args->count_handles; i++) {
-		ret = drm_syncobj_assign_null_handle(syncobjs[i]);
-		if (ret < 0)
-			break;
-	}
+	for (i = 0; i < args->count_handles; i++)
+		drm_syncobj_assign_null_handle(syncobjs[i]);
 
 	drm_syncobj_array_free(syncobjs, args->count_handles);
 
-- 
2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/7] drm/syncobj: move drm_syncobj_cb into drm_syncobj.c
  2018-11-15 11:12 restart syncobj timeline changes Christian König
                   ` (3 preceding siblings ...)
  2018-11-15 11:12 ` [PATCH 4/7] drm/syncobj: use only a single stub fence Christian König
@ 2018-11-15 11:12 ` Christian König
  2018-11-16  5:55   ` Zhou, David(ChunMing)
  2018-11-15 11:12 ` [PATCH 6/7] drm/syncobj: add new drm_syncobj_add_point interface Christian König
  2018-11-15 11:12 ` [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence Christian König
  6 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-15 11:12 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.vetter

Not used outside the file.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_syncobj.c | 21 +++++++++++++++++++++
 include/drm/drm_syncobj.h     | 21 ---------------------
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 4c45acb326b9..4a2e6ef16979 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -56,6 +56,27 @@
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
 
+struct drm_syncobj_cb;
+
+typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
+				   struct drm_syncobj_cb *cb);
+
+/**
+ * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
+ * @node: used by drm_syncob_add_callback to append this struct to
+ *	  &drm_syncobj.cb_list
+ * @func: drm_syncobj_func_t to call
+ *
+ * This struct will be initialized by drm_syncobj_add_callback, additional
+ * data can be passed along by embedding drm_syncobj_cb in another struct.
+ * The callback will get called the next time drm_syncobj_replace_fence is
+ * called.
+ */
+struct drm_syncobj_cb {
+	struct list_head node;
+	drm_syncobj_func_t func;
+};
+
 static DEFINE_SPINLOCK(stub_fence_lock);
 static struct dma_fence stub_fence;
 
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index ab9055f943c7..c79f5ada7cdb 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -28,8 +28,6 @@
 
 #include "linux/dma-fence.h"
 
-struct drm_syncobj_cb;
-
 /**
  * struct drm_syncobj - sync object.
  *
@@ -62,25 +60,6 @@ struct drm_syncobj {
 	struct file *file;
 };
 
-typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
-				   struct drm_syncobj_cb *cb);
-
-/**
- * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
- * @node: used by drm_syncob_add_callback to append this struct to
- *	  &drm_syncobj.cb_list
- * @func: drm_syncobj_func_t to call
- *
- * This struct will be initialized by drm_syncobj_add_callback, additional
- * data can be passed along by embedding drm_syncobj_cb in another struct.
- * The callback will get called the next time drm_syncobj_replace_fence is
- * called.
- */
-struct drm_syncobj_cb {
-	struct list_head node;
-	drm_syncobj_func_t func;
-};
-
 void drm_syncobj_free(struct kref *kref);
 
 /**
-- 
2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/7] drm/syncobj: add new drm_syncobj_add_point interface
  2018-11-15 11:12 restart syncobj timeline changes Christian König
                   ` (4 preceding siblings ...)
  2018-11-15 11:12 ` [PATCH 5/7] drm/syncobj: move drm_syncobj_cb into drm_syncobj.c Christian König
@ 2018-11-15 11:12 ` Christian König
  2018-11-16  6:20   ` Zhou, David(ChunMing)
  2018-11-15 11:12 ` [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence Christian König
  6 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-15 11:12 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.vetter

Use the dma_fence_chain object to create a timeline of fence objects
instead of just replacing the existing fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_syncobj.c | 40 ++++++++++++++++++++++++++++++++++++++++
 include/drm/drm_syncobj.h     |  5 +++++
 2 files changed, 45 insertions(+)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 4a2e6ef16979..589d884ccd58 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -193,6 +193,46 @@ void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
 	spin_unlock(&syncobj->lock);
 }
 
+/**
+ * drm_syncobj_add_point - add new timeline point to the syncobj
+ * @syncobj: sync object to add timeline point do
+ * @chain: chain node to use to add the point
+ * @fence: fence to encapsulate in the chain node
+ * @point: sequence number to use for the point
+ *
+ * Add the chain node as new timeline point to the syncobj.
+ */
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+			   struct dma_fence_chain *chain,
+			   struct dma_fence *fence,
+			   uint64_t point)
+{
+	struct drm_syncobj_cb *cur, *tmp;
+	struct dma_fence *prev;
+
+	dma_fence_get(fence);
+	dma_fence_get(fence);
+
+	spin_lock(&syncobj->lock);
+
+	prev = rcu_dereference_protected(syncobj->fence,
+					 lockdep_is_held(&syncobj->lock));
+	dma_fence_chain_init(chain, prev, fence, point);
+	rcu_assign_pointer(syncobj->fence, &chain->base);
+
+	list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
+		list_del_init(&cur->node);
+		cur->func(syncobj, cur);
+	}
+	spin_unlock(&syncobj->lock);
+
+	/* Walk the chain once to trigger garbage collection */
+	prev = fence;
+	dma_fence_chain_for_each(prev);
+
+	dma_fence_put(fence);
+}
+
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @syncobj: Sync object to replace fence in
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index c79f5ada7cdb..35a917241e30 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -27,6 +27,7 @@
 #define __DRM_SYNCOBJ_H__
 
 #include "linux/dma-fence.h"
+#include "linux/dma-fence-chain.h"
 
 /**
  * struct drm_syncobj - sync object.
@@ -110,6 +111,10 @@ drm_syncobj_fence_get(struct drm_syncobj *syncobj)
 
 struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
+void drm_syncobj_add_point(struct drm_syncobj *syncobj,
+			   struct dma_fence_chain *chain,
+			   struct dma_fence *fence,
+			   uint64_t point);
 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
 int drm_syncobj_find_fence(struct drm_file *file_private,
-- 
2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-15 11:12 restart syncobj timeline changes Christian König
                   ` (5 preceding siblings ...)
  2018-11-15 11:12 ` [PATCH 6/7] drm/syncobj: add new drm_syncobj_add_point interface Christian König
@ 2018-11-15 11:12 ` Christian König
  2018-11-22  6:52   ` zhoucm1
  6 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-15 11:12 UTC (permalink / raw)
  To: dri-devel; +Cc: daniel.vetter

Implement finding the right timeline point in drm_syncobj_find_fence.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 589d884ccd58..d42c51520da4 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file *file_private,
 		return -ENOENT;
 
 	*fence = drm_syncobj_fence_get(syncobj);
-	if (!*fence) {
+	if (!*fence)
 		ret = -EINVAL;
+
+	if (!ret && point) {
+		dma_fence_chain_for_each(*fence) {
+			if (!to_dma_fence_chain(*fence) ||
+			    (*fence)->seqno <= point)
+				break;
+		}
 	}
+
 	drm_syncobj_put(syncobj);
 	return ret;
 }
-- 
2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* RE: [PATCH 1/7] dma-buf: make fence sequence numbers 64 bit
  2018-11-15 11:12 ` [PATCH 1/7] dma-buf: make fence sequence numbers 64 bit Christian König
@ 2018-11-16  5:26   ` Zhou, David(ChunMing)
  0 siblings, 0 replies; 28+ messages in thread
From: Zhou, David(ChunMing) @ 2018-11-16  5:26 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: daniel.vetter

Acked-by: Chunming  Zhou <david1.zhou@amd.com>, it is better that other people from outside can take a look.

> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Thursday, November 15, 2018 7:13 PM
> To: dri-devel@lists.freedesktop.org
> Cc: chris@chris-wilson.co.uk; daniel.vetter@ffwll.ch; eric@anholt.net; Zhou,
> David(ChunMing) <David1.Zhou@amd.com>
> Subject: [PATCH 1/7] dma-buf: make fence sequence numbers 64 bit
> 
> For a lot of use cases we need 64bit sequence numbers. Currently drivers
> overload the dma_fence structure to store the additional bits.
> 
> Stop doing that and make the sequence number in the dma_fence always
> 64bit.
> 
> For compatibility with hardware which can do only 32bit sequences the
> comparisons in __dma_fence_is_later still only takes the lower 32bits as
> significant.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/dma-fence.c            |  2 +-
>  drivers/dma-buf/sw_sync.c              |  2 +-
>  drivers/dma-buf/sync_file.c            |  4 ++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c |  2 +-
>  drivers/gpu/drm/i915/i915_sw_fence.c   |  2 +-
>  drivers/gpu/drm/i915/intel_engine_cs.c |  2 +-
>  drivers/gpu/drm/vgem/vgem_fence.c      |  4 ++--
>  include/linux/dma-fence.h              | 14 +++++++-------
>  8 files changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 1551ca7df394..37e24b69e94b 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -615,7 +615,7 @@ EXPORT_SYMBOL(dma_fence_wait_any_timeout);
>   */
>  void
>  dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops
> *ops,
> -	       spinlock_t *lock, u64 context, unsigned seqno)
> +	       spinlock_t *lock, u64 context, u64 seqno)
>  {
>  	BUG_ON(!lock);
>  	BUG_ON(!ops || !ops->get_driver_name || !ops-
> >get_timeline_name); diff --git a/drivers/dma-buf/sw_sync.c
> b/drivers/dma-buf/sw_sync.c index 53c1d6d36a64..32dcf7b4c935 100644
> --- a/drivers/dma-buf/sw_sync.c
> +++ b/drivers/dma-buf/sw_sync.c
> @@ -172,7 +172,7 @@ static bool timeline_fence_enable_signaling(struct
> dma_fence *fence)  static void timeline_fence_value_str(struct dma_fence
> *fence,
>  				    char *str, int size)
>  {
> -	snprintf(str, size, "%d", fence->seqno);
> +	snprintf(str, size, "%lld", fence->seqno);
>  }
> 
>  static void timeline_fence_timeline_value_str(struct dma_fence *fence, diff
> --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c index
> 35dd06479867..4f6305ca52c8 100644
> --- a/drivers/dma-buf/sync_file.c
> +++ b/drivers/dma-buf/sync_file.c
> @@ -144,7 +144,7 @@ char *sync_file_get_name(struct sync_file *sync_file,
> char *buf, int len)
>  	} else {
>  		struct dma_fence *fence = sync_file->fence;
> 
> -		snprintf(buf, len, "%s-%s%llu-%d",
> +		snprintf(buf, len, "%s-%s%llu-%lld",
>  			 fence->ops->get_driver_name(fence),
>  			 fence->ops->get_timeline_name(fence),
>  			 fence->context,
> @@ -258,7 +258,7 @@ static struct sync_file *sync_file_merge(const char
> *name, struct sync_file *a,
> 
>  			i_b++;
>  		} else {
> -			if (pt_a->seqno - pt_b->seqno <= INT_MAX)
> +			if (__dma_fence_is_later(pt_a->seqno, pt_b-
> >seqno))
>  				add_fence(fences, &i, pt_a);
>  			else
>  				add_fence(fences, &i, pt_b);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
> index 12f2bf97611f..bfaf5c6323be 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
> @@ -388,7 +388,7 @@ void amdgpu_sa_bo_dump_debug_info(struct
> amdgpu_sa_manager *sa_manager,
>  			   soffset, eoffset, eoffset - soffset);
> 
>  		if (i->fence)
> -			seq_printf(m, " protected by 0x%08x on
> context %llu",
> +			seq_printf(m, " protected by 0x%016llx on
> context %llu",
>  				   i->fence->seqno, i->fence->context);
> 
>  		seq_printf(m, "\n");
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c
> b/drivers/gpu/drm/i915/i915_sw_fence.c
> index 6dbeed079ae5..11bcdabd5177 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> @@ -393,7 +393,7 @@ static void timer_i915_sw_fence_wake(struct
> timer_list *t)
>  	if (!fence)
>  		return;
> 
> -	pr_notice("Asynchronous wait on fence %s:%s:%x timed out
> (hint:%pS)\n",
> +	pr_notice("Asynchronous wait on fence %s:%s:%llx timed out
> +(hint:%pS)\n",
>  		  cb->dma->ops->get_driver_name(cb->dma),
>  		  cb->dma->ops->get_timeline_name(cb->dma),
>  		  cb->dma->seqno,
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c
> b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 217ed3ee1cab..f28a66c67d34 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -1236,7 +1236,7 @@ static void print_request(struct drm_printer *m,
> 
>  	x = print_sched_attr(rq->i915, &rq->sched.attr, buf, x, sizeof(buf));
> 
> -	drm_printf(m, "%s%x%s [%llx:%x]%s @ %dms: %s\n",
> +	drm_printf(m, "%s%x%s [%llx:%llx]%s @ %dms: %s\n",
>  		   prefix,
>  		   rq->global_seqno,
>  		   i915_request_completed(rq) ? "!" : "", diff --git
> a/drivers/gpu/drm/vgem/vgem_fence.c
> b/drivers/gpu/drm/vgem/vgem_fence.c
> index e6ee71323a66..1fdb09fe71d3 100644
> --- a/drivers/gpu/drm/vgem/vgem_fence.c
> +++ b/drivers/gpu/drm/vgem/vgem_fence.c
> @@ -53,13 +53,13 @@ static void vgem_fence_release(struct dma_fence
> *base)
> 
>  static void vgem_fence_value_str(struct dma_fence *fence, char *str, int
> size)  {
> -	snprintf(str, size, "%u", fence->seqno);
> +	snprintf(str, size, "%llu", fence->seqno);
>  }
> 
>  static void vgem_fence_timeline_value_str(struct dma_fence *fence, char
> *str,
>  					  int size)
>  {
> -	snprintf(str, size, "%u",
> +	snprintf(str, size, "%llu",
>  		 dma_fence_is_signaled(fence) ? fence->seqno : 0);  }
> 
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index
> 02dba8cd033d..1393529c0c1f 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -77,7 +77,7 @@ struct dma_fence {
>  	struct list_head cb_list;
>  	spinlock_t *lock;
>  	u64 context;
> -	unsigned seqno;
> +	u64 seqno;
>  	unsigned long flags;
>  	ktime_t timestamp;
>  	int error;
> @@ -244,7 +244,7 @@ struct dma_fence_ops {  };
> 
>  void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops
> *ops,
> -		    spinlock_t *lock, u64 context, unsigned seqno);
> +		    spinlock_t *lock, u64 context, u64 seqno);
> 
>  void dma_fence_release(struct kref *kref);  void dma_fence_free(struct
> dma_fence *fence); @@ -414,9 +414,9 @@ dma_fence_is_signaled(struct
> dma_fence *fence)
>   * Returns true if f1 is chronologically later than f2. Both fences must be
>   * from the same context, since a seqno is not common across contexts.
>   */
> -static inline bool __dma_fence_is_later(u32 f1, u32 f2)
> +static inline bool __dma_fence_is_later(u64 f1, u64 f2)
>  {
> -	return (int)(f1 - f2) > 0;
> +	return (int)(lower_32_bits(f1) - lower_32_bits(f2)) > 0;
>  }
> 
>  /**
> @@ -547,21 +547,21 @@ u64 dma_fence_context_alloc(unsigned num);
>  	do {								\
>  		struct dma_fence *__ff = (f);				\
>  		if (IS_ENABLED(CONFIG_DMA_FENCE_TRACE))
> 		\
> -			pr_info("f %llu#%u: " fmt,			\
> +			pr_info("f %llu#%llu: " fmt,			\
>  				__ff->context, __ff->seqno, ##args);	\
>  	} while (0)
> 
>  #define DMA_FENCE_WARN(f, fmt, args...) \
>  	do {								\
>  		struct dma_fence *__ff = (f);				\
> -		pr_warn("f %llu#%u: " fmt, __ff->context, __ff->seqno,
> 	\
> +		pr_warn("f %llu#%llu: " fmt, __ff->context, __ff->seqno,\
>  			 ##args);					\
>  	} while (0)
> 
>  #define DMA_FENCE_ERR(f, fmt, args...) \
>  	do {								\
>  		struct dma_fence *__ff = (f);				\
> -		pr_err("f %llu#%u: " fmt, __ff->context, __ff->seqno,	\
> +		pr_err("f %llu#%llu: " fmt, __ff->context, __ff->seqno,
> 	\
>  			##args);					\
>  	} while (0)
> 
> --
> 2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 2/7] dma-buf: add new dma_fence_chain container
  2018-11-15 11:12 ` [PATCH 2/7] dma-buf: add new dma_fence_chain container Christian König
@ 2018-11-16  5:51   ` Zhou, David(ChunMing)
  0 siblings, 0 replies; 28+ messages in thread
From: Zhou, David(ChunMing) @ 2018-11-16  5:51 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: daniel.vetter



> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Thursday, November 15, 2018 7:13 PM
> To: dri-devel@lists.freedesktop.org
> Cc: chris@chris-wilson.co.uk; daniel.vetter@ffwll.ch; eric@anholt.net; Zhou,
> David(ChunMing) <David1.Zhou@amd.com>
> Subject: [PATCH 2/7] dma-buf: add new dma_fence_chain container
> 
> Lockless container implementation similar to a dma_fence_array, but with
> only two elements per node and automatic garbage collection.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/Makefile          |   3 +-
>  drivers/dma-buf/dma-fence-chain.c | 186
> ++++++++++++++++++++++++++++++++++++++
>  include/linux/dma-fence-chain.h   |  69 ++++++++++++++
>  3 files changed, 257 insertions(+), 1 deletion(-)  create mode 100644
> drivers/dma-buf/dma-fence-chain.c  create mode 100644 include/linux/dma-
> fence-chain.h
> 
> diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index
> 0913a6ccab5a..1f006e083eb9 100644
> --- a/drivers/dma-buf/Makefile
> +++ b/drivers/dma-buf/Makefile
> @@ -1,4 +1,5 @@
> -obj-y := dma-buf.o dma-fence.o dma-fence-array.o reservation.o seqno-
> fence.o
> +obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
> +	 reservation.o seqno-fence.o
>  obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
>  obj-$(CONFIG_SW_SYNC)		+= sw_sync.o sync_debug.o
>  obj-$(CONFIG_UDMABUF)		+= udmabuf.o
> diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-
> fence-chain.c
> new file mode 100644
> index 000000000000..ac830b886589
> --- /dev/null
> +++ b/drivers/dma-buf/dma-fence-chain.c
> @@ -0,0 +1,186 @@
> +/*
> + * fence-chain: chain fences together in a timeline
> + *
> + * Copyright (C) 2018 Advanced Micro Devices, Inc.
> + * Authors:
> + *	Christian König <christian.koenig@amd.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> +it
> + * under the terms of the GNU General Public License version 2 as
> +published by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> +WITHOUT
> + * ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY
> +or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +License for
> + * more details.
> + */
> +
> +#include <linux/dma-fence-chain.h>
> +
> +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence);
> +
> +/**
> + * dma_fence_chain_get_prev - use RCU to get a reference to the
> +previous fence
> + * @chain: chain node to get the previous node from
> + *
> + * Use dma_fence_get_rcu_safe to get a reference to the previous fence
> +of the
> + * chain node.
> + */
> +static struct dma_fence *dma_fence_chain_get_prev(struct
> +dma_fence_chain *chain) {
> +	struct dma_fence *prev;
> +
> +	rcu_read_lock();
> +	prev = dma_fence_get_rcu_safe(&chain->prev);
> +	rcu_read_unlock();
> +	return prev;
> +}
> +
> +/**
> + * dma_fence_chain_walk - chain walking function
> + * @fence: current chain node
> + *
> + * Walk the chain to the next node. Returns the next fence or NULL if
> +we are at
> + * the end of the chain. Garbage collects chain nodes which are already
> + * signaled.
> + */
> +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence) {
> +	struct dma_fence_chain *chain, *prev_chain;
> +	struct dma_fence *prev, *prev_prev, *tmp;
> +
> +	chain = to_dma_fence_chain(fence);
> +	if (!chain) {
> +		dma_fence_put(fence);
> +		return NULL;
> +	}
> +
> +	while ((prev = dma_fence_chain_get_prev(chain))) {
> +
> +		prev_chain = to_dma_fence_chain(prev);
> +		if (!prev_chain || !dma_fence_is_signaled(prev_chain-
> >fence))
> +			break;
> +
> +		prev_prev = dma_fence_chain_get_prev(prev_chain);
> +		tmp = cmpxchg(&chain->prev, prev, prev_prev);
> +		if (tmp == prev)
> +			dma_fence_put(tmp);
> +		else
> +			dma_fence_put(prev_prev);
> +		dma_fence_put(prev);
> +	}
> +
> +	dma_fence_put(fence);
> +	return prev;
> +}
> +EXPORT_SYMBOL(dma_fence_chain_walk);
> +
> +static const char *dma_fence_chain_get_driver_name(struct dma_fence
> +*fence) {
> +        return "dma_fence_chain";
> +}
> +
> +static const char *dma_fence_chain_get_timeline_name(struct dma_fence
> +*fence) {
> +        return "unbound";
> +}
> +
> +static void dma_fence_chain_irq_work(struct irq_work *work) {
> +	struct dma_fence_chain *chain;
> +
> +	chain = container_of(work, typeof(*chain), work);
> +
> +	/* Try to rearm the callback */
> +	if (!dma_fence_chain_enable_signaling(&chain->base))
> +		/* Ok, we are done. No more unsignaled fences left */
> +		dma_fence_signal(&chain->base);
> +}
> +
> +static void dma_fence_chain_cb(struct dma_fence *f, struct dma_fence_cb
> +*cb) {
> +	struct dma_fence_chain *chain;
> +
> +	chain = container_of(cb, typeof(*chain), cb);
> +	irq_work_queue(&chain->work);
> +	dma_fence_put(f);
> +}
> +
> +static bool dma_fence_chain_enable_signaling(struct dma_fence *fence) {
> +	struct dma_fence_chain *head = to_dma_fence_chain(fence);
> +
> +	dma_fence_chain_for_each(fence) {
> +		struct dma_fence_chain *chain =
> to_dma_fence_chain(fence);
> +		struct dma_fence *f = chain ? chain->fence : fence;

Here chain seems never be NULL, so this line can be removed, except that, dma_fence_chain is fine with me.

-David
> +
> +		dma_fence_get(f);
> +		if (!dma_fence_add_callback(f, &head->cb,
> dma_fence_chain_cb))
> +			return true;
> +		dma_fence_put(f);
> +	}
> +	return false;
> +}
> +
> +static bool dma_fence_chain_signaled(struct dma_fence *fence) {
> +	dma_fence_chain_for_each(fence) {
> +		struct dma_fence_chain *chain =
> to_dma_fence_chain(fence);
> +		struct dma_fence *f = chain ? chain->fence : fence;
> +
> +		if (!dma_fence_is_signaled(f))
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
> +static void dma_fence_chain_release(struct dma_fence *fence) {
> +	struct dma_fence_chain *chain = to_dma_fence_chain(fence);
> +
> +	dma_fence_put(chain->prev);
> +	dma_fence_put(chain->fence);
> +	dma_fence_free(fence);
> +}
> +
> +const struct dma_fence_ops dma_fence_chain_ops = {
> +	.get_driver_name = dma_fence_chain_get_driver_name,
> +	.get_timeline_name = dma_fence_chain_get_timeline_name,
> +	.enable_signaling = dma_fence_chain_enable_signaling,
> +	.signaled = dma_fence_chain_signaled,
> +	.release = dma_fence_chain_release,
> +};
> +EXPORT_SYMBOL(dma_fence_chain_ops);
> +
> +/**
> + * dma_fence_chain_init - initialize a fence chain
> + * @chain: the chain node to initialize
> + * @prev: the previous fence
> + * @fence: the current fence
> + *
> + * Initialize a new chain node and either start a new chain or add the
> +node to
> + * the existing chain of the previous fence.
> + */
> +void dma_fence_chain_init(struct dma_fence_chain *chain,
> +			  struct dma_fence *prev,
> +			  struct dma_fence *fence,
> +			  uint64_t seqno)
> +{
> +	uint64_t context;
> +
> +	/* Try to reuse the context of the previous chain node. */
> +	if (to_dma_fence_chain(prev) &&
> +	    __dma_fence_is_later(seqno, prev->seqno))
> +		context = prev->context;
> +	else
> +		context = dma_fence_context_alloc(1);
> +
> +	spin_lock_init(&chain->lock);
> +	dma_fence_init(&chain->base, &dma_fence_chain_ops,
> +		       &chain->lock, context, seqno);
> +	chain->prev = prev;
> +	chain->fence = fence;
> +	init_irq_work(&chain->work, dma_fence_chain_irq_work); }
> +EXPORT_SYMBOL(dma_fence_chain_init);
> diff --git a/include/linux/dma-fence-chain.h b/include/linux/dma-fence-
> chain.h new file mode 100644 index 000000000000..0e295698fcf8
> --- /dev/null
> +++ b/include/linux/dma-fence-chain.h
> @@ -0,0 +1,69 @@
> +/*
> + * fence-chain: chain fences together in a timeline
> + *
> + * Copyright (C) 2018 Advanced Micro Devices, Inc.
> + * Authors:
> + *	Christian König <christian.koenig@amd.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> +it
> + * under the terms of the GNU General Public License version 2 as
> +published by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> +WITHOUT
> + * ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY
> +or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> +License for
> + * more details.
> + */
> +
> +#ifndef __LINUX_DMA_FENCE_CHAIN_H
> +#define __LINUX_DMA_FENCE_CHAIN_H
> +
> +#include <linux/dma-fence.h>
> +#include <linux/irq_work.h>
> +
> +/**
> + * struct dma_fence_chain - fence to represent an node of a fence chain
> + * @base: fence base class
> + * @lock: spinlock for fence handling
> + * @prev: previous fence of the chain
> + * @fence: encapsulated fence
> + * @cb: callback structure for signaling
> + * @work: irq work item for signaling
> + */
> +struct dma_fence_chain {
> +	struct dma_fence base;
> +	spinlock_t lock;
> +	struct dma_fence *prev;
> +	struct dma_fence *fence;
> +	struct dma_fence_cb cb;
> +	struct irq_work work;
> +};
> +
> +extern const struct dma_fence_ops dma_fence_chain_ops;
> +
> +/**
> + * to_dma_fence_chain - cast a fence to a dma_fence_chain
> + * @fence: fence to cast to a dma_fence_array
> + *
> + * Returns NULL if the fence is not a dma_fence_chain,
> + * or the dma_fence_chain otherwise.
> + */
> +static inline struct dma_fence_chain *
> +to_dma_fence_chain(struct dma_fence *fence) {
> +	if (fence->ops != &dma_fence_chain_ops)
> +		return NULL;
> +
> +	return container_of(fence, struct dma_fence_chain, base); }
> +
> +#define dma_fence_chain_for_each(fence)	\
> +	for
> (dma_fence_get(fence);fence;fence=dma_fence_chain_walk(fence))
> +
> +struct dma_fence *dma_fence_chain_walk(struct dma_fence *fence); void
> +dma_fence_chain_init(struct dma_fence_chain *chain,
> +			  struct dma_fence *prev,
> +			  struct dma_fence *fence,
> +			  uint64_t seqno);
> +
> +#endif /* __LINUX_DMA_FENCE_CHAIN_H */
> --
> 2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 4/7] drm/syncobj: use only a single stub fence
  2018-11-15 11:12 ` [PATCH 4/7] drm/syncobj: use only a single stub fence Christian König
@ 2018-11-16  5:54   ` Zhou, David(ChunMing)
  0 siblings, 0 replies; 28+ messages in thread
From: Zhou, David(ChunMing) @ 2018-11-16  5:54 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: daniel.vetter



> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Thursday, November 15, 2018 7:13 PM
> To: dri-devel@lists.freedesktop.org
> Cc: chris@chris-wilson.co.uk; daniel.vetter@ffwll.ch; eric@anholt.net; Zhou,
> David(ChunMing) <David1.Zhou@amd.com>
> Subject: [PATCH 4/7] drm/syncobj: use only a single stub fence
> 
> Extract of useful code from the timeline work. Let's use just a single stub
> fence instance instead of allocating a new one all the time.
> 
> Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
> Signed-off-by: Christian König <christian.koenig@amd.com>

It is a good conclusion during previous review, there already is my Sined-off, I cannot give RB on that, need other people take an action.

-David
> ---
>  drivers/gpu/drm/drm_syncobj.c | 67 ++++++++++++++++++++++------------
> ---------
>  1 file changed, 35 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_syncobj.c
> b/drivers/gpu/drm/drm_syncobj.c index f190414511ae..4c45acb326b9
> 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -56,10 +56,8 @@
>  #include "drm_internal.h"
>  #include <drm/drm_syncobj.h>
> 
> -struct drm_syncobj_stub_fence {
> -	struct dma_fence base;
> -	spinlock_t lock;
> -};
> +static DEFINE_SPINLOCK(stub_fence_lock); static struct dma_fence
> +stub_fence;
> 
>  static const char *drm_syncobj_stub_fence_get_name(struct dma_fence
> *fence)  { @@ -71,6 +69,25 @@ static const struct dma_fence_ops
> drm_syncobj_stub_fence_ops = {
>  	.get_timeline_name = drm_syncobj_stub_fence_get_name,  };
> 
> +/**
> + * drm_syncobj_get_stub_fence - return a signaled fence
> + *
> + * Return a stub fence which is already signaled.
> + */
> +static struct dma_fence *drm_syncobj_get_stub_fence(void) {
> +	spin_lock(&stub_fence_lock);
> +	if (!stub_fence.ops) {
> +		dma_fence_init(&stub_fence,
> +			       &drm_syncobj_stub_fence_ops,
> +			       &stub_fence_lock,
> +			       0, 0);
> +		dma_fence_signal_locked(&stub_fence);
> +	}
> +	spin_unlock(&stub_fence_lock);
> +
> +	return dma_fence_get(&stub_fence);
> +}
> 
>  /**
>   * drm_syncobj_find - lookup and reference a sync object.
> @@ -190,23 +207,18 @@ void drm_syncobj_replace_fence(struct
> drm_syncobj *syncobj,  }  EXPORT_SYMBOL(drm_syncobj_replace_fence);
> 
> -static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
> +/**
> + * drm_syncobj_assign_null_handle - assign a stub fence to the sync
> +object
> + * @syncobj: sync object to assign the fence on
> + *
> + * Assign a already signaled stub fence to the sync object.
> + */
> +static void drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
>  {
> -	struct drm_syncobj_stub_fence *fence;
> -	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
> -	if (fence == NULL)
> -		return -ENOMEM;
> +	struct dma_fence *fence = drm_syncobj_get_stub_fence();
> 
> -	spin_lock_init(&fence->lock);
> -	dma_fence_init(&fence->base, &drm_syncobj_stub_fence_ops,
> -		       &fence->lock, 0, 0);
> -	dma_fence_signal(&fence->base);
> -
> -	drm_syncobj_replace_fence(syncobj, &fence->base);
> -
> -	dma_fence_put(&fence->base);
> -
> -	return 0;
> +	drm_syncobj_replace_fence(syncobj, fence);
> +	dma_fence_put(fence);
>  }
> 
>  /**
> @@ -273,7 +285,6 @@ EXPORT_SYMBOL(drm_syncobj_free);  int
> drm_syncobj_create(struct drm_syncobj **out_syncobj, uint32_t flags,
>  		       struct dma_fence *fence)
>  {
> -	int ret;
>  	struct drm_syncobj *syncobj;
> 
>  	syncobj = kzalloc(sizeof(struct drm_syncobj), GFP_KERNEL); @@ -
> 284,13 +295,8 @@ int drm_syncobj_create(struct drm_syncobj
> **out_syncobj, uint32_t flags,
>  	INIT_LIST_HEAD(&syncobj->cb_list);
>  	spin_lock_init(&syncobj->lock);
> 
> -	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
> -		ret = drm_syncobj_assign_null_handle(syncobj);
> -		if (ret < 0) {
> -			drm_syncobj_put(syncobj);
> -			return ret;
> -		}
> -	}
> +	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED)
> +		drm_syncobj_assign_null_handle(syncobj);
> 
>  	if (fence)
>  		drm_syncobj_replace_fence(syncobj, fence); @@ -984,11
> +990,8 @@ drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
>  	if (ret < 0)
>  		return ret;
> 
> -	for (i = 0; i < args->count_handles; i++) {
> -		ret = drm_syncobj_assign_null_handle(syncobjs[i]);
> -		if (ret < 0)
> -			break;
> -	}
> +	for (i = 0; i < args->count_handles; i++)
> +		drm_syncobj_assign_null_handle(syncobjs[i]);
> 
>  	drm_syncobj_array_free(syncobjs, args->count_handles);
> 
> --
> 2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 5/7] drm/syncobj: move drm_syncobj_cb into drm_syncobj.c
  2018-11-15 11:12 ` [PATCH 5/7] drm/syncobj: move drm_syncobj_cb into drm_syncobj.c Christian König
@ 2018-11-16  5:55   ` Zhou, David(ChunMing)
  0 siblings, 0 replies; 28+ messages in thread
From: Zhou, David(ChunMing) @ 2018-11-16  5:55 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: daniel.vetter

Reviewed-by: Chunming Zhou <david1.zhou@amd.com>

> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Thursday, November 15, 2018 7:13 PM
> To: dri-devel@lists.freedesktop.org
> Cc: chris@chris-wilson.co.uk; daniel.vetter@ffwll.ch; eric@anholt.net; Zhou,
> David(ChunMing) <David1.Zhou@amd.com>
> Subject: [PATCH 5/7] drm/syncobj: move drm_syncobj_cb into
> drm_syncobj.c
> 
> Not used outside the file.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/drm_syncobj.c | 21 +++++++++++++++++++++
>  include/drm/drm_syncobj.h     | 21 ---------------------
>  2 files changed, 21 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_syncobj.c
> b/drivers/gpu/drm/drm_syncobj.c index 4c45acb326b9..4a2e6ef16979
> 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -56,6 +56,27 @@
>  #include "drm_internal.h"
>  #include <drm/drm_syncobj.h>
> 
> +struct drm_syncobj_cb;
> +
> +typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
> +				   struct drm_syncobj_cb *cb);
> +
> +/**
> + * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
> + * @node: used by drm_syncob_add_callback to append this struct to
> + *	  &drm_syncobj.cb_list
> + * @func: drm_syncobj_func_t to call
> + *
> + * This struct will be initialized by drm_syncobj_add_callback,
> +additional
> + * data can be passed along by embedding drm_syncobj_cb in another
> struct.
> + * The callback will get called the next time drm_syncobj_replace_fence
> +is
> + * called.
> + */
> +struct drm_syncobj_cb {
> +	struct list_head node;
> +	drm_syncobj_func_t func;
> +};
> +
>  static DEFINE_SPINLOCK(stub_fence_lock);  static struct dma_fence
> stub_fence;
> 
> diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index
> ab9055f943c7..c79f5ada7cdb 100644
> --- a/include/drm/drm_syncobj.h
> +++ b/include/drm/drm_syncobj.h
> @@ -28,8 +28,6 @@
> 
>  #include "linux/dma-fence.h"
> 
> -struct drm_syncobj_cb;
> -
>  /**
>   * struct drm_syncobj - sync object.
>   *
> @@ -62,25 +60,6 @@ struct drm_syncobj {
>  	struct file *file;
>  };
> 
> -typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
> -				   struct drm_syncobj_cb *cb);
> -
> -/**
> - * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
> - * @node: used by drm_syncob_add_callback to append this struct to
> - *	  &drm_syncobj.cb_list
> - * @func: drm_syncobj_func_t to call
> - *
> - * This struct will be initialized by drm_syncobj_add_callback, additional
> - * data can be passed along by embedding drm_syncobj_cb in another
> struct.
> - * The callback will get called the next time drm_syncobj_replace_fence is
> - * called.
> - */
> -struct drm_syncobj_cb {
> -	struct list_head node;
> -	drm_syncobj_func_t func;
> -};
> -
>  void drm_syncobj_free(struct kref *kref);
> 
>  /**
> --
> 2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 6/7] drm/syncobj: add new drm_syncobj_add_point interface
  2018-11-15 11:12 ` [PATCH 6/7] drm/syncobj: add new drm_syncobj_add_point interface Christian König
@ 2018-11-16  6:20   ` Zhou, David(ChunMing)
  0 siblings, 0 replies; 28+ messages in thread
From: Zhou, David(ChunMing) @ 2018-11-16  6:20 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: daniel.vetter

Don't know how to work, not completely yet.

> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Thursday, November 15, 2018 7:13 PM
> To: dri-devel@lists.freedesktop.org
> Cc: chris@chris-wilson.co.uk; daniel.vetter@ffwll.ch; eric@anholt.net; Zhou,
> David(ChunMing) <David1.Zhou@amd.com>
> Subject: [PATCH 6/7] drm/syncobj: add new drm_syncobj_add_point
> interface
> 
> Use the dma_fence_chain object to create a timeline of fence objects
> instead of just replacing the existing fence.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/drm_syncobj.c | 40
> ++++++++++++++++++++++++++++++++++++++++
>  include/drm/drm_syncobj.h     |  5 +++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_syncobj.c
> b/drivers/gpu/drm/drm_syncobj.c index 4a2e6ef16979..589d884ccd58
> 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -193,6 +193,46 @@ void drm_syncobj_remove_callback(struct
> drm_syncobj *syncobj,
>  	spin_unlock(&syncobj->lock);
>  }
> 
> +/**
> + * drm_syncobj_add_point - add new timeline point to the syncobj
> + * @syncobj: sync object to add timeline point do
> + * @chain: chain node to use to add the point
> + * @fence: fence to encapsulate in the chain node
> + * @point: sequence number to use for the point
> + *
> + * Add the chain node as new timeline point to the syncobj.
> + */
> +void drm_syncobj_add_point(struct drm_syncobj *syncobj,
> +			   struct dma_fence_chain *chain,
> +			   struct dma_fence *fence,
> +			   uint64_t point)
> +{
> +	struct drm_syncobj_cb *cur, *tmp;
> +	struct dma_fence *prev;
> +
> +	dma_fence_get(fence);
> +	dma_fence_get(fence);
> +
> +	spin_lock(&syncobj->lock);
> +
> +	prev = rcu_dereference_protected(syncobj->fence,
> +					 lockdep_is_held(&syncobj->lock));
> +	dma_fence_chain_init(chain, prev, fence, point);
> +	rcu_assign_pointer(syncobj->fence, &chain->base);
> +
> +	list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
> +		list_del_init(&cur->node);
> +		cur->func(syncobj, cur);
> +	}
> +	spin_unlock(&syncobj->lock);
> +
> +	/* Walk the chain once to trigger garbage collection */
> +	prev = fence;
> +	dma_fence_chain_for_each(prev);
> +
> +	dma_fence_put(fence);
> +}
> +
>  /**
>   * drm_syncobj_replace_fence - replace fence in a sync object.
>   * @syncobj: Sync object to replace fence in diff --git
> a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h index
> c79f5ada7cdb..35a917241e30 100644
> --- a/include/drm/drm_syncobj.h
> +++ b/include/drm/drm_syncobj.h
> @@ -27,6 +27,7 @@
>  #define __DRM_SYNCOBJ_H__
> 
>  #include "linux/dma-fence.h"
> +#include "linux/dma-fence-chain.h"
> 
>  /**
>   * struct drm_syncobj - sync object.
> @@ -110,6 +111,10 @@ drm_syncobj_fence_get(struct drm_syncobj
> *syncobj)
> 
>  struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
>  				     u32 handle);
> +void drm_syncobj_add_point(struct drm_syncobj *syncobj,
> +			   struct dma_fence_chain *chain,
> +			   struct dma_fence *fence,
> +			   uint64_t point);
>  void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
>  			       struct dma_fence *fence);
>  int drm_syncobj_find_fence(struct drm_file *file_private,
> --
> 2.14.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-15 11:12 ` [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence Christian König
@ 2018-11-22  6:52   ` zhoucm1
  2018-11-22 11:30     ` Christian König
  0 siblings, 1 reply; 28+ messages in thread
From: zhoucm1 @ 2018-11-22  6:52 UTC (permalink / raw)
  To: Christian König, dri-devel; +Cc: daniel.vetter



On 2018年11月15日 19:12, Christian König wrote:
> Implement finding the right timeline point in drm_syncobj_find_fence.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
> index 589d884ccd58..d42c51520da4 100644
> --- a/drivers/gpu/drm/drm_syncobj.c
> +++ b/drivers/gpu/drm/drm_syncobj.c
> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file *file_private,
>   		return -ENOENT;
>   
>   	*fence = drm_syncobj_fence_get(syncobj);
> -	if (!*fence) {
> +	if (!*fence)
>   		ret = -EINVAL;
> +
> +	if (!ret && point) {
> +		dma_fence_chain_for_each(*fence) {
> +			if (!to_dma_fence_chain(*fence) ||
> +			    (*fence)->seqno <= point)
> +				break;
This condition isn't enough to find proper point.
For two examples:
a. No garbage collection happens, the points in chain are 
1----3----6----9----12---18---20, if user wants to get point17, then we 
should return node 18.
b. garbage collection happens on point6, chain would be updated to 
1---3---9---12---18---20, if user wants to get point5, then we should 
return node 3, but if user wants to get point 7, then we should return 
node 9.

I still have no idea how to satisfy all these requirements with your 
current chain-fence. All these logic just are same we encountered 
before, we're walking them again. After solving these problems, I guess 
all design is similar as before.

In fact, I don't know what problem previous design has, maybe there are 
some bugs, can't we fix these bugs by time going? Who can make sure his 
implementation never have bugs?


-David
> +		}
>   	}
> +
>   	drm_syncobj_put(syncobj);
>   	return ret;
>   }

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-22  6:52   ` zhoucm1
@ 2018-11-22 11:30     ` Christian König
  2018-11-23  2:36       ` zhoucm1
  0 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-22 11:30 UTC (permalink / raw)
  To: zhoucm1, dri-devel; +Cc: daniel.vetter

Am 22.11.18 um 07:52 schrieb zhoucm1:
>
>
> On 2018年11月15日 19:12, Christian König wrote:
>> Implement finding the right timeline point in drm_syncobj_find_fence.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_syncobj.c 
>> b/drivers/gpu/drm/drm_syncobj.c
>> index 589d884ccd58..d42c51520da4 100644
>> --- a/drivers/gpu/drm/drm_syncobj.c
>> +++ b/drivers/gpu/drm/drm_syncobj.c
>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file 
>> *file_private,
>>           return -ENOENT;
>>         *fence = drm_syncobj_fence_get(syncobj);
>> -    if (!*fence) {
>> +    if (!*fence)
>>           ret = -EINVAL;
>> +
>> +    if (!ret && point) {
>> +        dma_fence_chain_for_each(*fence) {
>> +            if (!to_dma_fence_chain(*fence) ||
>> +                (*fence)->seqno <= point)
>> +                break;
> This condition isn't enough to find proper point.
> For two examples:
> a. No garbage collection happens, the points in chain are 
> 1----3----6----9----12---18---20, if user wants to get point17, then 
> we should return node 18.

And that is exactly what's wrong in the original logic. In this case we 
need to return 12, not 18 because point 17 could have already been 
garbage collected.

> b. garbage collection happens on point6, chain would be updated to 
> 1---3---9---12---18---20, if user wants to get point5, then we should 
> return node 3, but if user wants to get point 7, then we should return 
> node 9.

Why? That doesn't seem to make any sense to me.

> I still have no idea how to satisfy all these requirements with your 
> current chain-fence. All these logic just are same we encountered 
> before, we're walking them again. After solving these problems, I 
> guess all design is similar as before.
>
> In fact, I don't know what problem previous design has, maybe there 
> are some bugs, can't we fix these bugs by time going? Who can make 
> sure his implementation never have bugs?

Well there where numerous problems with the original design. For example 
we need to reject the requirement that timeline fences are in order 
because that doesn't make sense in the kernel.

When userspace does something like submitting fences in the order 1, 5, 
3 then it is broken and can keep the pieces. In other words the kernel 
should not care about that, but rather make sure that it never looses 
any synchronization no matter what.

Regards,
Christian.

>
>
> -David
>> +        }
>>       }
>> +
>>       drm_syncobj_put(syncobj);
>>       return ret;
>>   }
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-22 11:30     ` Christian König
@ 2018-11-23  2:36       ` zhoucm1
  2018-11-23 10:10         ` Koenig, Christian
  0 siblings, 1 reply; 28+ messages in thread
From: zhoucm1 @ 2018-11-23  2:36 UTC (permalink / raw)
  To: christian.koenig, dri-devel; +Cc: daniel.vetter



On 2018年11月22日 19:30, Christian König wrote:
> Am 22.11.18 um 07:52 schrieb zhoucm1:
>>
>>
>> On 2018年11月15日 19:12, Christian König wrote:
>>> Implement finding the right timeline point in drm_syncobj_find_fence.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/drm_syncobj.c 
>>> b/drivers/gpu/drm/drm_syncobj.c
>>> index 589d884ccd58..d42c51520da4 100644
>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file 
>>> *file_private,
>>>           return -ENOENT;
>>>         *fence = drm_syncobj_fence_get(syncobj);
>>> -    if (!*fence) {
>>> +    if (!*fence)
>>>           ret = -EINVAL;
>>> +
>>> +    if (!ret && point) {
>>> +        dma_fence_chain_for_each(*fence) {
>>> +            if (!to_dma_fence_chain(*fence) ||
>>> +                (*fence)->seqno <= point)
>>> +                break;
>> This condition isn't enough to find proper point.
>> For two examples:
>> a. No garbage collection happens, the points in chain are 
>> 1----3----6----9----12---18---20, if user wants to get point17, then 
>> we should return node 18.
>
> And that is exactly what's wrong in the original logic. In this case 
> we need to return 12, not 18 because point 17 could have already been 
> garbage collected.
I don't think so, the 'a' case I already assume there isn't garbage 
collection. If user wants to get point17, then we should return node 18.
timeline means point[N]  must be signaled later than point[N-1].
Point[12] just can make sure point[1] ~point[12] are signaled.
Point[18] signal can make sure point[17] is signaled.
So this case we need to return 18, not 12, which is key timeline concept.

-David
>
>> b. garbage collection happens on point6, chain would be updated to 
>> 1---3---9---12---18---20, if user wants to get point5, then we should 
>> return node 3, but if user wants to get point 7, then we should 
>> return node 9.
>
> Why? That doesn't seem to make any sense to me.
>
>> I still have no idea how to satisfy all these requirements with your 
>> current chain-fence. All these logic just are same we encountered 
>> before, we're walking them again. After solving these problems, I 
>> guess all design is similar as before.
>>
>> In fact, I don't know what problem previous design has, maybe there 
>> are some bugs, can't we fix these bugs by time going? Who can make 
>> sure his implementation never have bugs?
>
> Well there where numerous problems with the original design. For 
> example we need to reject the requirement that timeline fences are in 
> order because that doesn't make sense in the kernel.
>
> When userspace does something like submitting fences in the order 1, 
> 5, 3 then it is broken and can keep the pieces. In other words the 
> kernel should not care about that, but rather make sure that it never 
> looses any synchronization no matter what.
>
> Regards,
> Christian.
>
>>
>>
>> -David
>>> +        }
>>>       }
>>> +
>>>       drm_syncobj_put(syncobj);
>>>       return ret;
>>>   }
>>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23  2:36       ` zhoucm1
@ 2018-11-23 10:10         ` Koenig, Christian
  2018-11-23 10:56           ` zhoucm1
  0 siblings, 1 reply; 28+ messages in thread
From: Koenig, Christian @ 2018-11-23 10:10 UTC (permalink / raw)
  To: Zhou, David(ChunMing), dri-devel; +Cc: daniel.vetter

Am 23.11.18 um 03:36 schrieb zhoucm1:
>
>
> On 2018年11月22日 19:30, Christian König wrote:
>> Am 22.11.18 um 07:52 schrieb zhoucm1:
>>>
>>>
>>> On 2018年11月15日 19:12, Christian König wrote:
>>>> Implement finding the right timeline point in drm_syncobj_find_fence.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c 
>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>> index 589d884ccd58..d42c51520da4 100644
>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file 
>>>> *file_private,
>>>>           return -ENOENT;
>>>>         *fence = drm_syncobj_fence_get(syncobj);
>>>> -    if (!*fence) {
>>>> +    if (!*fence)
>>>>           ret = -EINVAL;
>>>> +
>>>> +    if (!ret && point) {
>>>> +        dma_fence_chain_for_each(*fence) {
>>>> +            if (!to_dma_fence_chain(*fence) ||
>>>> +                (*fence)->seqno <= point)
>>>> +                break;
>>> This condition isn't enough to find proper point.
>>> For two examples:
>>> a. No garbage collection happens, the points in chain are 
>>> 1----3----6----9----12---18---20, if user wants to get point17, then 
>>> we should return node 18.
>>
>> And that is exactly what's wrong in the original logic. In this case 
>> we need to return 12, not 18 because point 17 could have already been 
>> garbage collected.
> I don't think so, the 'a' case I already assume there isn't garbage 
> collection. If user wants to get point17, then we should return node 18.
> timeline means point[N]  must be signaled later than point[N-1].
> Point[12] just can make sure point[1] ~point[12] are signaled.
> Point[18] signal can make sure point[17] is signaled.
> So this case we need to return 18, not 12, which is key timeline concept.

No, exactly that's incorrect. When we ask for 17 and can't find it then 
this means it either never existed or that it is signaled already.

Returning a lower number in this case or even a stub fence is perfectly 
fine since we only need to wait for that one in this case.

If we return 18 in this case then we add incorrect synchronization when 
there shouldn't be any.

Christian.

>
> -David
>>
>>> b. garbage collection happens on point6, chain would be updated to 
>>> 1---3---9---12---18---20, if user wants to get point5, then we 
>>> should return node 3, but if user wants to get point 7, then we 
>>> should return node 9.
>>
>> Why? That doesn't seem to make any sense to me.
>>
>>> I still have no idea how to satisfy all these requirements with your 
>>> current chain-fence. All these logic just are same we encountered 
>>> before, we're walking them again. After solving these problems, I 
>>> guess all design is similar as before.
>>>
>>> In fact, I don't know what problem previous design has, maybe there 
>>> are some bugs, can't we fix these bugs by time going? Who can make 
>>> sure his implementation never have bugs?
>>
>> Well there where numerous problems with the original design. For 
>> example we need to reject the requirement that timeline fences are in 
>> order because that doesn't make sense in the kernel.
>>
>> When userspace does something like submitting fences in the order 1, 
>> 5, 3 then it is broken and can keep the pieces. In other words the 
>> kernel should not care about that, but rather make sure that it never 
>> looses any synchronization no matter what.
>>
>> Regards,
>> Christian.
>>
>>>
>>>
>>> -David
>>>> +        }
>>>>       }
>>>> +
>>>>       drm_syncobj_put(syncobj);
>>>>       return ret;
>>>>   }
>>>
>>
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 10:10         ` Koenig, Christian
@ 2018-11-23 10:56           ` zhoucm1
  2018-11-23 11:03             ` Christian König
  0 siblings, 1 reply; 28+ messages in thread
From: zhoucm1 @ 2018-11-23 10:56 UTC (permalink / raw)
  To: Koenig, Christian, Zhou, David(ChunMing), dri-devel; +Cc: daniel.vetter



On 2018年11月23日 18:10, Koenig, Christian wrote:
> Am 23.11.18 um 03:36 schrieb zhoucm1:
>>
>> On 2018年11月22日 19:30, Christian König wrote:
>>> Am 22.11.18 um 07:52 schrieb zhoucm1:
>>>>
>>>> On 2018年11月15日 19:12, Christian König wrote:
>>>>> Implement finding the right timeline point in drm_syncobj_find_fence.
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>>>>    1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>>> index 589d884ccd58..d42c51520da4 100644
>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
>>>>> *file_private,
>>>>>            return -ENOENT;
>>>>>          *fence = drm_syncobj_fence_get(syncobj);
>>>>> -    if (!*fence) {
>>>>> +    if (!*fence)
>>>>>            ret = -EINVAL;
>>>>> +
>>>>> +    if (!ret && point) {
>>>>> +        dma_fence_chain_for_each(*fence) {
>>>>> +            if (!to_dma_fence_chain(*fence) ||
>>>>> +                (*fence)->seqno <= point)
>>>>> +                break;
>>>> This condition isn't enough to find proper point.
>>>> For two examples:
>>>> a. No garbage collection happens, the points in chain are
>>>> 1----3----6----9----12---18---20, if user wants to get point17, then
>>>> we should return node 18.
>>> And that is exactly what's wrong in the original logic. In this case
>>> we need to return 12, not 18 because point 17 could have already been
>>> garbage collected.
>> I don't think so, the 'a' case I already assume there isn't garbage
>> collection. If user wants to get point17, then we should return node 18.
>> timeline means point[N]  must be signaled later than point[N-1].
>> Point[12] just can make sure point[1] ~point[12] are signaled.
>> Point[18] signal can make sure point[17] is signaled.
>> So this case we need to return 18, not 12, which is key timeline concept.
> No, exactly that's incorrect. When we ask for 17 and can't find it then
> this means it either never existed or that it is signaled already.
>
> Returning a lower number in this case or even a stub fence is perfectly
> fine since we only need to wait for that one in this case.
>
> If we return 18 in this case then we add incorrect synchronization when
> there shouldn't be any.
No, That will make timeline not work at all and break timeline semantics 
totally.

If there aren't point18 and point20, the chain is 
1----3----6----9----12, if user wants to get point 17, you also return 
12? if yes, which absolutely is incorrect. The answer should be NO, 
right? point17 should be waited on there until a bigger point is coming.

For chain is 1----3----6----9----12---18---20, if user wants to wait on 
any one of points 13,14,15,16,17,18, we must wait for point 18, this is 
timeline semantic.

You can also check sw_sync.c for timeline meaning.

-David
>
> Christian.
>
>> -David
>>>> b. garbage collection happens on point6, chain would be updated to
>>>> 1---3---9---12---18---20, if user wants to get point5, then we
>>>> should return node 3, but if user wants to get point 7, then we
>>>> should return node 9.
>>> Why? That doesn't seem to make any sense to me.
>>>
>>>> I still have no idea how to satisfy all these requirements with your
>>>> current chain-fence. All these logic just are same we encountered
>>>> before, we're walking them again. After solving these problems, I
>>>> guess all design is similar as before.
>>>>
>>>> In fact, I don't know what problem previous design has, maybe there
>>>> are some bugs, can't we fix these bugs by time going? Who can make
>>>> sure his implementation never have bugs?
>>> Well there where numerous problems with the original design. For
>>> example we need to reject the requirement that timeline fences are in
>>> order because that doesn't make sense in the kernel.
>>>
>>> When userspace does something like submitting fences in the order 1,
>>> 5, 3 then it is broken and can keep the pieces. In other words the
>>> kernel should not care about that, but rather make sure that it never
>>> looses any synchronization no matter what.
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> -David
>>>>> +        }
>>>>>        }
>>>>> +
>>>>>        drm_syncobj_put(syncobj);
>>>>>        return ret;
>>>>>    }

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 10:56           ` zhoucm1
@ 2018-11-23 11:03             ` Christian König
  2018-11-23 12:02               ` Koenig, Christian
  0 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-23 11:03 UTC (permalink / raw)
  To: zhoucm1, Koenig, Christian, Zhou, David(ChunMing), dri-devel
  Cc: daniel.vetter

Am 23.11.18 um 11:56 schrieb zhoucm1:
>
>
> On 2018年11月23日 18:10, Koenig, Christian wrote:
>> Am 23.11.18 um 03:36 schrieb zhoucm1:
>>>
>>> On 2018年11月22日 19:30, Christian König wrote:
>>>> Am 22.11.18 um 07:52 schrieb zhoucm1:
>>>>>
>>>>> On 2018年11月15日 19:12, Christian König wrote:
>>>>>> Implement finding the right timeline point in 
>>>>>> drm_syncobj_find_fence.
>>>>>>
>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>> ---
>>>>>>    drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>>>>>    1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>>>> index 589d884ccd58..d42c51520da4 100644
>>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
>>>>>> *file_private,
>>>>>>            return -ENOENT;
>>>>>>          *fence = drm_syncobj_fence_get(syncobj);
>>>>>> -    if (!*fence) {
>>>>>> +    if (!*fence)
>>>>>>            ret = -EINVAL;
>>>>>> +
>>>>>> +    if (!ret && point) {
>>>>>> +        dma_fence_chain_for_each(*fence) {
>>>>>> +            if (!to_dma_fence_chain(*fence) ||
>>>>>> +                (*fence)->seqno <= point)
>>>>>> +                break;
>>>>> This condition isn't enough to find proper point.
>>>>> For two examples:
>>>>> a. No garbage collection happens, the points in chain are
>>>>> 1----3----6----9----12---18---20, if user wants to get point17, then
>>>>> we should return node 18.
>>>> And that is exactly what's wrong in the original logic. In this case
>>>> we need to return 12, not 18 because point 17 could have already been
>>>> garbage collected.
>>> I don't think so, the 'a' case I already assume there isn't garbage
>>> collection. If user wants to get point17, then we should return node 
>>> 18.
>>> timeline means point[N]  must be signaled later than point[N-1].
>>> Point[12] just can make sure point[1] ~point[12] are signaled.
>>> Point[18] signal can make sure point[17] is signaled.
>>> So this case we need to return 18, not 12, which is key timeline 
>>> concept.
>> No, exactly that's incorrect. When we ask for 17 and can't find it then
>> this means it either never existed or that it is signaled already.
>>
>> Returning a lower number in this case or even a stub fence is perfectly
>> fine since we only need to wait for that one in this case.
>>
>> If we return 18 in this case then we add incorrect synchronization when
>> there shouldn't be any.
> No, That will make timeline not work at all and break timeline 
> semantics totally.
>
> If there aren't point18 and point20, the chain is 
> 1----3----6----9----12, if user wants to get point 17, you also return 
> 12? if yes, which absolutely is incorrect. The answer should be NO, 
> right? point17 should be waited on there until a bigger point is coming.

Correct, but this is a different case. In this situation we either 
return an error or wait for point 17 (or something >=17) to show up.

The key difference is if point 17 shows up then we return point 17, but 
if point 18 shows up then we need to return point 12.

> For chain is 1----3----6----9----12---18---20, if user wants to wait 
> on any one of points 13,14,15,16,17,18, we must wait for point 18, 
> this is timeline semantic.

Ah, now I understand. You are still sticking with the assumption of a 
fence number, right?

In other words what you imply here is that we have the same semantic as 
when somebody waits for a memory location to be written by number 17, 
right? In this case the semantics you describe here indeed applies.

But that is certainly not what we want to implement or otherwise we will 
never be able to garbage collect the numbers in between.

So if Vulkan has this requirement then we need to reject that.

Regards,
Christian.

>
> You can also check sw_sync.c for timeline meaning.
>
> -David
>>
>> Christian.
>>
>>> -David
>>>>> b. garbage collection happens on point6, chain would be updated to
>>>>> 1---3---9---12---18---20, if user wants to get point5, then we
>>>>> should return node 3, but if user wants to get point 7, then we
>>>>> should return node 9.
>>>> Why? That doesn't seem to make any sense to me.
>>>>
>>>>> I still have no idea how to satisfy all these requirements with your
>>>>> current chain-fence. All these logic just are same we encountered
>>>>> before, we're walking them again. After solving these problems, I
>>>>> guess all design is similar as before.
>>>>>
>>>>> In fact, I don't know what problem previous design has, maybe there
>>>>> are some bugs, can't we fix these bugs by time going? Who can make
>>>>> sure his implementation never have bugs?
>>>> Well there where numerous problems with the original design. For
>>>> example we need to reject the requirement that timeline fences are in
>>>> order because that doesn't make sense in the kernel.
>>>>
>>>> When userspace does something like submitting fences in the order 1,
>>>> 5, 3 then it is broken and can keep the pieces. In other words the
>>>> kernel should not care about that, but rather make sure that it never
>>>> looses any synchronization no matter what.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> -David
>>>>>> +        }
>>>>>>        }
>>>>>> +
>>>>>>        drm_syncobj_put(syncobj);
>>>>>>        return ret;
>>>>>>    }
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 11:03             ` Christian König
@ 2018-11-23 12:02               ` Koenig, Christian
  2018-11-23 12:26                 ` Daniel Vetter
  2018-11-23 13:15                 ` Chunming Zhou
  0 siblings, 2 replies; 28+ messages in thread
From: Koenig, Christian @ 2018-11-23 12:02 UTC (permalink / raw)
  To: Zhou, David(ChunMing); +Cc: daniel.vetter

Am 23.11.18 um 12:03 schrieb Christian König:
> Am 23.11.18 um 11:56 schrieb zhoucm1:
>>
>>
>> On 2018年11月23日 18:10, Koenig, Christian wrote:
>>> Am 23.11.18 um 03:36 schrieb zhoucm1:
>>>>
>>>> On 2018年11月22日 19:30, Christian König wrote:
>>>>> Am 22.11.18 um 07:52 schrieb zhoucm1:
>>>>>>
>>>>>> On 2018年11月15日 19:12, Christian König wrote:
>>>>>>> Implement finding the right timeline point in 
>>>>>>> drm_syncobj_find_fence.
>>>>>>>
>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>>>>>>    1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>>>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>>>>> index 589d884ccd58..d42c51520da4 100644
>>>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>>>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
>>>>>>> *file_private,
>>>>>>>            return -ENOENT;
>>>>>>>          *fence = drm_syncobj_fence_get(syncobj);
>>>>>>> -    if (!*fence) {
>>>>>>> +    if (!*fence)
>>>>>>>            ret = -EINVAL;
>>>>>>> +
>>>>>>> +    if (!ret && point) {
>>>>>>> +        dma_fence_chain_for_each(*fence) {
>>>>>>> +            if (!to_dma_fence_chain(*fence) ||
>>>>>>> +                (*fence)->seqno <= point)
>>>>>>> +                break;
>>>>>> This condition isn't enough to find proper point.
>>>>>> For two examples:
>>>>>> a. No garbage collection happens, the points in chain are
>>>>>> 1----3----6----9----12---18---20, if user wants to get point17, then
>>>>>> we should return node 18.
>>>>> And that is exactly what's wrong in the original logic. In this case
>>>>> we need to return 12, not 18 because point 17 could have already been
>>>>> garbage collected.
>>>> I don't think so, the 'a' case I already assume there isn't garbage
>>>> collection. If user wants to get point17, then we should return 
>>>> node 18.
>>>> timeline means point[N]  must be signaled later than point[N-1].
>>>> Point[12] just can make sure point[1] ~point[12] are signaled.
>>>> Point[18] signal can make sure point[17] is signaled.
>>>> So this case we need to return 18, not 12, which is key timeline 
>>>> concept.
>>> No, exactly that's incorrect. When we ask for 17 and can't find it then
>>> this means it either never existed or that it is signaled already.
>>>
>>> Returning a lower number in this case or even a stub fence is perfectly
>>> fine since we only need to wait for that one in this case.
>>>
>>> If we return 18 in this case then we add incorrect synchronization when
>>> there shouldn't be any.
>> No, That will make timeline not work at all and break timeline 
>> semantics totally.
>>
>> If there aren't point18 and point20, the chain is 
>> 1----3----6----9----12, if user wants to get point 17, you also 
>> return 12? if yes, which absolutely is incorrect. The answer should 
>> be NO, right? point17 should be waited on there until a bigger point 
>> is coming.
>
> Correct, but this is a different case. In this situation we either 
> return an error or wait for point 17 (or something >=17) to show up.
>
> The key difference is if point 17 shows up then we return point 17, 
> but if point 18 shows up then we need to return point 12.
>
>> For chain is 1----3----6----9----12---18---20, if user wants to wait 
>> on any one of points 13,14,15,16,17,18, we must wait for point 18, 
>> this is timeline semantic.
>
> Ah, now I understand. You are still sticking with the assumption of a 
> fence number, right?
>
> In other words what you imply here is that we have the same semantic 
> as when somebody waits for a memory location to be written by number 
> 17, right? In this case the semantics you describe here indeed applies.
>
> But that is certainly not what we want to implement or otherwise we 
> will never be able to garbage collect the numbers in between.
>
> So if Vulkan has this requirement then we need to reject that.

Backing of and reconsidering this I came to the conclusion that what you 
suggest here is actually the most defensive solution.

In other words it is the solution where it's most likely that nothing 
goes wrong because the worst thing that can happen is that we 
synchronize to much, but never to less.

Going to think about it how we can bring that into alignment with the 
proposed garbage collection.

Thanks,
Christian.

>
> Regards,
> Christian.
>
>>
>> You can also check sw_sync.c for timeline meaning.
>>
>> -David
>>>
>>> Christian.
>>>
>>>> -David
>>>>>> b. garbage collection happens on point6, chain would be updated to
>>>>>> 1---3---9---12---18---20, if user wants to get point5, then we
>>>>>> should return node 3, but if user wants to get point 7, then we
>>>>>> should return node 9.
>>>>> Why? That doesn't seem to make any sense to me.
>>>>>
>>>>>> I still have no idea how to satisfy all these requirements with your
>>>>>> current chain-fence. All these logic just are same we encountered
>>>>>> before, we're walking them again. After solving these problems, I
>>>>>> guess all design is similar as before.
>>>>>>
>>>>>> In fact, I don't know what problem previous design has, maybe there
>>>>>> are some bugs, can't we fix these bugs by time going? Who can make
>>>>>> sure his implementation never have bugs?
>>>>> Well there where numerous problems with the original design. For
>>>>> example we need to reject the requirement that timeline fences are in
>>>>> order because that doesn't make sense in the kernel.
>>>>>
>>>>> When userspace does something like submitting fences in the order 1,
>>>>> 5, 3 then it is broken and can keep the pieces. In other words the
>>>>> kernel should not care about that, but rather make sure that it never
>>>>> looses any synchronization no matter what.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>>
>>>>>> -David
>>>>>>> +        }
>>>>>>>        }
>>>>>>> +
>>>>>>>        drm_syncobj_put(syncobj);
>>>>>>>        return ret;
>>>>>>>    }
>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 12:02               ` Koenig, Christian
@ 2018-11-23 12:26                 ` Daniel Vetter
  2018-11-23 12:40                   ` Christian König
  2018-11-23 13:15                 ` Chunming Zhou
  1 sibling, 1 reply; 28+ messages in thread
From: Daniel Vetter @ 2018-11-23 12:26 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: dri-devel, daniel.vetter

On Fri, Nov 23, 2018 at 12:02:41PM +0000, Koenig, Christian wrote:
> Am 23.11.18 um 12:03 schrieb Christian König:
> > Am 23.11.18 um 11:56 schrieb zhoucm1:
> >>
> >>
> >> On 2018年11月23日 18:10, Koenig, Christian wrote:
> >>> Am 23.11.18 um 03:36 schrieb zhoucm1:
> >>>>
> >>>> On 2018年11月22日 19:30, Christian König wrote:
> >>>>> Am 22.11.18 um 07:52 schrieb zhoucm1:
> >>>>>>
> >>>>>> On 2018年11月15日 19:12, Christian König wrote:
> >>>>>>> Implement finding the right timeline point in 
> >>>>>>> drm_syncobj_find_fence.
> >>>>>>>
> >>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
> >>>>>>> ---
> >>>>>>>    drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
> >>>>>>>    1 file changed, 9 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
> >>>>>>> b/drivers/gpu/drm/drm_syncobj.c
> >>>>>>> index 589d884ccd58..d42c51520da4 100644
> >>>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
> >>>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
> >>>>>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
> >>>>>>> *file_private,
> >>>>>>>            return -ENOENT;
> >>>>>>>          *fence = drm_syncobj_fence_get(syncobj);
> >>>>>>> -    if (!*fence) {
> >>>>>>> +    if (!*fence)
> >>>>>>>            ret = -EINVAL;
> >>>>>>> +
> >>>>>>> +    if (!ret && point) {
> >>>>>>> +        dma_fence_chain_for_each(*fence) {
> >>>>>>> +            if (!to_dma_fence_chain(*fence) ||
> >>>>>>> +                (*fence)->seqno <= point)
> >>>>>>> +                break;
> >>>>>> This condition isn't enough to find proper point.
> >>>>>> For two examples:
> >>>>>> a. No garbage collection happens, the points in chain are
> >>>>>> 1----3----6----9----12---18---20, if user wants to get point17, then
> >>>>>> we should return node 18.
> >>>>> And that is exactly what's wrong in the original logic. In this case
> >>>>> we need to return 12, not 18 because point 17 could have already been
> >>>>> garbage collected.
> >>>> I don't think so, the 'a' case I already assume there isn't garbage
> >>>> collection. If user wants to get point17, then we should return 
> >>>> node 18.
> >>>> timeline means point[N]  must be signaled later than point[N-1].
> >>>> Point[12] just can make sure point[1] ~point[12] are signaled.
> >>>> Point[18] signal can make sure point[17] is signaled.
> >>>> So this case we need to return 18, not 12, which is key timeline 
> >>>> concept.
> >>> No, exactly that's incorrect. When we ask for 17 and can't find it then
> >>> this means it either never existed or that it is signaled already.
> >>>
> >>> Returning a lower number in this case or even a stub fence is perfectly
> >>> fine since we only need to wait for that one in this case.
> >>>
> >>> If we return 18 in this case then we add incorrect synchronization when
> >>> there shouldn't be any.
> >> No, That will make timeline not work at all and break timeline 
> >> semantics totally.
> >>
> >> If there aren't point18 and point20, the chain is 
> >> 1----3----6----9----12, if user wants to get point 17, you also 
> >> return 12? if yes, which absolutely is incorrect. The answer should 
> >> be NO, right? point17 should be waited on there until a bigger point 
> >> is coming.
> >
> > Correct, but this is a different case. In this situation we either 
> > return an error or wait for point 17 (or something >=17) to show up.
> >
> > The key difference is if point 17 shows up then we return point 17, 
> > but if point 18 shows up then we need to return point 12.
> >
> >> For chain is 1----3----6----9----12---18---20, if user wants to wait 
> >> on any one of points 13,14,15,16,17,18, we must wait for point 18, 
> >> this is timeline semantic.
> >
> > Ah, now I understand. You are still sticking with the assumption of a 
> > fence number, right?
> >
> > In other words what you imply here is that we have the same semantic 
> > as when somebody waits for a memory location to be written by number 
> > 17, right? In this case the semantics you describe here indeed applies.
> >
> > But that is certainly not what we want to implement or otherwise we 
> > will never be able to garbage collect the numbers in between.
> >
> > So if Vulkan has this requirement then we need to reject that.
> 
> Backing of and reconsidering this I came to the conclusion that what you 
> suggest here is actually the most defensive solution.
> 
> In other words it is the solution where it's most likely that nothing 
> goes wrong because the worst thing that can happen is that we 
> synchronize to much, but never to less.
> 
> Going to think about it how we can bring that into alignment with the 
> proposed garbage collection.

Should we implement the tests first (either as in-kernel unit tests, like
we have some, or in igt on top of vgem), agree on the semantics we want,
then work on the implementation?

All these discussions and gotchas and "oops another corner case we missed"
when only looking at the implementation feels like it could work out
better if we attack this from the other side of the uapi barrier ...

Just a thought.
-Daniel

> 
> Thanks,
> Christian.
> 
> >
> > Regards,
> > Christian.
> >
> >>
> >> You can also check sw_sync.c for timeline meaning.
> >>
> >> -David
> >>>
> >>> Christian.
> >>>
> >>>> -David
> >>>>>> b. garbage collection happens on point6, chain would be updated to
> >>>>>> 1---3---9---12---18---20, if user wants to get point5, then we
> >>>>>> should return node 3, but if user wants to get point 7, then we
> >>>>>> should return node 9.
> >>>>> Why? That doesn't seem to make any sense to me.
> >>>>>
> >>>>>> I still have no idea how to satisfy all these requirements with your
> >>>>>> current chain-fence. All these logic just are same we encountered
> >>>>>> before, we're walking them again. After solving these problems, I
> >>>>>> guess all design is similar as before.
> >>>>>>
> >>>>>> In fact, I don't know what problem previous design has, maybe there
> >>>>>> are some bugs, can't we fix these bugs by time going? Who can make
> >>>>>> sure his implementation never have bugs?
> >>>>> Well there where numerous problems with the original design. For
> >>>>> example we need to reject the requirement that timeline fences are in
> >>>>> order because that doesn't make sense in the kernel.
> >>>>>
> >>>>> When userspace does something like submitting fences in the order 1,
> >>>>> 5, 3 then it is broken and can keep the pieces. In other words the
> >>>>> kernel should not care about that, but rather make sure that it never
> >>>>> looses any synchronization no matter what.
> >>>>>
> >>>>> Regards,
> >>>>> Christian.
> >>>>>
> >>>>>>
> >>>>>> -David
> >>>>>>> +        }
> >>>>>>>        }
> >>>>>>> +
> >>>>>>>        drm_syncobj_put(syncobj);
> >>>>>>>        return ret;
> >>>>>>>    }
> >>
> >> _______________________________________________
> >> dri-devel mailing list
> >> dri-devel@lists.freedesktop.org
> >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 12:26                 ` Daniel Vetter
@ 2018-11-23 12:40                   ` Christian König
  2018-11-27  7:53                     ` Daniel Vetter
  0 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-23 12:40 UTC (permalink / raw)
  To: Daniel Vetter, Koenig, Christian; +Cc: daniel.vetter, dri-devel

Am 23.11.18 um 13:26 schrieb Daniel Vetter:
> On Fri, Nov 23, 2018 at 12:02:41PM +0000, Koenig, Christian wrote:
>> Am 23.11.18 um 12:03 schrieb Christian König:
>>> Am 23.11.18 um 11:56 schrieb zhoucm1:
>>>>
>>>> On 2018年11月23日 18:10, Koenig, Christian wrote:
>>>>> Am 23.11.18 um 03:36 schrieb zhoucm1:
>>>>>> On 2018年11月22日 19:30, Christian König wrote:
>>>>>>> Am 22.11.18 um 07:52 schrieb zhoucm1:
>>>>>>>> On 2018年11月15日 19:12, Christian König wrote:
>>>>>>>>> Implement finding the right timeline point in
>>>>>>>>> drm_syncobj_find_fence.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>>>> ---
>>>>>>>>>     drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>>>>>>>>     1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>>>>>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>>>>>>> index 589d884ccd58..d42c51520da4 100644
>>>>>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>>>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>>>>>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
>>>>>>>>> *file_private,
>>>>>>>>>             return -ENOENT;
>>>>>>>>>           *fence = drm_syncobj_fence_get(syncobj);
>>>>>>>>> -    if (!*fence) {
>>>>>>>>> +    if (!*fence)
>>>>>>>>>             ret = -EINVAL;
>>>>>>>>> +
>>>>>>>>> +    if (!ret && point) {
>>>>>>>>> +        dma_fence_chain_for_each(*fence) {
>>>>>>>>> +            if (!to_dma_fence_chain(*fence) ||
>>>>>>>>> +                (*fence)->seqno <= point)
>>>>>>>>> +                break;
>>>>>>>> This condition isn't enough to find proper point.
>>>>>>>> For two examples:
>>>>>>>> a. No garbage collection happens, the points in chain are
>>>>>>>> 1----3----6----9----12---18---20, if user wants to get point17, then
>>>>>>>> we should return node 18.
>>>>>>> And that is exactly what's wrong in the original logic. In this case
>>>>>>> we need to return 12, not 18 because point 17 could have already been
>>>>>>> garbage collected.
>>>>>> I don't think so, the 'a' case I already assume there isn't garbage
>>>>>> collection. If user wants to get point17, then we should return
>>>>>> node 18.
>>>>>> timeline means point[N]  must be signaled later than point[N-1].
>>>>>> Point[12] just can make sure point[1] ~point[12] are signaled.
>>>>>> Point[18] signal can make sure point[17] is signaled.
>>>>>> So this case we need to return 18, not 12, which is key timeline
>>>>>> concept.
>>>>> No, exactly that's incorrect. When we ask for 17 and can't find it then
>>>>> this means it either never existed or that it is signaled already.
>>>>>
>>>>> Returning a lower number in this case or even a stub fence is perfectly
>>>>> fine since we only need to wait for that one in this case.
>>>>>
>>>>> If we return 18 in this case then we add incorrect synchronization when
>>>>> there shouldn't be any.
>>>> No, That will make timeline not work at all and break timeline
>>>> semantics totally.
>>>>
>>>> If there aren't point18 and point20, the chain is
>>>> 1----3----6----9----12, if user wants to get point 17, you also
>>>> return 12? if yes, which absolutely is incorrect. The answer should
>>>> be NO, right? point17 should be waited on there until a bigger point
>>>> is coming.
>>> Correct, but this is a different case. In this situation we either
>>> return an error or wait for point 17 (or something >=17) to show up.
>>>
>>> The key difference is if point 17 shows up then we return point 17,
>>> but if point 18 shows up then we need to return point 12.
>>>
>>>> For chain is 1----3----6----9----12---18---20, if user wants to wait
>>>> on any one of points 13,14,15,16,17,18, we must wait for point 18,
>>>> this is timeline semantic.
>>> Ah, now I understand. You are still sticking with the assumption of a
>>> fence number, right?
>>>
>>> In other words what you imply here is that we have the same semantic
>>> as when somebody waits for a memory location to be written by number
>>> 17, right? In this case the semantics you describe here indeed applies.
>>>
>>> But that is certainly not what we want to implement or otherwise we
>>> will never be able to garbage collect the numbers in between.
>>>
>>> So if Vulkan has this requirement then we need to reject that.
>> Backing of and reconsidering this I came to the conclusion that what you
>> suggest here is actually the most defensive solution.
>>
>> In other words it is the solution where it's most likely that nothing
>> goes wrong because the worst thing that can happen is that we
>> synchronize to much, but never to less.
>>
>> Going to think about it how we can bring that into alignment with the
>> proposed garbage collection.
> Should we implement the tests first (either as in-kernel unit tests, like
> we have some, or in igt on top of vgem), agree on the semantics we want,
> then work on the implementation?
>
> All these discussions and gotchas and "oops another corner case we missed"
> when only looking at the implementation feels like it could work out
> better if we attack this from the other side of the uapi barrier ...

Well I agree that this "oops another corner case we missed" is exactly 
the key problem here.

But when the kernel developers implement the test cases we just move the 
issue to another place and not fundamentally solve it.

What we should do is to push the anv/radv/amdvlk devs to go ahead and 
implement test cases for all the ugly corner cases they have in mind.

Additional to that I really think that the UAPI David(ChunMing) came up 
with is actually pretty solid, so everything should actually be good to 
go to do this.

Regards,
Christian.

>
> Just a thought.
> -Daniel
>
>> Thanks,
>> Christian.
>>
>>> Regards,
>>> Christian.
>>>
>>>> You can also check sw_sync.c for timeline meaning.
>>>>
>>>> -David
>>>>> Christian.
>>>>>
>>>>>> -David
>>>>>>>> b. garbage collection happens on point6, chain would be updated to
>>>>>>>> 1---3---9---12---18---20, if user wants to get point5, then we
>>>>>>>> should return node 3, but if user wants to get point 7, then we
>>>>>>>> should return node 9.
>>>>>>> Why? That doesn't seem to make any sense to me.
>>>>>>>
>>>>>>>> I still have no idea how to satisfy all these requirements with your
>>>>>>>> current chain-fence. All these logic just are same we encountered
>>>>>>>> before, we're walking them again. After solving these problems, I
>>>>>>>> guess all design is similar as before.
>>>>>>>>
>>>>>>>> In fact, I don't know what problem previous design has, maybe there
>>>>>>>> are some bugs, can't we fix these bugs by time going? Who can make
>>>>>>>> sure his implementation never have bugs?
>>>>>>> Well there where numerous problems with the original design. For
>>>>>>> example we need to reject the requirement that timeline fences are in
>>>>>>> order because that doesn't make sense in the kernel.
>>>>>>>
>>>>>>> When userspace does something like submitting fences in the order 1,
>>>>>>> 5, 3 then it is broken and can keep the pieces. In other words the
>>>>>>> kernel should not care about that, but rather make sure that it never
>>>>>>> looses any synchronization no matter what.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -David
>>>>>>>>> +        }
>>>>>>>>>         }
>>>>>>>>> +
>>>>>>>>>         drm_syncobj_put(syncobj);
>>>>>>>>>         return ret;
>>>>>>>>>     }
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 12:02               ` Koenig, Christian
  2018-11-23 12:26                 ` Daniel Vetter
@ 2018-11-23 13:15                 ` Chunming Zhou
  2018-11-23 13:27                   ` Koenig, Christian
  1 sibling, 1 reply; 28+ messages in thread
From: Chunming Zhou @ 2018-11-23 13:15 UTC (permalink / raw)
  To: Koenig, Christian, Zhou, David(ChunMing), dri-devel; +Cc: daniel.vetter



在 2018/11/23 20:02, Koenig, Christian 写道:
> Am 23.11.18 um 12:03 schrieb Christian König:
>> Am 23.11.18 um 11:56 schrieb zhoucm1:
>>>
>>> On 2018年11月23日 18:10, Koenig, Christian wrote:
>>>> Am 23.11.18 um 03:36 schrieb zhoucm1:
>>>>> On 2018年11月22日 19:30, Christian König wrote:
>>>>>> Am 22.11.18 um 07:52 schrieb zhoucm1:
>>>>>>> On 2018年11月15日 19:12, Christian König wrote:
>>>>>>>> Implement finding the right timeline point in
>>>>>>>> drm_syncobj_find_fence.
>>>>>>>>
>>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>>>>>>>     1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>>>>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>>>>>> index 589d884ccd58..d42c51520da4 100644
>>>>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>>>>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
>>>>>>>> *file_private,
>>>>>>>>             return -ENOENT;
>>>>>>>>           *fence = drm_syncobj_fence_get(syncobj);
>>>>>>>> -    if (!*fence) {
>>>>>>>> +    if (!*fence)
>>>>>>>>             ret = -EINVAL;
>>>>>>>> +
>>>>>>>> +    if (!ret && point) {
>>>>>>>> +        dma_fence_chain_for_each(*fence) {
>>>>>>>> +            if (!to_dma_fence_chain(*fence) ||
>>>>>>>> +                (*fence)->seqno <= point)
>>>>>>>> +                break;
>>>>>>> This condition isn't enough to find proper point.
>>>>>>> For two examples:
>>>>>>> a. No garbage collection happens, the points in chain are
>>>>>>> 1----3----6----9----12---18---20, if user wants to get point17, then
>>>>>>> we should return node 18.
>>>>>> And that is exactly what's wrong in the original logic. In this case
>>>>>> we need to return 12, not 18 because point 17 could have already been
>>>>>> garbage collected.
>>>>> I don't think so, the 'a' case I already assume there isn't garbage
>>>>> collection. If user wants to get point17, then we should return
>>>>> node 18.
>>>>> timeline means point[N]  must be signaled later than point[N-1].
>>>>> Point[12] just can make sure point[1] ~point[12] are signaled.
>>>>> Point[18] signal can make sure point[17] is signaled.
>>>>> So this case we need to return 18, not 12, which is key timeline
>>>>> concept.
>>>> No, exactly that's incorrect. When we ask for 17 and can't find it then
>>>> this means it either never existed or that it is signaled already.
>>>>
>>>> Returning a lower number in this case or even a stub fence is perfectly
>>>> fine since we only need to wait for that one in this case.
>>>>
>>>> If we return 18 in this case then we add incorrect synchronization when
>>>> there shouldn't be any.
>>> No, That will make timeline not work at all and break timeline
>>> semantics totally.
>>>
>>> If there aren't point18 and point20, the chain is
>>> 1----3----6----9----12, if user wants to get point 17, you also
>>> return 12? if yes, which absolutely is incorrect. The answer should
>>> be NO, right? point17 should be waited on there until a bigger point
>>> is coming.
>> Correct, but this is a different case. In this situation we either
>> return an error or wait for point 17 (or something >=17) to show up.
>>
>> The key difference is if point 17 shows up then we return point 17,
>> but if point 18 shows up then we need to return point 12.
>>
>>> For chain is 1----3----6----9----12---18---20, if user wants to wait
>>> on any one of points 13,14,15,16,17,18, we must wait for point 18,
>>> this is timeline semantic.
>> Ah, now I understand. You are still sticking with the assumption of a
>> fence number, right?
>>
>> In other words what you imply here is that we have the same semantic
>> as when somebody waits for a memory location to be written by number
>> 17, right? In this case the semantics you describe here indeed applies.
>>
>> But that is certainly not what we want to implement or otherwise we
>> will never be able to garbage collect the numbers in between.
>>
>> So if Vulkan has this requirement then we need to reject that.
> Backing of and reconsidering this I came to the conclusion that what you
> suggest here is actually the most defensive solution.
>
> In other words it is the solution where it's most likely that nothing
> goes wrong because the worst thing that can happen is that we
> synchronize to much, but never to less.
>
> Going to think about it how we can bring that into alignment with the
> proposed garbage collection.
Yeah, for garbae collection, I came up an idea this morning, we can pass 
the signaled stub fence when chain node is created, when you walk out 
all chain node, you can replace chain->fence with stub fence, that way, 
there is no redudant fence referenced in chain node, and we can keep the 
signal point in there.

Another use case, I'm not sure if you considered:
if chain is  1----3----6----9----12---18, a wait operation is on point 
17, then we return 18, another signal point 17 comes, then we still wait 
on 18(assume 18 takes very long time), that looks not reseonable, but 
this is just performance problem potientially. Seems the way of timeline 
sw_sync.c with comparing point for signal status can sovle it.

-David

>
> Thanks,
> Christian.
>
>> Regards,
>> Christian.
>>
>>> You can also check sw_sync.c for timeline meaning.
>>>
>>> -David
>>>> Christian.
>>>>
>>>>> -David
>>>>>>> b. garbage collection happens on point6, chain would be updated to
>>>>>>> 1---3---9---12---18---20, if user wants to get point5, then we
>>>>>>> should return node 3, but if user wants to get point 7, then we
>>>>>>> should return node 9.
>>>>>> Why? That doesn't seem to make any sense to me.
>>>>>>
>>>>>>> I still have no idea how to satisfy all these requirements with your
>>>>>>> current chain-fence. All these logic just are same we encountered
>>>>>>> before, we're walking them again. After solving these problems, I
>>>>>>> guess all design is similar as before.
>>>>>>>
>>>>>>> In fact, I don't know what problem previous design has, maybe there
>>>>>>> are some bugs, can't we fix these bugs by time going? Who can make
>>>>>>> sure his implementation never have bugs?
>>>>>> Well there where numerous problems with the original design. For
>>>>>> example we need to reject the requirement that timeline fences are in
>>>>>> order because that doesn't make sense in the kernel.
>>>>>>
>>>>>> When userspace does something like submitting fences in the order 1,
>>>>>> 5, 3 then it is broken and can keep the pieces. In other words the
>>>>>> kernel should not care about that, but rather make sure that it never
>>>>>> looses any synchronization no matter what.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>> -David
>>>>>>>> +        }
>>>>>>>>         }
>>>>>>>> +
>>>>>>>>         drm_syncobj_put(syncobj);
>>>>>>>>         return ret;
>>>>>>>>     }
>>> _______________________________________________
>>> dri-devel mailing list
>>> dri-devel@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 13:15                 ` Chunming Zhou
@ 2018-11-23 13:27                   ` Koenig, Christian
  2018-11-23 13:42                     ` Chunming Zhou
  0 siblings, 1 reply; 28+ messages in thread
From: Koenig, Christian @ 2018-11-23 13:27 UTC (permalink / raw)
  To: Zhou, David(ChunMing), dri-devel; +Cc: daniel.vetter

Am 23.11.18 um 14:15 schrieb Zhou, David(ChunMing):
>
> 在 2018/11/23 20:02, Koenig, Christian 写道:
>> Am 23.11.18 um 12:03 schrieb Christian König:
>>> Am 23.11.18 um 11:56 schrieb zhoucm1:
>>>> On 2018年11月23日 18:10, Koenig, Christian wrote:
>>>>> Am 23.11.18 um 03:36 schrieb zhoucm1:
>>>>>> On 2018年11月22日 19:30, Christian König wrote:
>>>>>>> Am 22.11.18 um 07:52 schrieb zhoucm1:
>>>>>>>> On 2018年11月15日 19:12, Christian König wrote:
>>>>>>>>> Implement finding the right timeline point in
>>>>>>>>> drm_syncobj_find_fence.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>>>> ---
>>>>>>>>>      drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
>>>>>>>>>      1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/drm_syncobj.c
>>>>>>>>> b/drivers/gpu/drm/drm_syncobj.c
>>>>>>>>> index 589d884ccd58..d42c51520da4 100644
>>>>>>>>> --- a/drivers/gpu/drm/drm_syncobj.c
>>>>>>>>> +++ b/drivers/gpu/drm/drm_syncobj.c
>>>>>>>>> @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
>>>>>>>>> *file_private,
>>>>>>>>>              return -ENOENT;
>>>>>>>>>            *fence = drm_syncobj_fence_get(syncobj);
>>>>>>>>> -    if (!*fence) {
>>>>>>>>> +    if (!*fence)
>>>>>>>>>              ret = -EINVAL;
>>>>>>>>> +
>>>>>>>>> +    if (!ret && point) {
>>>>>>>>> +        dma_fence_chain_for_each(*fence) {
>>>>>>>>> +            if (!to_dma_fence_chain(*fence) ||
>>>>>>>>> +                (*fence)->seqno <= point)
>>>>>>>>> +                break;
>>>>>>>> This condition isn't enough to find proper point.
>>>>>>>> For two examples:
>>>>>>>> a. No garbage collection happens, the points in chain are
>>>>>>>> 1----3----6----9----12---18---20, if user wants to get point17, then
>>>>>>>> we should return node 18.
>>>>>>> And that is exactly what's wrong in the original logic. In this case
>>>>>>> we need to return 12, not 18 because point 17 could have already been
>>>>>>> garbage collected.
>>>>>> I don't think so, the 'a' case I already assume there isn't garbage
>>>>>> collection. If user wants to get point17, then we should return
>>>>>> node 18.
>>>>>> timeline means point[N]  must be signaled later than point[N-1].
>>>>>> Point[12] just can make sure point[1] ~point[12] are signaled.
>>>>>> Point[18] signal can make sure point[17] is signaled.
>>>>>> So this case we need to return 18, not 12, which is key timeline
>>>>>> concept.
>>>>> No, exactly that's incorrect. When we ask for 17 and can't find it then
>>>>> this means it either never existed or that it is signaled already.
>>>>>
>>>>> Returning a lower number in this case or even a stub fence is perfectly
>>>>> fine since we only need to wait for that one in this case.
>>>>>
>>>>> If we return 18 in this case then we add incorrect synchronization when
>>>>> there shouldn't be any.
>>>> No, That will make timeline not work at all and break timeline
>>>> semantics totally.
>>>>
>>>> If there aren't point18 and point20, the chain is
>>>> 1----3----6----9----12, if user wants to get point 17, you also
>>>> return 12? if yes, which absolutely is incorrect. The answer should
>>>> be NO, right? point17 should be waited on there until a bigger point
>>>> is coming.
>>> Correct, but this is a different case. In this situation we either
>>> return an error or wait for point 17 (or something >=17) to show up.
>>>
>>> The key difference is if point 17 shows up then we return point 17,
>>> but if point 18 shows up then we need to return point 12.
>>>
>>>> For chain is 1----3----6----9----12---18---20, if user wants to wait
>>>> on any one of points 13,14,15,16,17,18, we must wait for point 18,
>>>> this is timeline semantic.
>>> Ah, now I understand. You are still sticking with the assumption of a
>>> fence number, right?
>>>
>>> In other words what you imply here is that we have the same semantic
>>> as when somebody waits for a memory location to be written by number
>>> 17, right? In this case the semantics you describe here indeed applies.
>>>
>>> But that is certainly not what we want to implement or otherwise we
>>> will never be able to garbage collect the numbers in between.
>>>
>>> So if Vulkan has this requirement then we need to reject that.
>> Backing of and reconsidering this I came to the conclusion that what you
>> suggest here is actually the most defensive solution.
>>
>> In other words it is the solution where it's most likely that nothing
>> goes wrong because the worst thing that can happen is that we
>> synchronize to much, but never to less.
>>
>> Going to think about it how we can bring that into alignment with the
>> proposed garbage collection.
> Yeah, for garbae collection, I came up an idea this morning, we can pass
> the signaled stub fence when chain node is created, when you walk out
> all chain node, you can replace chain->fence with stub fence, that way,
> there is no redudant fence referenced in chain node, and we can keep the
> signal point in there.

That would still not allow to garbage collect the chain node itself.

But I've came up with something which should work. Assume the original 
chain is:

1----3----6----9----12---18

And we garbage collect everything but 6 and 18 then all we need to know 
to return the correct node is what the original previous sequence number 
was.

6 (3)----18 (12)

When then somebody asks for 17 we can still return 18 and if somebody 
asks for 9 we would return 6.

> Another use case, I'm not sure if you considered:
> if chain is  1----3----6----9----12---18, a wait operation is on point
> 17, then we return 18, another signal point 17 comes, then we still wait
> on 18(assume 18 takes very long time), that looks not reseonable, but
> this is just performance problem potientially. Seems the way of timeline
> sw_sync.c with comparing point for signal status can sovle it.

Well I thought that we declared that signaling lower numbers is illegal?

My current solution to that is when userspace messes up the sequence 
numbers and submit 1-3-6-9-12-18-17 we start a new chain with 17 and 
never look back.

E.g. when somebody then asks for anything below 17 we always return 17 
and if somebody asks for 18 we return an error because that is handled 
as not signaled yet.

Regards,
Christian.

>
> -David
>
>> Thanks,
>> Christian.
>>
>>> Regards,
>>> Christian.
>>>
>>>> You can also check sw_sync.c for timeline meaning.
>>>>
>>>> -David
>>>>> Christian.
>>>>>
>>>>>> -David
>>>>>>>> b. garbage collection happens on point6, chain would be updated to
>>>>>>>> 1---3---9---12---18---20, if user wants to get point5, then we
>>>>>>>> should return node 3, but if user wants to get point 7, then we
>>>>>>>> should return node 9.
>>>>>>> Why? That doesn't seem to make any sense to me.
>>>>>>>
>>>>>>>> I still have no idea how to satisfy all these requirements with your
>>>>>>>> current chain-fence. All these logic just are same we encountered
>>>>>>>> before, we're walking them again. After solving these problems, I
>>>>>>>> guess all design is similar as before.
>>>>>>>>
>>>>>>>> In fact, I don't know what problem previous design has, maybe there
>>>>>>>> are some bugs, can't we fix these bugs by time going? Who can make
>>>>>>>> sure his implementation never have bugs?
>>>>>>> Well there where numerous problems with the original design. For
>>>>>>> example we need to reject the requirement that timeline fences are in
>>>>>>> order because that doesn't make sense in the kernel.
>>>>>>>
>>>>>>> When userspace does something like submitting fences in the order 1,
>>>>>>> 5, 3 then it is broken and can keep the pieces. In other words the
>>>>>>> kernel should not care about that, but rather make sure that it never
>>>>>>> looses any synchronization no matter what.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -David
>>>>>>>>> +        }
>>>>>>>>>          }
>>>>>>>>> +
>>>>>>>>>          drm_syncobj_put(syncobj);
>>>>>>>>>          return ret;
>>>>>>>>>      }
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel@lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 13:27                   ` Koenig, Christian
@ 2018-11-23 13:42                     ` Chunming Zhou
  2018-11-23 18:16                       ` Christian König
  0 siblings, 1 reply; 28+ messages in thread
From: Chunming Zhou @ 2018-11-23 13:42 UTC (permalink / raw)
  To: Koenig, Christian, Zhou, David(ChunMing), dri-devel; +Cc: daniel.vetter


[-- Attachment #1.1: Type: text/plain, Size: 8176 bytes --]



在 2018/11/23 21:27, Koenig, Christian 写道:

Am 23.11.18 um 14:15 schrieb Zhou, David(ChunMing):



在 2018/11/23 20:02, Koenig, Christian 写道:


Am 23.11.18 um 12:03 schrieb Christian König:


Am 23.11.18 um 11:56 schrieb zhoucm1:


On 2018年11月23日 18:10, Koenig, Christian wrote:


Am 23.11.18 um 03:36 schrieb zhoucm1:


On 2018年11月22日 19:30, Christian König wrote:


Am 22.11.18 um 07:52 schrieb zhoucm1:


On 2018年11月15日 19:12, Christian König wrote:


Implement finding the right timeline point in
drm_syncobj_find_fence.

Signed-off-by: Christian König <christian.koenig@amd.com><mailto:christian.koenig@amd.com>
---
     drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
     1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_syncobj.c
b/drivers/gpu/drm/drm_syncobj.c
index 589d884ccd58..d42c51520da4 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
*file_private,
             return -ENOENT;
           *fence = drm_syncobj_fence_get(syncobj);
-    if (!*fence) {
+    if (!*fence)
             ret = -EINVAL;
+
+    if (!ret && point) {
+        dma_fence_chain_for_each(*fence) {
+            if (!to_dma_fence_chain(*fence) ||
+                (*fence)->seqno <= point)
+                break;


This condition isn't enough to find proper point.
For two examples:
a. No garbage collection happens, the points in chain are
1----3----6----9----12---18---20, if user wants to get point17, then
we should return node 18.


And that is exactly what's wrong in the original logic. In this case
we need to return 12, not 18 because point 17 could have already been
garbage collected.


I don't think so, the 'a' case I already assume there isn't garbage
collection. If user wants to get point17, then we should return
node 18.
timeline means point[N]  must be signaled later than point[N-1].
Point[12] just can make sure point[1] ~point[12] are signaled.
Point[18] signal can make sure point[17] is signaled.
So this case we need to return 18, not 12, which is key timeline
concept.


No, exactly that's incorrect. When we ask for 17 and can't find it then
this means it either never existed or that it is signaled already.

Returning a lower number in this case or even a stub fence is perfectly
fine since we only need to wait for that one in this case.

If we return 18 in this case then we add incorrect synchronization when
there shouldn't be any.


No, That will make timeline not work at all and break timeline
semantics totally.

If there aren't point18 and point20, the chain is
1----3----6----9----12, if user wants to get point 17, you also
return 12? if yes, which absolutely is incorrect. The answer should
be NO, right? point17 should be waited on there until a bigger point
is coming.


Correct, but this is a different case. In this situation we either
return an error or wait for point 17 (or something >=17) to show up.

The key difference is if point 17 shows up then we return point 17,
but if point 18 shows up then we need to return point 12.



For chain is 1----3----6----9----12---18---20, if user wants to wait
on any one of points 13,14,15,16,17,18, we must wait for point 18,
this is timeline semantic.


Ah, now I understand. You are still sticking with the assumption of a
fence number, right?

In other words what you imply here is that we have the same semantic
as when somebody waits for a memory location to be written by number
17, right? In this case the semantics you describe here indeed applies.

But that is certainly not what we want to implement or otherwise we
will never be able to garbage collect the numbers in between.

So if Vulkan has this requirement then we need to reject that.


Backing of and reconsidering this I came to the conclusion that what you
suggest here is actually the most defensive solution.

In other words it is the solution where it's most likely that nothing
goes wrong because the worst thing that can happen is that we
synchronize to much, but never to less.

Going to think about it how we can bring that into alignment with the
proposed garbage collection.


Yeah, for garbae collection, I came up an idea this morning, we can pass
the signaled stub fence when chain node is created, when you walk out
all chain node, you can replace chain->fence with stub fence, that way,
there is no redudant fence referenced in chain node, and we can keep the
signal point in there.



That would still not allow to garbage collect the chain node itself.

only middle chain nodes are need, so the chain shouldn't be too long, right?




But I've came up with something which should work. Assume the original
chain is:

1----3----6----9----12---18

And we garbage collect everything but 6 and 18 then all we need to know
to return the correct node is what the original previous sequence number
was.

6 (3)----18 (12)

When then somebody asks for 17 we can still return 18 and if somebody
asks for 9 we would return 6.

then what point we return when somebody asks for 11?






Another use case, I'm not sure if you considered:
if chain is  1----3----6----9----12---18, a wait operation is on point
17, then we return 18, another signal point 17 comes, then we still wait
on 18(assume 18 takes very long time), that looks not reseonable, but
this is just performance problem potientially. Seems the way of timeline
sw_sync.c with comparing point for signal status can sovle it.



Well I thought that we declared that signaling lower numbers is illegal?

Sorry, I forgot it, quote from spec: "
*RESOLVED*: A 64-bit unsigned integer that can only be set to monotonically
increasing values by signal operations and is not changed by wait operations."

Can we think signaling lower numbers is forbidden?

If that's true, we can directly ignore lower number and return without error, keep the larger signal point.

Thanks,
David



My current solution to that is when userspace messes up the sequence
numbers and submit 1-3-6-9-12-18-17 we start a new chain with 17 and
never look back.

E.g. when somebody then asks for anything below 17 we always return 17
and if somebody asks for 18 we return an error because that is handled
as not signaled yet.

Regards,
Christian.




-David



Thanks,
Christian.



Regards,
Christian.



You can also check sw_sync.c for timeline meaning.

-David


Christian.



-David


b. garbage collection happens on point6, chain would be updated to
1---3---9---12---18---20, if user wants to get point5, then we
should return node 3, but if user wants to get point 7, then we
should return node 9.


Why? That doesn't seem to make any sense to me.



I still have no idea how to satisfy all these requirements with your
current chain-fence. All these logic just are same we encountered
before, we're walking them again. After solving these problems, I
guess all design is similar as before.

In fact, I don't know what problem previous design has, maybe there
are some bugs, can't we fix these bugs by time going? Who can make
sure his implementation never have bugs?


Well there where numerous problems with the original design. For
example we need to reject the requirement that timeline fences are in
order because that doesn't make sense in the kernel.

When userspace does something like submitting fences in the order 1,
5, 3 then it is broken and can keep the pieces. In other words the
kernel should not care about that, but rather make sure that it never
looses any synchronization no matter what.

Regards,
Christian.



-David


+        }
         }
+
         drm_syncobj_put(syncobj);
         return ret;
     }


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/dri-devel






[-- Attachment #1.2: Type: text/html, Size: 11415 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 13:42                     ` Chunming Zhou
@ 2018-11-23 18:16                       ` Christian König
  2018-11-24  8:28                         ` Chunming Zhou
  0 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2018-11-23 18:16 UTC (permalink / raw)
  To: Chunming Zhou, Koenig, Christian, Zhou, David(ChunMing), dri-devel
  Cc: daniel.vetter


[-- Attachment #1.1: Type: text/plain, Size: 4293 bytes --]

Am 23.11.18 um 14:42 schrieb Chunming Zhou:
>
>> But I've came up with something which should work. Assume the original
>> chain is:
>>
>> 1----3----6----9----12---18
>>
>> And we garbage collect everything but 6 and 18 then all we need to know
>> to return the correct node is what the original previous sequence number
>> was.
>>
>> 6 (3)----18 (12)
>>
>> When then somebody asks for 17 we can still return 18 and if somebody
>> asks for 9 we would return 6.
> then what point we return when somebody asks for 11?

In this case 6. If 9 or 12 wouldn't be signaled yet than those.

>
>>> Another use case, I'm not sure if you considered:
>>> if chain is  1----3----6----9----12---18, a wait operation is on point
>>> 17, then we return 18, another signal point 17 comes, then we still wait
>>> on 18(assume 18 takes very long time), that looks not reseonable, but
>>> this is just performance problem potientially. Seems the way of timeline
>>> sw_sync.c with comparing point for signal status can sovle it.
>> Well I thought that we declared that signaling lower numbers is illegal?
> Sorry, I forgot it, quote from spec: "
>
> *RESOLVED*: A 64-bit unsigned integer that can only be set to 
> monotonically
>
> increasing values by signal operations and is not changed by wait 
> operations."
>
> Can we think signaling lower numbers is forbidden?
>
> If that's true, we can directly ignore lower number and return without 
> error, keep the larger signal point.

I've considered this as well, but came to the conclusion that we then 
would lose some sync fence.

Starting a new sequence when userspace does that is the better 
alternative, cause then we again always sync to much but never to less.

Christian.

>
> Thanks,
> David
>> My current solution to that is when userspace messes up the sequence
>> numbers and submit 1-3-6-9-12-18-17 we start a new chain with 17 and
>> never look back.
>>
>> E.g. when somebody then asks for anything below 17 we always return 17
>> and if somebody asks for 18 we return an error because that is handled
>> as not signaled yet.
>>
>> Regards,
>> Christian.
>>
>>> -David
>>>
>>>> Thanks,
>>>> Christian.
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> You can also check sw_sync.c for timeline meaning.
>>>>>>
>>>>>> -David
>>>>>>> Christian.
>>>>>>>
>>>>>>>> -David
>>>>>>>>>> b. garbage collection happens on point6, chain would be updated to
>>>>>>>>>> 1---3---9---12---18---20, if user wants to get point5, then we
>>>>>>>>>> should return node 3, but if user wants to get point 7, then we
>>>>>>>>>> should return node 9.
>>>>>>>>> Why? That doesn't seem to make any sense to me.
>>>>>>>>>
>>>>>>>>>> I still have no idea how to satisfy all these requirements with your
>>>>>>>>>> current chain-fence. All these logic just are same we encountered
>>>>>>>>>> before, we're walking them again. After solving these problems, I
>>>>>>>>>> guess all design is similar as before.
>>>>>>>>>>
>>>>>>>>>> In fact, I don't know what problem previous design has, maybe there
>>>>>>>>>> are some bugs, can't we fix these bugs by time going? Who can make
>>>>>>>>>> sure his implementation never have bugs?
>>>>>>>>> Well there where numerous problems with the original design. For
>>>>>>>>> example we need to reject the requirement that timeline fences are in
>>>>>>>>> order because that doesn't make sense in the kernel.
>>>>>>>>>
>>>>>>>>> When userspace does something like submitting fences in the order 1,
>>>>>>>>> 5, 3 then it is broken and can keep the pieces. In other words the
>>>>>>>>> kernel should not care about that, but rather make sure that it never
>>>>>>>>> looses any synchronization no matter what.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Christian.
>>>>>>>>>
>>>>>>>>>> -David
>>>>>>>>>>> +        }
>>>>>>>>>>>           }
>>>>>>>>>>> +
>>>>>>>>>>>           drm_syncobj_put(syncobj);
>>>>>>>>>>>           return ret;
>>>>>>>>>>>       }
>>>>>> _______________________________________________
>>>>>> dri-devel mailing list
>>>>>> dri-devel@lists.freedesktop.org
>>>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel


[-- Attachment #1.2: Type: text/html, Size: 6955 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 18:16                       ` Christian König
@ 2018-11-24  8:28                         ` Chunming Zhou
  0 siblings, 0 replies; 28+ messages in thread
From: Chunming Zhou @ 2018-11-24  8:28 UTC (permalink / raw)
  To: Koenig, Christian, Zhou, David(ChunMing), dri-devel; +Cc: daniel.vetter


[-- Attachment #1.1: Type: text/plain, Size: 4385 bytes --]



在 2018/11/24 2:16, Christian König 写道:
Am 23.11.18 um 14:42 schrieb Chunming Zhou:


But I've came up with something which should work. Assume the original
chain is:

1----3----6----9----12---18

And we garbage collect everything but 6 and 18 then all we need to know
to return the correct node is what the original previous sequence number
was.

6 (3)----18 (12)

When then somebody asks for 17 we can still return 18 and if somebody
asks for 9 we would return 6.

then what point we return when somebody asks for 11?

In this case 6. If 9 or 12 wouldn't be signaled yet than those.



Another use case, I'm not sure if you considered:
if chain is  1----3----6----9----12---18, a wait operation is on point
17, then we return 18, another signal point 17 comes, then we still wait
on 18(assume 18 takes very long time), that looks not reseonable, but
this is just performance problem potientially. Seems the way of timeline
sw_sync.c with comparing point for signal status can sovle it.


Well I thought that we declared that signaling lower numbers is illegal?

Sorry, I forgot it, quote from spec: "
*RESOLVED*: A 64-bit unsigned integer that can only be set to monotonically
increasing values by signal operations and is not changed by wait operations."

Can we think signaling lower numbers is forbidden?

If that's true, we can directly ignore lower number and return without error, keep the larger signal point.

I've considered this as well, but came to the conclusion that we then would lose some sync fence.

Starting a new sequence when userspace does that is the better alternative, cause then we again always sync to much but never to less.
But that will lead timeline not monotonically increasing, won't it? Seems it's not good.
Which could result in some problems, e.g. if timeline is already up to 18, then userA think he can wait on 18, another signal point17 comes, the new sequence is changed to 17, but userA still send its command by appending wait on point18, which would never be signaled if no more signal comes, right?

-David

Christian.


Thanks,
David

My current solution to that is when userspace messes up the sequence
numbers and submit 1-3-6-9-12-18-17 we start a new chain with 17 and
never look back.

E.g. when somebody then asks for anything below 17 we always return 17
and if somebody asks for 18 we return an error because that is handled
as not signaled yet.

Regards,
Christian.



-David



Thanks,
Christian.



Regards,
Christian.



You can also check sw_sync.c for timeline meaning.

-David


Christian.



-David


b. garbage collection happens on point6, chain would be updated to
1---3---9---12---18---20, if user wants to get point5, then we
should return node 3, but if user wants to get point 7, then we
should return node 9.


Why? That doesn't seem to make any sense to me.



I still have no idea how to satisfy all these requirements with your
current chain-fence. All these logic just are same we encountered
before, we're walking them again. After solving these problems, I
guess all design is similar as before.

In fact, I don't know what problem previous design has, maybe there
are some bugs, can't we fix these bugs by time going? Who can make
sure his implementation never have bugs?


Well there where numerous problems with the original design. For
example we need to reject the requirement that timeline fences are in
order because that doesn't make sense in the kernel.

When userspace does something like submitting fences in the order 1,
5, 3 then it is broken and can keep the pieces. In other words the
kernel should not care about that, but rather make sure that it never
looses any synchronization no matter what.

Regards,
Christian.



-David


+        }
         }
+
         drm_syncobj_put(syncobj);
         return ret;
     }


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/dri-devel





_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org<mailto:dri-devel@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[-- Attachment #1.2: Type: text/html, Size: 6998 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence
  2018-11-23 12:40                   ` Christian König
@ 2018-11-27  7:53                     ` Daniel Vetter
  0 siblings, 0 replies; 28+ messages in thread
From: Daniel Vetter @ 2018-11-27  7:53 UTC (permalink / raw)
  To: christian.koenig; +Cc: daniel.vetter, dri-devel

On Fri, Nov 23, 2018 at 01:40:16PM +0100, Christian König wrote:
> Am 23.11.18 um 13:26 schrieb Daniel Vetter:
> > On Fri, Nov 23, 2018 at 12:02:41PM +0000, Koenig, Christian wrote:
> > > Am 23.11.18 um 12:03 schrieb Christian König:
> > > > Am 23.11.18 um 11:56 schrieb zhoucm1:
> > > > > 
> > > > > On 2018年11月23日 18:10, Koenig, Christian wrote:
> > > > > > Am 23.11.18 um 03:36 schrieb zhoucm1:
> > > > > > > On 2018年11月22日 19:30, Christian König wrote:
> > > > > > > > Am 22.11.18 um 07:52 schrieb zhoucm1:
> > > > > > > > > On 2018年11月15日 19:12, Christian König wrote:
> > > > > > > > > > Implement finding the right timeline point in
> > > > > > > > > > drm_syncobj_find_fence.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Christian König <christian.koenig@amd.com>
> > > > > > > > > > ---
> > > > > > > > > >     drivers/gpu/drm/drm_syncobj.c | 10 +++++++++-
> > > > > > > > > >     1 file changed, 9 insertions(+), 1 deletion(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/drivers/gpu/drm/drm_syncobj.c
> > > > > > > > > > b/drivers/gpu/drm/drm_syncobj.c
> > > > > > > > > > index 589d884ccd58..d42c51520da4 100644
> > > > > > > > > > --- a/drivers/gpu/drm/drm_syncobj.c
> > > > > > > > > > +++ b/drivers/gpu/drm/drm_syncobj.c
> > > > > > > > > > @@ -307,9 +307,17 @@ int drm_syncobj_find_fence(struct drm_file
> > > > > > > > > > *file_private,
> > > > > > > > > >             return -ENOENT;
> > > > > > > > > >           *fence = drm_syncobj_fence_get(syncobj);
> > > > > > > > > > -    if (!*fence) {
> > > > > > > > > > +    if (!*fence)
> > > > > > > > > >             ret = -EINVAL;
> > > > > > > > > > +
> > > > > > > > > > +    if (!ret && point) {
> > > > > > > > > > +        dma_fence_chain_for_each(*fence) {
> > > > > > > > > > +            if (!to_dma_fence_chain(*fence) ||
> > > > > > > > > > +                (*fence)->seqno <= point)
> > > > > > > > > > +                break;
> > > > > > > > > This condition isn't enough to find proper point.
> > > > > > > > > For two examples:
> > > > > > > > > a. No garbage collection happens, the points in chain are
> > > > > > > > > 1----3----6----9----12---18---20, if user wants to get point17, then
> > > > > > > > > we should return node 18.
> > > > > > > > And that is exactly what's wrong in the original logic. In this case
> > > > > > > > we need to return 12, not 18 because point 17 could have already been
> > > > > > > > garbage collected.
> > > > > > > I don't think so, the 'a' case I already assume there isn't garbage
> > > > > > > collection. If user wants to get point17, then we should return
> > > > > > > node 18.
> > > > > > > timeline means point[N]  must be signaled later than point[N-1].
> > > > > > > Point[12] just can make sure point[1] ~point[12] are signaled.
> > > > > > > Point[18] signal can make sure point[17] is signaled.
> > > > > > > So this case we need to return 18, not 12, which is key timeline
> > > > > > > concept.
> > > > > > No, exactly that's incorrect. When we ask for 17 and can't find it then
> > > > > > this means it either never existed or that it is signaled already.
> > > > > > 
> > > > > > Returning a lower number in this case or even a stub fence is perfectly
> > > > > > fine since we only need to wait for that one in this case.
> > > > > > 
> > > > > > If we return 18 in this case then we add incorrect synchronization when
> > > > > > there shouldn't be any.
> > > > > No, That will make timeline not work at all and break timeline
> > > > > semantics totally.
> > > > > 
> > > > > If there aren't point18 and point20, the chain is
> > > > > 1----3----6----9----12, if user wants to get point 17, you also
> > > > > return 12? if yes, which absolutely is incorrect. The answer should
> > > > > be NO, right? point17 should be waited on there until a bigger point
> > > > > is coming.
> > > > Correct, but this is a different case. In this situation we either
> > > > return an error or wait for point 17 (or something >=17) to show up.
> > > > 
> > > > The key difference is if point 17 shows up then we return point 17,
> > > > but if point 18 shows up then we need to return point 12.
> > > > 
> > > > > For chain is 1----3----6----9----12---18---20, if user wants to wait
> > > > > on any one of points 13,14,15,16,17,18, we must wait for point 18,
> > > > > this is timeline semantic.
> > > > Ah, now I understand. You are still sticking with the assumption of a
> > > > fence number, right?
> > > > 
> > > > In other words what you imply here is that we have the same semantic
> > > > as when somebody waits for a memory location to be written by number
> > > > 17, right? In this case the semantics you describe here indeed applies.
> > > > 
> > > > But that is certainly not what we want to implement or otherwise we
> > > > will never be able to garbage collect the numbers in between.
> > > > 
> > > > So if Vulkan has this requirement then we need to reject that.
> > > Backing of and reconsidering this I came to the conclusion that what you
> > > suggest here is actually the most defensive solution.
> > > 
> > > In other words it is the solution where it's most likely that nothing
> > > goes wrong because the worst thing that can happen is that we
> > > synchronize to much, but never to less.
> > > 
> > > Going to think about it how we can bring that into alignment with the
> > > proposed garbage collection.
> > Should we implement the tests first (either as in-kernel unit tests, like
> > we have some, or in igt on top of vgem), agree on the semantics we want,
> > then work on the implementation?
> > 
> > All these discussions and gotchas and "oops another corner case we missed"
> > when only looking at the implementation feels like it could work out
> > better if we attack this from the other side of the uapi barrier ...
> 
> Well I agree that this "oops another corner case we missed" is exactly the
> key problem here.
> 
> But when the kernel developers implement the test cases we just move the
> issue to another place and not fundamentally solve it.
> 
> What we should do is to push the anv/radv/amdvlk devs to go ahead and
> implement test cases for all the ugly corner cases they have in mind.
> 
> Additional to that I really think that the UAPI David(ChunMing) came up with
> is actually pretty solid, so everything should actually be good to go to do
> this.

Yeah I guess khr test suite would be best, with mesa implementation
running. E.g. looking at what these timeline fences will be on the windows
side as monitored fences, I'm not sure all the things we have around
guaranteeing ordering and allowing out-of-order signalling is a good idea.
In windows a monitored fence is just a memory location you write. Hence
whomever issues the writes must guarantee that they're ordered. Once we
have khr (or at least piglit) tests we could figure out what this means
for us.

Anyway, I think we need more userspace code here :-)
-Daniel

> 
> Regards,
> Christian.
> 
> > 
> > Just a thought.
> > -Daniel
> > 
> > > Thanks,
> > > Christian.
> > > 
> > > > Regards,
> > > > Christian.
> > > > 
> > > > > You can also check sw_sync.c for timeline meaning.
> > > > > 
> > > > > -David
> > > > > > Christian.
> > > > > > 
> > > > > > > -David
> > > > > > > > > b. garbage collection happens on point6, chain would be updated to
> > > > > > > > > 1---3---9---12---18---20, if user wants to get point5, then we
> > > > > > > > > should return node 3, but if user wants to get point 7, then we
> > > > > > > > > should return node 9.
> > > > > > > > Why? That doesn't seem to make any sense to me.
> > > > > > > > 
> > > > > > > > > I still have no idea how to satisfy all these requirements with your
> > > > > > > > > current chain-fence. All these logic just are same we encountered
> > > > > > > > > before, we're walking them again. After solving these problems, I
> > > > > > > > > guess all design is similar as before.
> > > > > > > > > 
> > > > > > > > > In fact, I don't know what problem previous design has, maybe there
> > > > > > > > > are some bugs, can't we fix these bugs by time going? Who can make
> > > > > > > > > sure his implementation never have bugs?
> > > > > > > > Well there where numerous problems with the original design. For
> > > > > > > > example we need to reject the requirement that timeline fences are in
> > > > > > > > order because that doesn't make sense in the kernel.
> > > > > > > > 
> > > > > > > > When userspace does something like submitting fences in the order 1,
> > > > > > > > 5, 3 then it is broken and can keep the pieces. In other words the
> > > > > > > > kernel should not care about that, but rather make sure that it never
> > > > > > > > looses any synchronization no matter what.
> > > > > > > > 
> > > > > > > > Regards,
> > > > > > > > Christian.
> > > > > > > > 
> > > > > > > > > -David
> > > > > > > > > > +        }
> > > > > > > > > >         }
> > > > > > > > > > +
> > > > > > > > > >         drm_syncobj_put(syncobj);
> > > > > > > > > >         return ret;
> > > > > > > > > >     }
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2018-11-27  7:53 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-15 11:12 restart syncobj timeline changes Christian König
2018-11-15 11:12 ` [PATCH 1/7] dma-buf: make fence sequence numbers 64 bit Christian König
2018-11-16  5:26   ` Zhou, David(ChunMing)
2018-11-15 11:12 ` [PATCH 2/7] dma-buf: add new dma_fence_chain container Christian König
2018-11-16  5:51   ` Zhou, David(ChunMing)
2018-11-15 11:12 ` [PATCH 3/7] drm: revert "expand replace_fence to support timeline point v2" Christian König
2018-11-15 11:12 ` [PATCH 4/7] drm/syncobj: use only a single stub fence Christian König
2018-11-16  5:54   ` Zhou, David(ChunMing)
2018-11-15 11:12 ` [PATCH 5/7] drm/syncobj: move drm_syncobj_cb into drm_syncobj.c Christian König
2018-11-16  5:55   ` Zhou, David(ChunMing)
2018-11-15 11:12 ` [PATCH 6/7] drm/syncobj: add new drm_syncobj_add_point interface Christian König
2018-11-16  6:20   ` Zhou, David(ChunMing)
2018-11-15 11:12 ` [PATCH 7/7] drm/syncobj: use the timeline point in drm_syncobj_find_fence Christian König
2018-11-22  6:52   ` zhoucm1
2018-11-22 11:30     ` Christian König
2018-11-23  2:36       ` zhoucm1
2018-11-23 10:10         ` Koenig, Christian
2018-11-23 10:56           ` zhoucm1
2018-11-23 11:03             ` Christian König
2018-11-23 12:02               ` Koenig, Christian
2018-11-23 12:26                 ` Daniel Vetter
2018-11-23 12:40                   ` Christian König
2018-11-27  7:53                     ` Daniel Vetter
2018-11-23 13:15                 ` Chunming Zhou
2018-11-23 13:27                   ` Koenig, Christian
2018-11-23 13:42                     ` Chunming Zhou
2018-11-23 18:16                       ` Christian König
2018-11-24  8:28                         ` Chunming Zhou

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.