linux-kernel.vger.kernel.org archive mirror
* Fence array patchset
@ 2016-06-01 13:10 Christian König
  2016-06-01 13:10 ` [PATCH 01/11] dma-buf/fence: make fence context 64 bit v2 Christian König
                   ` (11 more replies)
  0 siblings, 12 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

Hi guys,

this is the next iteration of the fence array patch set.

Daniel suggested that I provide an example of how this functionality might be
used by a driver, so I added a few patches to this series to show what I want
to do with it in the amdgpu driver.

The main idea is that for each VMID we have a set of hardware fences which are
currently using this VMID. When a new command submission needs a VMID, we
construct a fence array which signals as soon as any of the VMIDs becomes
available and hand that back to our scheduler.

This effort and my testing also turned up a rather stupid typo in the code, and
I tried to incorporate the comments from Chris and Daniel as well.

I think it's ready to land now, but as usual feel free to take it apart.

Cheers,
Christian.


* [PATCH 01/11] dma-buf/fence: make fence context 64 bit v2
  2016-06-01 13:10 Fence array patchset Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 15:23   ` Gustavo Padovan
  2016-06-01 13:10 ` [PATCH 02/11] dma-buf/fence: add fence_array fences v6 Christian König
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

Fence contexts are created on the fly, for example by the GPU scheduler used
in the amdgpu driver, as a result of a userspace request. Because of this, a
misbehaving userspace could in theory force a wrap around of the 32bit context
number.

Avoid this by increasing the context number to 64 bits. This way, even if
userspace manages to allocate a billion contexts per second, it takes more
than 500 years for the context number to wrap around.
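
As a rough sanity check of the numbers above, here is a minimal userspace
sketch (not part of the patch). The names context_alloc(),
seconds_until_wrap() and years_until_wrap() are made up for illustration, and
the counter is a plain variable instead of the kernel's atomic64_t, so it is
not thread safe:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the 64 bit context counter. */
static uint64_t context_counter;

/*
 * Same shape as fence_context_alloc(): reserve num consecutive context
 * numbers and return the first one. Not atomic, illustration only.
 */
static uint64_t context_alloc(unsigned num)
{
	assert(num != 0);	/* stands in for BUG_ON(!num) */
	context_counter += num;
	return context_counter - num;
}

/* Whole seconds until a counter of the given width wraps at a fixed rate. */
static uint64_t seconds_until_wrap(unsigned bits, uint64_t allocs_per_second)
{
	/* For 64 bits use UINT64_MAX; being off by one does not matter here. */
	uint64_t values = bits >= 64 ? UINT64_MAX : (1ull << bits);

	return values / allocs_per_second;
}

/* Whole 365 day years until the counter wraps. */
static uint64_t years_until_wrap(unsigned bits, uint64_t allocs_per_second)
{
	const uint64_t seconds_per_year = 365ull * 24 * 60 * 60;

	return seconds_until_wrap(bits, allocs_per_second) / seconds_per_year;
}
```

At a billion allocations per second a 32 bit counter wraps after about four
seconds, while a 64 bit counter lasts for several hundred years, which matches
the estimate above.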

v2: fix printf formats as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/fence.c                 |  8 ++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c  |  2 +-
 drivers/gpu/drm/etnaviv/etnaviv_gpu.h   |  2 +-
 drivers/gpu/drm/nouveau/nouveau_fence.h |  3 ++-
 drivers/gpu/drm/qxl/qxl_release.c       |  2 +-
 drivers/gpu/drm/radeon/radeon.h         |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_fence.c   |  2 +-
 drivers/staging/android/sync.h          |  3 ++-
 include/linux/fence.h                   | 13 +++++++------
 10 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/drivers/dma-buf/fence.c b/drivers/dma-buf/fence.c
index 7b05dbe..4d51f9e 100644
--- a/drivers/dma-buf/fence.c
+++ b/drivers/dma-buf/fence.c
@@ -35,7 +35,7 @@ EXPORT_TRACEPOINT_SYMBOL(fence_emit);
  * context or not. One device can have multiple separate contexts,
  * and they're used if some engine can run independently of another.
  */
-static atomic_t fence_context_counter = ATOMIC_INIT(0);
+static atomic64_t fence_context_counter = ATOMIC64_INIT(0);
 
 /**
  * fence_context_alloc - allocate an array of fence contexts
@@ -44,10 +44,10 @@ static atomic_t fence_context_counter = ATOMIC_INIT(0);
  * This function will return the first index of the number of fences allocated.
  * The fence context is used for setting fence->context to a unique number.
  */
-unsigned fence_context_alloc(unsigned num)
+u64 fence_context_alloc(unsigned num)
 {
 	BUG_ON(!num);
-	return atomic_add_return(num, &fence_context_counter) - num;
+	return atomic64_add_return(num, &fence_context_counter) - num;
 }
 EXPORT_SYMBOL(fence_context_alloc);
 
@@ -513,7 +513,7 @@ EXPORT_SYMBOL(fence_wait_any_timeout);
  */
 void
 fence_init(struct fence *fence, const struct fence_ops *ops,
-	     spinlock_t *lock, unsigned context, unsigned seqno)
+	     spinlock_t *lock, u64 context, unsigned seqno)
 {
 	BUG_ON(!lock);
 	BUG_ON(!ops || !ops->wait || !ops->enable_signaling ||
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 3d0cec2..7d25977 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -2045,7 +2045,7 @@ struct amdgpu_device {
 	struct amdgpu_irq_src		hpd_irq;
 
 	/* rings */
-	unsigned			fence_context;
+	u64				fence_context;
 	unsigned			num_rings;
 	struct amdgpu_ring		*rings[AMDGPU_MAX_RINGS];
 	bool				ib_pool_ready;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
index 48618ee..d8af37a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
@@ -428,7 +428,7 @@ void amdgpu_sa_bo_dump_debug_info(struct amdgpu_sa_manager *sa_manager,
 			   soffset, eoffset, eoffset - soffset);
 
 		if (i->fence)
-			seq_printf(m, " protected by 0x%08x on context %d",
+			seq_printf(m, " protected by 0x%08x on context %llu",
 				   i->fence->seqno, i->fence->context);
 
 		seq_printf(m, "\n");
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
index f233ac4..6a69214 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.h
@@ -123,7 +123,7 @@ struct etnaviv_gpu {
 	u32 completed_fence;
 	u32 retired_fence;
 	wait_queue_head_t fence_event;
-	unsigned int fence_context;
+	u64 fence_context;
 	spinlock_t fence_spinlock;
 
 	/* worker for handling active-list retiring: */
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 2e3a62d..64c4ce7 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -57,7 +57,8 @@ struct nouveau_fence_priv {
 	int  (*context_new)(struct nouveau_channel *);
 	void (*context_del)(struct nouveau_channel *);
 
-	u32 contexts, context_base;
+	u32 contexts;
+	u64 context_base;
 	bool uevent;
 };
 
diff --git a/drivers/gpu/drm/qxl/qxl_release.c b/drivers/gpu/drm/qxl/qxl_release.c
index 4efa8e2..f599cd0 100644
--- a/drivers/gpu/drm/qxl/qxl_release.c
+++ b/drivers/gpu/drm/qxl/qxl_release.c
@@ -96,7 +96,7 @@ retry:
 			return 0;
 
 		if (have_drawable_releases && sc > 300) {
-			FENCE_WARN(fence, "failed to wait on release %d "
+			FENCE_WARN(fence, "failed to wait on release %llu "
 					  "after spincount %d\n",
 					  fence->context & ~0xf0000000, sc);
 			goto signaled;
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 80b24a4..5633ee3 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2386,7 +2386,7 @@ struct radeon_device {
 	struct radeon_mman		mman;
 	struct radeon_fence_driver	fence_drv[RADEON_NUM_RINGS];
 	wait_queue_head_t		fence_queue;
-	unsigned			fence_context;
+	u64				fence_context;
 	struct mutex			ring_lock;
 	struct radeon_ring		ring[RADEON_NUM_RINGS];
 	bool				ib_pool_ready;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
index 8e689b4..96308e3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
@@ -46,7 +46,7 @@ struct vmw_fence_manager {
 	bool goal_irq_on; /* Protected by @goal_irq_mutex */
 	bool seqno_valid; /* Protected by @lock, and may not be set to true
 			     without the @goal_irq_mutex held. */
-	unsigned ctx;
+	u64 ctx;
 };
 
 struct vmw_user_fence {
diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
index afa0752..9298677 100644
--- a/drivers/staging/android/sync.h
+++ b/drivers/staging/android/sync.h
@@ -96,7 +96,8 @@ struct sync_timeline {
 
 	/* protected by child_list_lock */
 	bool			destroyed;
-	int			context, value;
+	u64			context;
+	int			value;
 
 	struct list_head	child_list_head;
 	spinlock_t		child_list_lock;
diff --git a/include/linux/fence.h b/include/linux/fence.h
index 5aa95eb..a2c4f53 100644
--- a/include/linux/fence.h
+++ b/include/linux/fence.h
@@ -75,7 +75,8 @@ struct fence {
 	struct rcu_head rcu;
 	struct list_head cb_list;
 	spinlock_t *lock;
-	unsigned context, seqno;
+	u64 context;
+	unsigned seqno;
 	unsigned long flags;
 	ktime_t timestamp;
 	int status;
@@ -176,7 +177,7 @@ struct fence_ops {
 };
 
 void fence_init(struct fence *fence, const struct fence_ops *ops,
-		spinlock_t *lock, unsigned context, unsigned seqno);
+		spinlock_t *lock, u64 context, unsigned seqno);
 
 void fence_release(struct kref *kref);
 void fence_free(struct fence *fence);
@@ -350,27 +351,27 @@ static inline signed long fence_wait(struct fence *fence, bool intr)
 	return ret < 0 ? ret : 0;
 }
 
-unsigned fence_context_alloc(unsigned num);
+u64 fence_context_alloc(unsigned num);
 
 #define FENCE_TRACE(f, fmt, args...) \
 	do {								\
 		struct fence *__ff = (f);				\
 		if (config_enabled(CONFIG_FENCE_TRACE))			\
-			pr_info("f %u#%u: " fmt,			\
+			pr_info("f %llu#%u: " fmt,			\
 				__ff->context, __ff->seqno, ##args);	\
 	} while (0)
 
 #define FENCE_WARN(f, fmt, args...) \
 	do {								\
 		struct fence *__ff = (f);				\
-		pr_warn("f %u#%u: " fmt, __ff->context, __ff->seqno,	\
+		pr_warn("f %llu#%u: " fmt, __ff->context, __ff->seqno,	\
 			 ##args);					\
 	} while (0)
 
 #define FENCE_ERR(f, fmt, args...) \
 	do {								\
 		struct fence *__ff = (f);				\
-		pr_err("f %u#%u: " fmt, __ff->context, __ff->seqno,	\
+		pr_err("f %llu#%u: " fmt, __ff->context, __ff->seqno,	\
 			##args);					\
 	} while (0)
 
-- 
2.5.0


* [PATCH 02/11] dma-buf/fence: add fence_array fences v6
  2016-06-01 13:10 Fence array patchset Christian König
  2016-06-01 13:10 ` [PATCH 01/11] dma-buf/fence: make fence context 64 bit v2 Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 14:01   ` Gustavo Padovan
  2016-06-01 15:25   ` Gustavo Padovan
  2016-06-01 13:10 ` [PATCH 03/11] dma-buf/fence: add signal_on_any to the fence array v2 Christian König
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>

struct fence_collection inherits from struct fence and carries a
collection of fences that need to be waited on together.

It is useful for translating a sync_file into a fence, which removes the
complexity of dealing with sync_files in DRM drivers. Even if the sync_file
contains many fences that need to be waited on before a commit can happen,
they all get added to the fence_collection and passed on for DRM use as
a standard struct fence.

That means no driver needs any changes beyond supporting fences.

A fence_collection's fence doesn't belong to any timeline context, so
fence_is_later() and fence_later() are not meant to be called with
fence_collection fences.
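
To illustrate the aggregation idea, here is a small userspace model (a sketch
only, not the kernel code). All toy_* names are hypothetical, the counter is a
plain unsigned instead of an atomic_t, and there is no locking or reference
counting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct fence. */
struct toy_fence {
	bool signaled;
};

/* The aggregate embeds a fence, so users can treat it like any other. */
struct toy_fence_array {
	struct toy_fence base;
	unsigned num_fences;
	unsigned num_pending;		/* members that have not signaled yet */
	struct toy_fence **fences;
};

static void toy_array_init(struct toy_fence_array *array,
			   struct toy_fence **fences, unsigned num_fences)
{
	array->base.signaled = false;
	array->num_fences = num_fences;
	array->num_pending = num_fences;
	array->fences = fences;
}

/* Signal member idx; the aggregate signals once the last member is done. */
static void toy_array_member_signaled(struct toy_fence_array *array,
				      unsigned idx)
{
	assert(idx < array->num_fences);

	/* A fence signals only once, repeated calls are ignored. */
	if (array->fences[idx]->signaled)
		return;

	array->fences[idx]->signaled = true;
	if (--array->num_pending == 0)
		array->base.signaled = true;
}
```

A caller only ever has to look at base.signaled, no matter how many member
fences sit behind it.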

v2: Comments by Daniel Vetter:
	- merge fence_collection_init() and fence_collection_add()
	- only add callbacks at ->enable_signalling()
	- remove fence_collection_put()
	- check for type on to_fence_collection()
	- adjust fence_is_later() and fence_later() to WARN_ON() if they
	are used with collection fences.

v3: - Initialize fence_cb.node at fence init.

    Comments by Chris Wilson:
	- return "unbound" on fence_collection_get_timeline_name()
	- don't stop adding callbacks if one fails
	- remove redundant !! on fence_collection_enable_signaling()
	- remove redundant () on fence_collection_signaled
	- use fence_default_wait() instead

v4 (chk): Rework, simplification and cleanup:
	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
	- Rename to fence_array.
	- Return fixed driver name.
	- Register only one callback at a time.
	- Document that create function takes ownership of array.

v5 (chk): More work and fixes:
	- Avoid deadlocks by adding all callbacks at once again.
	- Stop trying to remove the callbacks.
	- Provide context and sequence number for the array fence.

v6 (chk): Fixes found during testing
	- Fix stupid typo in _enable_signaling().

Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/Makefile      |   2 +-
 drivers/dma-buf/fence-array.c | 127 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/fence-array.h   |  72 ++++++++++++++++++++++++
 3 files changed, 200 insertions(+), 1 deletion(-)
 create mode 100644 drivers/dma-buf/fence-array.c
 create mode 100644 include/linux/fence-array.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 57a675f..85f6928 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1 +1 @@
-obj-y := dma-buf.o fence.o reservation.o seqno-fence.o
+obj-y := dma-buf.o fence.o reservation.o seqno-fence.o fence-array.o
diff --git a/drivers/dma-buf/fence-array.c b/drivers/dma-buf/fence-array.c
new file mode 100644
index 0000000..8141217
--- /dev/null
+++ b/drivers/dma-buf/fence-array.c
@@ -0,0 +1,127 @@
+/*
+ * fence-array: aggregate fences to be waited together
+ *
+ * Copyright (C) 2016 Collabora Ltd
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ * Authors:
+ *	Gustavo Padovan <gustavo@padovan.org>
+ *	Christian König <christian.koenig@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/fence-array.h>
+
+static void fence_array_cb_func(struct fence *f, struct fence_cb *cb);
+
+static const char *fence_array_get_driver_name(struct fence *fence)
+{
+	return "fence_array";
+}
+
+static const char *fence_array_get_timeline_name(struct fence *fence)
+{
+	return "unbound";
+}
+
+static void fence_array_cb_func(struct fence *f, struct fence_cb *cb)
+{
+	struct fence_array_cb *array_cb =
+		container_of(cb, struct fence_array_cb, cb);
+	struct fence_array *array = array_cb->array;
+
+	if (atomic_dec_and_test(&array->num_pending))
+		fence_signal(&array->base);
+}
+
+static bool fence_array_enable_signaling(struct fence *fence)
+{
+	struct fence_array *array = to_fence_array(fence);
+	struct fence_array_cb *cb = (void *)(&array[1]);
+	unsigned i;
+
+	for (i = 0; i < array->num_fences; ++i) {
+		cb[i].array = array;
+		if (fence_add_callback(array->fences[i], &cb[i].cb,
+				       fence_array_cb_func))
+			if (atomic_dec_and_test(&array->num_pending))
+				return false;
+	}
+
+	return true;
+}
+
+static bool fence_array_signaled(struct fence *fence)
+{
+	struct fence_array *array = to_fence_array(fence);
+
+	return atomic_read(&array->num_pending) == 0;
+}
+
+static void fence_array_release(struct fence *fence)
+{
+	struct fence_array *array = to_fence_array(fence);
+	unsigned i;
+
+	for (i = 0; i < array->num_fences; ++i)
+		fence_put(array->fences[i]);
+
+	kfree(array->fences);
+	fence_free(fence);
+}
+
+const struct fence_ops fence_array_ops = {
+	.get_driver_name = fence_array_get_driver_name,
+	.get_timeline_name = fence_array_get_timeline_name,
+	.enable_signaling = fence_array_enable_signaling,
+	.signaled = fence_array_signaled,
+	.wait = fence_default_wait,
+	.release = fence_array_release,
+};
+
+/**
+ * fence_array_create - Create a custom fence array
+ * @num_fences:	[in]	number of fences to add in the array
+ * @fences:	[in]	array containing the fences
+ * @context:	[in]	fence context to use
+ * @seqno:	[in]	sequence number to use
+ *
+ * Allocate a fence_array object and initialize the base fence with fence_init().
+ * In case of error it returns NULL.
+ *
+ * The caller should allocte the fences array with num_fences size
+ * and fill it with the fences it wants to add to the object. Ownership of this
+ * array is take and fence_put() is used on each fence on release.
+ */
+struct fence_array *fence_array_create(int num_fences, struct fence **fences,
+				       u64 context, unsigned seqno)
+{
+	struct fence_array *array;
+	size_t size = sizeof(*array);
+
+	/* Allocate the callback structures behind the array. */
+	size += num_fences * sizeof(struct fence_array_cb);
+	array = kzalloc(size, GFP_KERNEL);
+	if (!array)
+		return NULL;
+
+	spin_lock_init(&array->lock);
+	fence_init(&array->base, &fence_array_ops, &array->lock,
+		   context, seqno);
+
+	array->num_fences = num_fences;
+	atomic_set(&array->num_pending, num_fences);
+	array->fences = fences;
+
+	return array;
+}
+EXPORT_SYMBOL(fence_array_create);
diff --git a/include/linux/fence-array.h b/include/linux/fence-array.h
new file mode 100644
index 0000000..593ab98
--- /dev/null
+++ b/include/linux/fence-array.h
@@ -0,0 +1,72 @@
+/*
+ * fence-array: aggregates fence to be waited together
+ *
+ * Copyright (C) 2016 Collabora Ltd
+ * Copyright (C) 2016 Advanced Micro Devices, Inc.
+ * Authors:
+ *	Gustavo Padovan <gustavo@padovan.org>
+ *	Christian König <christian.koenig@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#ifndef __LINUX_FENCE_ARRAY_H
+#define __LINUX_FENCE_ARRAY_H
+
+#include <linux/fence.h>
+
+/**
+ * struct fence_array_cb - callback helper for fence array
+ * @cb: fence callback structure for signaling
+ * @array: reference to the parent fence array object
+ */
+struct fence_array_cb {
+	struct fence_cb cb;
+	struct fence_array *array;
+};
+
+/**
+ * struct fence_array - fence to represent an array of fences
+ * @base: fence base class
+ * @lock: spinlock for fence handling
+ * @num_fences: number of fences in the array
+ * @num_pending: fences in the array still pending
+ * @fences: array of the fences
+ */
+struct fence_array {
+	struct fence base;
+
+	spinlock_t lock;
+	unsigned num_fences;
+	atomic_t num_pending;
+	struct fence **fences;
+};
+
+extern const struct fence_ops fence_array_ops;
+
+/**
+ * to_fence_array - cast a fence to a fence_array
+ * @fence: fence to cast to a fence_array
+ *
+ * Returns NULL if the fence is not a fence_array,
+ * or the fence_array otherwise.
+ */
+static inline struct fence_array *to_fence_array(struct fence *fence)
+{
+	if (fence->ops != &fence_array_ops)
+		return NULL;
+
+	return container_of(fence, struct fence_array, base);
+}
+
+struct fence_array *fence_array_create(int num_fences, struct fence **fences,
+				       u64 context, unsigned seqno);
+
+#endif /* __LINUX_FENCE_ARRAY_H */
-- 
2.5.0


* [PATCH 03/11] dma-buf/fence: add signal_on_any to the fence array v2
  2016-06-01 13:10 Fence array patchset Christian König
  2016-06-01 13:10 ` [PATCH 01/11] dma-buf/fence: make fence context 64 bit v2 Christian König
  2016-06-01 13:10 ` [PATCH 02/11] dma-buf/fence: add fence_array fences v6 Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 15:25   ` Gustavo Padovan
  2016-06-01 13:10 ` [PATCH 04/11] drm/amdgpu: document amdgpu_sync_get_fence Christian König
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

If @signal_on_any is true the fence array signals if any fence in the array
signals, otherwise it signals when all fences in the array signal.
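
The counter handling behind this can be modelled in userspace as follows (a
sketch under simplifying assumptions: the any_array_* names are made up, the
counter is a plain int rather than an atomic_t, and reference counting is left
out). With signal_on_any the pending counter starts at 1, so the first member
to finish brings it to zero; later members push it below zero, which is why
the signaled test has to be "<= 0":

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the pending counter in the fence array. */
struct any_array {
	int num_pending;	/* may go negative in signal_on_any mode */
	int signal_count;	/* how often the aggregate was signaled */
};

static void any_array_init(struct any_array *array, int num_fences,
			   bool signal_on_any)
{
	array->num_pending = signal_on_any ? 1 : num_fences;
	array->signal_count = 0;
}

/* One member fence finished; signal only on the transition to zero. */
static void any_array_member_done(struct any_array *array)
{
	if (--array->num_pending == 0)
		array->signal_count++;
}

static bool any_array_signaled(const struct any_array *array)
{
	return array->num_pending <= 0;
}
```

In both modes the aggregate is signaled exactly once, because only the
decrement that reaches zero triggers it.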

v2: fix signaled test and add comment suggested by Chris Wilson.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/fence-array.c | 33 +++++++++++++++++++++++++--------
 include/linux/fence-array.h   |  3 ++-
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/drivers/dma-buf/fence-array.c b/drivers/dma-buf/fence-array.c
index 8141217..a8731c8 100644
--- a/drivers/dma-buf/fence-array.c
+++ b/drivers/dma-buf/fence-array.c
@@ -41,6 +41,7 @@ static void fence_array_cb_func(struct fence *f, struct fence_cb *cb)
 
 	if (atomic_dec_and_test(&array->num_pending))
 		fence_signal(&array->base);
+	fence_put(&array->base);
 }
 
 static bool fence_array_enable_signaling(struct fence *fence)
@@ -51,10 +52,21 @@ static bool fence_array_enable_signaling(struct fence *fence)
 
 	for (i = 0; i < array->num_fences; ++i) {
 		cb[i].array = array;
+		/*
+		 * As we may report that the fence is signaled before all
+		 * callbacks are complete, we need to take an additional
+		 * reference count on the array so that we do not free it too
+		 * early. The core fence handling will only hold the reference
+		 * until we signal the array as complete (but that is now
+		 * insufficient).
+		 */
+		fence_get(&array->base);
 		if (fence_add_callback(array->fences[i], &cb[i].cb,
-				       fence_array_cb_func))
+				       fence_array_cb_func)) {
+			fence_put(&array->base);
 			if (atomic_dec_and_test(&array->num_pending))
 				return false;
+		}
 	}
 
 	return true;
@@ -64,7 +76,7 @@ static bool fence_array_signaled(struct fence *fence)
 {
 	struct fence_array *array = to_fence_array(fence);
 
-	return atomic_read(&array->num_pending) == 0;
+	return atomic_read(&array->num_pending) <= 0;
 }
 
 static void fence_array_release(struct fence *fence)
@@ -90,10 +102,11 @@ const struct fence_ops fence_array_ops = {
 
 /**
  * fence_array_create - Create a custom fence array
- * @num_fences:	[in]	number of fences to add in the array
- * @fences:	[in]	array containing the fences
- * @context:	[in]	fence context to use
- * @seqno:	[in]	sequence number to use
+ * @num_fences:		[in]	number of fences to add in the array
+ * @fences:		[in]	array containing the fences
+ * @context:		[in]	fence context to use
+ * @seqno:		[in]	sequence number to use
+ * @signal_on_any	[in]	signal on any fence in the array
  *
  * Allocate a fence_array object and initialize the base fence with fence_init().
  * In case of error it returns NULL.
@@ -101,9 +114,13 @@ const struct fence_ops fence_array_ops = {
  * The caller should allocte the fences array with num_fences size
  * and fill it with the fences it wants to add to the object. Ownership of this
  * array is take and fence_put() is used on each fence on release.
+ *
+ * If @signal_on_any is true the fence array signals if any fence in the array
+ * signals, otherwise it signals when all fences in the array signal.
  */
 struct fence_array *fence_array_create(int num_fences, struct fence **fences,
-				       u64 context, unsigned seqno)
+				       u64 context, unsigned seqno,
+				       bool signal_on_any)
 {
 	struct fence_array *array;
 	size_t size = sizeof(*array);
@@ -119,7 +136,7 @@ struct fence_array *fence_array_create(int num_fences, struct fence **fences,
 		   context, seqno);
 
 	array->num_fences = num_fences;
-	atomic_set(&array->num_pending, num_fences);
+	atomic_set(&array->num_pending, signal_on_any ? 1 : num_fences);
 	array->fences = fences;
 
 	return array;
diff --git a/include/linux/fence-array.h b/include/linux/fence-array.h
index 593ab98..86baaa4 100644
--- a/include/linux/fence-array.h
+++ b/include/linux/fence-array.h
@@ -67,6 +67,7 @@ static inline struct fence_array *to_fence_array(struct fence *fence)
 }
 
 struct fence_array *fence_array_create(int num_fences, struct fence **fences,
-				       u64 context, unsigned seqno);
+				       u64 context, unsigned seqno,
+				       bool signal_on_any);
 
 #endif /* __LINUX_FENCE_ARRAY_H */
-- 
2.5.0


* [PATCH 04/11] drm/amdgpu: document amdgpu_sync_get_fence
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (2 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 03/11] dma-buf/fence: add signal_on_any to the fence array v2 Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 13:10 ` [PATCH 05/11] drm/amdgpu: generalize the scheduler fence Christian König
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

It's not obvious what it should do.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 34a9280..e0ff1a1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -297,6 +297,13 @@ int amdgpu_sync_cycle_fences(struct amdgpu_sync *dst, struct amdgpu_sync *src,
 	return 0;
 }
 
+/**
+ * amdgpu_sync_get_fence - get the next fence from the sync object
+ *
+ * @sync: sync object to use
+ *
+ * Gets and removes the next fence from the sync object that is not yet signaled.
+ */
 struct fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync)
 {
 	struct amdgpu_sync_entry *e;
-- 
2.5.0


* [PATCH 05/11] drm/amdgpu: generalize the scheduler fence
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (3 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 04/11] drm/amdgpu: document amdgpu_sync_get_fence Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 13:10 ` [PATCH 06/11] drm/amdgpu: remove amdgpu_sync_wait Christian König
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

Split it into two events: one for when the job is scheduled and one for when
it is finished.
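
With two fences embedded in one object, getting from a fence pointer back to
the containing object depends on which member the pointer refers to. The
following userspace sketch shows the lookup by ops pointer; the toy_* names
and the toy_container_of() macro are hypothetical stand-ins for the kernel
types and container_of():

```c
#include <assert.h>
#include <stddef.h>

struct toy_ops {
	const char *name;
};

/* Hypothetical minimal fence: only the ops pointer matters here. */
struct toy_fence {
	const struct toy_ops *ops;
};

static const struct toy_ops toy_ops_scheduled = { "scheduled" };
static const struct toy_ops toy_ops_finished = { "finished" };

/* One object, two events. */
struct toy_sched_fence {
	struct toy_fence scheduled;
	struct toy_fence finished;
};

/* Userspace equivalent of the kernel's container_of(). */
#define toy_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The ops pointer tells us which member f is, and thus which offset applies. */
static struct toy_sched_fence *to_toy_sched_fence(struct toy_fence *f)
{
	if (f->ops == &toy_ops_scheduled)
		return toy_container_of(f, struct toy_sched_fence, scheduled);

	if (f->ops == &toy_ops_finished)
		return toy_container_of(f, struct toy_sched_fence, finished);

	return NULL;	/* not one of our fences */
}
```

Both members resolve to the same parent object, and any fence with foreign
ops yields NULL.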

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c         |  4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h       |  4 +-
 drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h |  4 +-
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.c   | 34 +++++++------
 drivers/gpu/drm/amd/scheduler/gpu_scheduler.h   | 19 ++++----
 drivers/gpu/drm/amd/scheduler/sched_fence.c     | 63 ++++++++++++++++++-------
 6 files changed, 79 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d4791a7..ddfed93 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -85,7 +85,7 @@ static void amdgpu_job_free_resources(struct amdgpu_job *job)
 	unsigned i;
 
 	/* use sched fence if available */
-	f = job->base.s_fence ? &job->base.s_fence->base : job->fence;
+	f = job->base.s_fence ? &job->base.s_fence->finished : job->fence;
 
 	for (i = 0; i < job->num_ibs; ++i)
 		amdgpu_ib_free(job->adev, &job->ibs[i], f);
@@ -143,7 +143,7 @@ static struct fence *amdgpu_job_dependency(struct amd_sched_job *sched_job)
 		int r;
 
 		r = amdgpu_vm_grab_id(vm, ring, &job->sync,
-				      &job->base.s_fence->base,
+				      &job->base.s_fence->finished,
 				      &job->vm_id, &job->vm_pd_addr);
 		if (r)
 			DRM_ERROR("Error getting VM ID (%d)\n", r);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 38e5689..ecd08f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -102,7 +102,7 @@ TRACE_EVENT(amdgpu_cs_ioctl,
 			   __entry->adev = job->adev;
 			   __entry->sched_job = &job->base;
 			   __entry->ib = job->ibs;
-			   __entry->fence = &job->base.s_fence->base;
+			   __entry->fence = &job->base.s_fence->finished;
 			   __entry->ring_name = job->ring->name;
 			   __entry->num_ibs = job->num_ibs;
 			   ),
@@ -127,7 +127,7 @@ TRACE_EVENT(amdgpu_sched_run_job,
 			   __entry->adev = job->adev;
 			   __entry->sched_job = &job->base;
 			   __entry->ib = job->ibs;
-			   __entry->fence = &job->base.s_fence->base;
+			   __entry->fence = &job->base.s_fence->finished;
 			   __entry->ring_name = job->ring->name;
 			   __entry->num_ibs = job->num_ibs;
 			   ),
diff --git a/drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h b/drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h
index c89dc77..b961a1c 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h
+++ b/drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h
@@ -26,7 +26,7 @@ TRACE_EVENT(amd_sched_job,
 	    TP_fast_assign(
 			   __entry->entity = sched_job->s_entity;
 			   __entry->sched_job = sched_job;
-			   __entry->fence = &sched_job->s_fence->base;
+			   __entry->fence = &sched_job->s_fence->finished;
 			   __entry->name = sched_job->sched->name;
 			   __entry->job_count = kfifo_len(
 				   &sched_job->s_entity->job_queue) / sizeof(sched_job);
@@ -46,7 +46,7 @@ TRACE_EVENT(amd_sched_process_job,
 		    ),
 
 	    TP_fast_assign(
-		    __entry->fence = &fence->base;
+		    __entry->fence = &fence->finished;
 		    ),
 	    TP_printk("fence=%p signaled", __entry->fence)
 );
diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
index 2425172..74aa0b3 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
@@ -140,7 +140,7 @@ int amd_sched_entity_init(struct amd_gpu_scheduler *sched,
 		return r;
 
 	atomic_set(&entity->fence_seq, 0);
-	entity->fence_context = fence_context_alloc(1);
+	entity->fence_context = fence_context_alloc(2);
 
 	return 0;
 }
@@ -251,17 +251,21 @@ static bool amd_sched_entity_add_dependency_cb(struct amd_sched_entity *entity)
 
 	s_fence = to_amd_sched_fence(fence);
 	if (s_fence && s_fence->sched == sched) {
-		/* Fence is from the same scheduler */
-		if (test_bit(AMD_SCHED_FENCE_SCHEDULED_BIT, &fence->flags)) {
-			/* Ignore it when it is already scheduled */
-			fence_put(entity->dependency);
-			return false;
-		}
 
-		/* Wait for fence to be scheduled */
-		entity->cb.func = amd_sched_entity_clear_dep;
-		list_add_tail(&entity->cb.node, &s_fence->scheduled_cb);
-		return true;
+		/*
+		 * Fence is from the same scheduler, only need to wait for
+		 * it to be scheduled
+		 */
+		fence = fence_get(&s_fence->scheduled);
+		fence_put(entity->dependency);
+		entity->dependency = fence;
+		if (!fence_add_callback(fence, &entity->cb,
+					amd_sched_entity_clear_dep))
+			return true;
+
+		/* Ignore it when it is already scheduled */
+		fence_put(fence);
+		return false;
 	}
 
 	if (!fence_add_callback(entity->dependency, &entity->cb,
@@ -389,7 +393,7 @@ void amd_sched_entity_push_job(struct amd_sched_job *sched_job)
 	struct amd_sched_entity *entity = sched_job->s_entity;
 
 	trace_amd_sched_job(sched_job);
-	fence_add_callback(&sched_job->s_fence->base, &sched_job->finish_cb,
+	fence_add_callback(&sched_job->s_fence->finished, &sched_job->finish_cb,
 			   amd_sched_job_finish_cb);
 	wait_event(entity->sched->job_scheduled,
 		   amd_sched_entity_in(sched_job));
@@ -412,7 +416,7 @@ int amd_sched_job_init(struct amd_sched_job *job,
 	INIT_DELAYED_WORK(&job->work_tdr, amd_sched_job_timedout);
 
 	if (fence)
-		*fence = &job->s_fence->base;
+		*fence = &job->s_fence->finished;
 	return 0;
 }
 
@@ -463,10 +467,10 @@ static void amd_sched_process_job(struct fence *f, struct fence_cb *cb)
 	struct amd_gpu_scheduler *sched = s_fence->sched;
 
 	atomic_dec(&sched->hw_rq_count);
-	amd_sched_fence_signal(s_fence);
+	amd_sched_fence_finished(s_fence);
 
 	trace_amd_sched_process_job(s_fence);
-	fence_put(&s_fence->base);
+	fence_put(&s_fence->finished);
 	wake_up_interruptible(&sched->wake_up_worker);
 }
 
diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
index e63034e..3e989b1 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.h
@@ -27,8 +27,6 @@
 #include <linux/kfifo.h>
 #include <linux/fence.h>
 
-#define AMD_SCHED_FENCE_SCHEDULED_BIT	FENCE_FLAG_USER_BITS
-
 struct amd_gpu_scheduler;
 struct amd_sched_rq;
 
@@ -68,9 +66,9 @@ struct amd_sched_rq {
 };
 
 struct amd_sched_fence {
-	struct fence                    base;
+	struct fence                    scheduled;
+	struct fence                    finished;
 	struct fence_cb                 cb;
-	struct list_head		scheduled_cb;
 	struct amd_gpu_scheduler	*sched;
 	spinlock_t			lock;
 	void                            *owner;
@@ -86,14 +84,15 @@ struct amd_sched_job {
 	struct delayed_work		work_tdr;
 };
 
-extern const struct fence_ops amd_sched_fence_ops;
+extern const struct fence_ops amd_sched_fence_ops_scheduled;
+extern const struct fence_ops amd_sched_fence_ops_finished;
 static inline struct amd_sched_fence *to_amd_sched_fence(struct fence *f)
 {
-	struct amd_sched_fence *__f = container_of(f, struct amd_sched_fence,
-						   base);
+	if (f->ops == &amd_sched_fence_ops_scheduled)
+		return container_of(f, struct amd_sched_fence, scheduled);
 
-	if (__f->base.ops == &amd_sched_fence_ops)
-		return __f;
+	if (f->ops == &amd_sched_fence_ops_finished)
+		return container_of(f, struct amd_sched_fence, finished);
 
 	return NULL;
 }
@@ -148,7 +147,7 @@ void amd_sched_entity_push_job(struct amd_sched_job *sched_job);
 struct amd_sched_fence *amd_sched_fence_create(
 	struct amd_sched_entity *s_entity, void *owner);
 void amd_sched_fence_scheduled(struct amd_sched_fence *fence);
-void amd_sched_fence_signal(struct amd_sched_fence *fence);
+void amd_sched_fence_finished(struct amd_sched_fence *fence);
 int amd_sched_job_init(struct amd_sched_job *job,
 		       struct amd_gpu_scheduler *sched,
 		       struct amd_sched_entity *entity,
diff --git a/drivers/gpu/drm/amd/scheduler/sched_fence.c b/drivers/gpu/drm/amd/scheduler/sched_fence.c
index 71931bc..a5e3fef 100644
--- a/drivers/gpu/drm/amd/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/amd/scheduler/sched_fence.c
@@ -37,36 +37,37 @@ struct amd_sched_fence *amd_sched_fence_create(struct amd_sched_entity *entity,
 	if (fence == NULL)
 		return NULL;
 
-	INIT_LIST_HEAD(&fence->scheduled_cb);
 	fence->owner = owner;
 	fence->sched = entity->sched;
 	spin_lock_init(&fence->lock);
 
 	seq = atomic_inc_return(&entity->fence_seq);
-	fence_init(&fence->base, &amd_sched_fence_ops, &fence->lock,
-		   entity->fence_context, seq);
+	fence_init(&fence->scheduled, &amd_sched_fence_ops_scheduled,
+		   &fence->lock, entity->fence_context, seq);
+	fence_init(&fence->finished, &amd_sched_fence_ops_finished,
+		   &fence->lock, entity->fence_context + 1, seq);
 
 	return fence;
 }
 
-void amd_sched_fence_signal(struct amd_sched_fence *fence)
+void amd_sched_fence_scheduled(struct amd_sched_fence *fence)
 {
-	int ret = fence_signal(&fence->base);
+	int ret = fence_signal(&fence->scheduled);
+
 	if (!ret)
-		FENCE_TRACE(&fence->base, "signaled from irq context\n");
+		FENCE_TRACE(&fence->scheduled, "signaled from irq context\n");
 	else
-		FENCE_TRACE(&fence->base, "was already signaled\n");
+		FENCE_TRACE(&fence->scheduled, "was already signaled\n");
 }
 
-void amd_sched_fence_scheduled(struct amd_sched_fence *s_fence)
+void amd_sched_fence_finished(struct amd_sched_fence *fence)
 {
-	struct fence_cb *cur, *tmp;
+	int ret = fence_signal(&fence->finished);
 
-	set_bit(AMD_SCHED_FENCE_SCHEDULED_BIT, &s_fence->base.flags);
-	list_for_each_entry_safe(cur, tmp, &s_fence->scheduled_cb, node) {
-		list_del_init(&cur->node);
-		cur->func(&s_fence->base, cur);
-	}
+	if (!ret)
+		FENCE_TRACE(&fence->finished, "signaled from irq context\n");
+	else
+		FENCE_TRACE(&fence->finished, "was already signaled\n");
 }
 
 static const char *amd_sched_fence_get_driver_name(struct fence *fence)
@@ -96,6 +97,7 @@ static void amd_sched_fence_free(struct rcu_head *rcu)
 {
 	struct fence *f = container_of(rcu, struct fence, rcu);
 	struct amd_sched_fence *fence = to_amd_sched_fence(f);
+
 	kmem_cache_free(sched_fence_slab, fence);
 }
 
@@ -107,16 +109,41 @@ static void amd_sched_fence_free(struct rcu_head *rcu)
  * This function is called when the reference count becomes zero.
  * It just RCU schedules freeing up the fence.
  */
-static void amd_sched_fence_release(struct fence *f)
+static void amd_sched_fence_release_scheduled(struct fence *f)
+{
+	struct amd_sched_fence *fence = to_amd_sched_fence(f);
+
+	call_rcu(&fence->finished.rcu, amd_sched_fence_free);
+}
+
+/**
+ * amd_sched_fence_release_finished - drop extra reference
+ *
+ * @f: fence
+ *
+ * Drop the extra reference from the scheduled fence to the base fence.
+ */
+static void amd_sched_fence_release_finished(struct fence *f)
 {
-	call_rcu(&f->rcu, amd_sched_fence_free);
+	struct amd_sched_fence *fence = to_amd_sched_fence(f);
+
+	fence_put(&fence->scheduled);
 }
 
-const struct fence_ops amd_sched_fence_ops = {
+const struct fence_ops amd_sched_fence_ops_scheduled = {
+	.get_driver_name = amd_sched_fence_get_driver_name,
+	.get_timeline_name = amd_sched_fence_get_timeline_name,
+	.enable_signaling = amd_sched_fence_enable_signaling,
+	.signaled = NULL,
+	.wait = fence_default_wait,
+	.release = amd_sched_fence_release_scheduled,
+};
+
+const struct fence_ops amd_sched_fence_ops_finished = {
 	.get_driver_name = amd_sched_fence_get_driver_name,
 	.get_timeline_name = amd_sched_fence_get_timeline_name,
 	.enable_signaling = amd_sched_fence_enable_signaling,
 	.signaled = NULL,
 	.wait = fence_default_wait,
-	.release = amd_sched_fence_release,
+	.release = amd_sched_fence_release_finished,
 };
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread
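
Side note on patch 05/11 above (not part of the patch): the new
to_amd_sched_fence() maps either embedded fence back to its parent by
checking which ops table the fence points to. Below is a rough userspace
sketch of that pattern. The types, the ops tables and sched_fence_init()
are simplified stand-ins made up for illustration, not the kernel
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption, not the real ones). */
struct fence_ops { const char *name; };
struct fence { const struct fence_ops *ops; };

/* Userspace equivalent of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sched_fence {
	struct fence scheduled;	/* signals when the job is picked up */
	struct fence finished;	/* signals when the job has completed */
	int id;
};

static const struct fence_ops ops_scheduled = { "scheduled" };
static const struct fence_ops ops_finished = { "finished" };
static const struct fence_ops ops_foreign = { "foreign" };

/* Map either embedded fence back to its parent, NULL for foreign fences. */
static struct sched_fence *to_sched_fence(struct fence *f)
{
	if (f->ops == &ops_scheduled)
		return container_of(f, struct sched_fence, scheduled);

	if (f->ops == &ops_finished)
		return container_of(f, struct sched_fence, finished);

	return NULL;
}

/* Hypothetical helper: wire both embedded fences to their ops tables. */
static void sched_fence_init(struct sched_fence *sf, int id)
{
	assert(sf != NULL);
	sf->scheduled.ops = &ops_scheduled;
	sf->finished.ops = &ops_finished;
	sf->id = id;
}
```

The ops pointer has to be checked before container_of() is applied,
because the offset to subtract depends on which member the fence is.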

* [PATCH 06/11] drm/amdgpu: remove amdgpu_sync_wait
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (4 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 05/11] drm/amdgpu: generalize the scheduler fence Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 13:10 ` [PATCH 07/11] drm/amdgpu: add optional ring to amdgpu_sync_is_idle Christian König
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

Stop hiding bugs. Instead, trigger a BUG_ON() when the scheduler
hasn't handled all dependencies.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h      |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  6 +-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 19 -------------------
 3 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 7d25977..2f4ab1b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -601,7 +601,6 @@ bool amdgpu_sync_is_idle(struct amdgpu_sync *sync);
 int amdgpu_sync_cycle_fences(struct amdgpu_sync *dst, struct amdgpu_sync *src,
 			     struct fence *fence);
 struct fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync);
-int amdgpu_sync_wait(struct amdgpu_sync *sync);
 void amdgpu_sync_free(struct amdgpu_sync *sync);
 int amdgpu_sync_init(void);
 void amdgpu_sync_fini(void);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index ddfed93..009e905 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -166,11 +166,7 @@ static struct fence *amdgpu_job_run(struct amd_sched_job *sched_job)
 	}
 	job = to_amdgpu_job(sched_job);
 
-	r = amdgpu_sync_wait(&job->sync);
-	if (r) {
-		DRM_ERROR("failed to sync wait (%d)\n", r);
-		return NULL;
-	}
+	BUG_ON(!amdgpu_sync_is_idle(&job->sync));
 
 	trace_amdgpu_sched_run_job(job);
 	r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index e0ff1a1..c0ed5b9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -326,25 +326,6 @@ struct fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync)
 	return NULL;
 }
 
-int amdgpu_sync_wait(struct amdgpu_sync *sync)
-{
-	struct amdgpu_sync_entry *e;
-	struct hlist_node *tmp;
-	int i, r;
-
-	hash_for_each_safe(sync->fences, i, tmp, e, node) {
-		r = fence_wait(e->fence, false);
-		if (r)
-			return r;
-
-		hash_del(&e->node);
-		fence_put(e->fence);
-		kmem_cache_free(amdgpu_sync_slab, e);
-	}
-
-	return 0;
-}
-
 /**
  * amdgpu_sync_free - free the sync object
  *
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 07/11] drm/amdgpu: add optional ring to amdgpu_sync_is_idle
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (5 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 06/11] drm/amdgpu: remove amdgpu_sync_wait Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 13:10 ` [PATCH 08/11] drm/amdgpu: prefer VMIDs idle on the current ring Christian König
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

Check if the sync object is idle depending on the ring a submission works with.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
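Not part of the commit: a rough userspace sketch of the check this patch
introduces, in case it helps review. A dependency from the same ring
counts as idle once it is scheduled, everything else has to be finished.
struct ring, struct dep and deps_idle() are hypothetical simplifications;
the real code walks the sync object's hash table and tests fences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ring { int idx; };

struct dep {
	const struct ring *ring;	/* ring the job was queued on */
	bool scheduled;			/* job was handed to the hardware ring */
	bool finished;			/* job completed */
};

/* Returns true if nothing in deps[] still blocks a submission to @ring. */
static bool deps_idle(const struct dep *deps, size_t n, const struct ring *ring)
{
	size_t i;

	assert(deps != NULL || n == 0);

	for (i = 0; i < n; ++i) {
		const struct dep *d = &deps[i];

		/* Same ring: being scheduled is sufficient. */
		if (ring && d->ring == ring && d->scheduled)
			continue;

		if (!d->finished)
			return false;
	}

	return true;
}
```

Passing NULL as ring keeps the old behaviour, which is what the two
existing callers in this patch do.
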
 drivers/gpu/drm/amd/amdgpu/amdgpu.h      |  3 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 17 +++++++++++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c   |  4 ++--
 4 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 2f4ab1b..f154d9f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -597,7 +597,8 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
 		     struct amdgpu_sync *sync,
 		     struct reservation_object *resv,
 		     void *owner);
-bool amdgpu_sync_is_idle(struct amdgpu_sync *sync);
+bool amdgpu_sync_is_idle(struct amdgpu_sync *sync,
+			 struct amdgpu_ring *ring);
 int amdgpu_sync_cycle_fences(struct amdgpu_sync *dst, struct amdgpu_sync *src,
 			     struct fence *fence);
 struct fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 009e905..e395bbe 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -166,7 +166,7 @@ static struct fence *amdgpu_job_run(struct amd_sched_job *sched_job)
 	}
 	job = to_amdgpu_job(sched_job);
 
-	BUG_ON(!amdgpu_sync_is_idle(&job->sync));
+	BUG_ON(!amdgpu_sync_is_idle(&job->sync, NULL));
 
 	trace_amdgpu_sched_run_job(job);
 	r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index c0ed5b9..a2766d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -226,10 +226,13 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
  * amdgpu_sync_is_idle - test if all fences are signaled
  *
  * @sync: the sync object
+ * @ring: optional ring to use for test
  *
- * Returns true if all fences in the sync object are signaled.
+ * Returns true if all fences in the sync object are signaled or scheduled to
+ * the ring (if provided).
  */
-bool amdgpu_sync_is_idle(struct amdgpu_sync *sync)
+bool amdgpu_sync_is_idle(struct amdgpu_sync *sync,
+			 struct amdgpu_ring *ring)
 {
 	struct amdgpu_sync_entry *e;
 	struct hlist_node *tmp;
@@ -237,6 +240,16 @@ bool amdgpu_sync_is_idle(struct amdgpu_sync *sync)
 
 	hash_for_each_safe(sync->fences, i, tmp, e, node) {
 		struct fence *f = e->fence;
+		struct amd_sched_fence *s_fence = to_amd_sched_fence(f);
+
+		if (ring && s_fence) {
+			/* For fences from the same ring it is sufficient
+			 * when they are scheduled.
+			 */
+			if (s_fence->sched == &ring->sched &&
+			    fence_is_signaled(&s_fence->scheduled))
+				continue;
+		}
 
 		if (fence_is_signaled(f)) {
 			hash_del(&e->node);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 62a4c12..2bcb623 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -240,13 +240,13 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 			      struct amdgpu_vm_id,
 			      list);
 
-	if (!amdgpu_sync_is_idle(&id->active)) {
+	if (!amdgpu_sync_is_idle(&id->active, NULL)) {
 		struct list_head *head = &adev->vm_manager.ids_lru;
 		struct amdgpu_vm_id *tmp;
 
 		list_for_each_entry_safe(id, tmp, &adev->vm_manager.ids_lru,
 					 list) {
-			if (amdgpu_sync_is_idle(&id->active)) {
+			if (amdgpu_sync_is_idle(&id->active, NULL)) {
 				list_move(&id->list, head);
 				head = &id->list;
 			}
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 08/11] drm/amdgpu: prefer VMIDs idle on the current ring
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (6 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 07/11] drm/amdgpu: add optional ring to amdgpu_sync_is_idle Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 13:10 ` [PATCH 09/11] drm/amdgpu: reuse VMIDs assigned to a VM only if there is also a free one Christian König
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

Prefer VMIDs which are idle on the ring we want to submit to. This
also stops bubbling idle VMIDs up the LRU, which is not actually
beneficial.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
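Not part of the commit: a small sketch of the new selection order, assuming
an array in LRU order (index 0 is the oldest entry) instead of the driver's
linked list. MAX_RINGS, struct vmid and pick_vmid() are made up for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RINGS 4	/* hypothetical; the driver uses AMDGPU_MAX_RINGS */

struct vmid {
	bool idle_on[MAX_RINGS];	/* idle from the point of view of each ring */
};

/* lru[0] is the oldest entry. Returns the index of the VMID to use. */
static size_t pick_vmid(const struct vmid *lru, size_t n, unsigned ring)
{
	size_t i;

	assert(n > 0 && ring < MAX_RINGS);

	/* Take the first VMID that is idle on the ring we submit to. */
	for (i = 0; i < n; ++i)
		if (lru[i].idle_on[ring])
			return i;

	/* None idle: fall back to the oldest one and wait for it. */
	return 0;
}
```
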
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 2bcb623..b6484a2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -236,21 +236,15 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 
 	} while (i != ring->idx);
 
-	id = list_first_entry(&adev->vm_manager.ids_lru,
-			      struct amdgpu_vm_id,
-			      list);
-
-	if (!amdgpu_sync_is_idle(&id->active, NULL)) {
-		struct list_head *head = &adev->vm_manager.ids_lru;
-		struct amdgpu_vm_id *tmp;
-
-		list_for_each_entry_safe(id, tmp, &adev->vm_manager.ids_lru,
-					 list) {
-			if (amdgpu_sync_is_idle(&id->active, NULL)) {
-				list_move(&id->list, head);
-				head = &id->list;
-			}
-		}
+	/* Check if we have an idle VMID */
+	list_for_each_entry(id, &adev->vm_manager.ids_lru, list) {
+		if (amdgpu_sync_is_idle(&id->active, ring))
+			break;
+
+	}
+
+	/* If we can't find a idle VMID to use, just wait for the oldest */
+	if (&id->list == &adev->vm_manager.ids_lru) {
 		id = list_first_entry(&adev->vm_manager.ids_lru,
 				      struct amdgpu_vm_id,
 				      list);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 09/11] drm/amdgpu: reuse VMIDs assigned to a VM only if there is also a free one
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (7 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 08/11] drm/amdgpu: prefer VMIDs idle on the current ring Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 13:10 ` [PATCH 10/11] drm/amdgpu: use a fence array for VMID management Christian König
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

This fixes a fairness problem with the GPU scheduler. A VM with a lot of
jobs could previously starve VMs with fewer jobs.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
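Not part of the commit: the reordered decision, reduced to a sketch. The
idle[] array (LRU order), the own index and VMID_WAIT are assumptions for
illustration only; "own" stands for a VMID already assigned to this VM that
passed all the prerequisite checks.

```c
#include <assert.h>
#include <stdbool.h>

#define VMID_WAIT (-1)	/* hypothetical marker: wait for the oldest VMID */

/*
 * idle[] is in LRU order. own is the index of a still usable VMID of this
 * VM, or -1 if there is none.
 */
static int grab_vmid(const bool *idle, int n, int own)
{
	int i, first_idle = -1;

	assert(own < n);

	for (i = 0; i < n; ++i) {
		if (idle[i]) {
			first_idle = i;
			break;
		}
	}

	/* Nothing free: even a VM that owns a VMID has to wait its turn. */
	if (first_idle < 0)
		return VMID_WAIT;

	/* Something is free, so reusing our own VMID starves nobody. */
	if (own >= 0)
		return own;

	return first_idle;
}
```

Before this patch the "own" check came first, so a busy VM could keep
reusing its VMID while others waited.
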
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 113 +++++++++++++++++----------------
 1 file changed, 59 insertions(+), 54 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index b6484a2..f206820 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -179,75 +179,80 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 	uint64_t pd_addr = amdgpu_bo_gpu_offset(vm->page_directory);
 	struct amdgpu_device *adev = ring->adev;
 	struct fence *updates = sync->last_vm_update;
-	struct amdgpu_vm_id *id;
+	struct amdgpu_vm_id *id, *idle;
 	unsigned i = ring->idx;
 	int r;
 
 	mutex_lock(&adev->vm_manager.lock);
 
-	/* Check if we can use a VMID already assigned to this VM */
-	do {
-		struct fence *flushed;
-
-		id = vm->ids[i++];
-		if (i == AMDGPU_MAX_RINGS)
-			i = 0;
-
-		/* Check all the prerequisites to using this VMID */
-		if (!id)
-			continue;
-
-		if (atomic64_read(&id->owner) != vm->client_id)
-			continue;
-
-		if (pd_addr != id->pd_gpu_addr)
-			continue;
+	/* Check if we have an idle VMID */
+	list_for_each_entry(idle, &adev->vm_manager.ids_lru, list) {
+		if (amdgpu_sync_is_idle(&idle->active, ring))
+			break;
 
-		if (id->last_user != ring &&
-		    (!id->last_flush || !fence_is_signaled(id->last_flush)))
-			continue;
+	}
 
-		flushed  = id->flushed_updates;
-		if (updates && (!flushed || fence_is_later(updates, flushed)))
-			continue;
+	/* If we can't find a idle VMID to use, just wait for the oldest */
+	if (&idle->list == &adev->vm_manager.ids_lru) {
+		id = list_first_entry(&adev->vm_manager.ids_lru,
+				      struct amdgpu_vm_id,
+				      list);
+	} else {
+		/* Check if we can use a VMID already assigned to this VM */
+		do {
+			struct fence *flushed;
+
+			id = vm->ids[i++];
+			if (i == AMDGPU_MAX_RINGS)
+				i = 0;
+
+			/* Check all the prerequisites to using this VMID */
+			if (!id)
+				continue;
+
+			if (atomic64_read(&id->owner) != vm->client_id)
+				continue;
+
+			if (pd_addr != id->pd_gpu_addr)
+				continue;
+
+			if (id->last_user != ring && (!id->last_flush ||
+			    !fence_is_signaled(id->last_flush)))
+				continue;
+
+			flushed  = id->flushed_updates;
+			if (updates && (!flushed ||
+			    fence_is_later(updates, flushed)))
+				continue;
+
+			/* Good we can use this VMID */
+			if (id->last_user == ring) {
+				r = amdgpu_sync_fence(ring->adev, sync,
+						      id->first);
+				if (r)
+					goto error;
+			}
 
-		/* Good we can use this VMID */
-		if (id->last_user == ring) {
-			r = amdgpu_sync_fence(ring->adev, sync,
-					      id->first);
+			/* And remember this submission as user of the VMID */
+			r = amdgpu_sync_fence(ring->adev, &id->active, fence);
 			if (r)
 				goto error;
-		}
-
-		/* And remember this submission as user of the VMID */
-		r = amdgpu_sync_fence(ring->adev, &id->active, fence);
-		if (r)
-			goto error;
 
-		list_move_tail(&id->list, &adev->vm_manager.ids_lru);
-		vm->ids[ring->idx] = id;
+			list_move_tail(&id->list, &adev->vm_manager.ids_lru);
+			vm->ids[ring->idx] = id;
 
-		*vm_id = id - adev->vm_manager.ids;
-		*vm_pd_addr = AMDGPU_VM_NO_FLUSH;
-		trace_amdgpu_vm_grab_id(vm, ring->idx, *vm_id, *vm_pd_addr);
+			*vm_id = id - adev->vm_manager.ids;
+			*vm_pd_addr = AMDGPU_VM_NO_FLUSH;
+			trace_amdgpu_vm_grab_id(vm, ring->idx, *vm_id,
+						*vm_pd_addr);
 
-		mutex_unlock(&adev->vm_manager.lock);
-		return 0;
+			mutex_unlock(&adev->vm_manager.lock);
+			return 0;
 
-	} while (i != ring->idx);
+		} while (i != ring->idx);
 
-	/* Check if we have an idle VMID */
-	list_for_each_entry(id, &adev->vm_manager.ids_lru, list) {
-		if (amdgpu_sync_is_idle(&id->active, ring))
-			break;
-
-	}
-
-	/* If we can't find a idle VMID to use, just wait for the oldest */
-	if (&id->list == &adev->vm_manager.ids_lru) {
-		id = list_first_entry(&adev->vm_manager.ids_lru,
-				      struct amdgpu_vm_id,
-				      list);
+		/* Still no ID to use? Then use the idle one found earlier */
+		id = idle;
 	}
 
 	r = amdgpu_sync_cycle_fences(sync, &id->active, fence);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 10/11] drm/amdgpu: use a fence array for VMID management
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (8 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 09/11] drm/amdgpu: reuse VMIDs assigned to a VM only if there is also a free one Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-01 13:10 ` [PATCH 11/11] drm/amdgpu: remove now unnecessary checks Christian König
  2016-06-01 13:42 ` Fence array patchset Alex Deucher
  11 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

Just wait for any VMID to become available, instead of waiting for the
oldest entry of the LRU.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
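Not part of the commit: a userspace sketch of the idea. Collect one
blocking fence per VMID in LRU order; if every VMID has one, wait on a
composite that signals as soon as any member signals. The patch builds that
composite with fence_array_create(), passing true as the last argument; the
types and helpers below are simplified stand-ins and do not use the kernel
API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fence { bool signaled; };

/*
 * blocker[i] is the fence still blocking VMID i, or NULL if that VMID is
 * idle. Copies blockers into out[] until an idle VMID is found. Returns the
 * number of fences copied and reports through found_idle whether the scan
 * stopped at an idle VMID.
 */
static size_t collect_blockers(struct fence *const *blocker, size_t n,
			       struct fence **out, bool *found_idle)
{
	size_t i;

	assert(out != NULL && found_idle != NULL);

	*found_idle = false;
	for (i = 0; i < n; ++i) {
		if (!blocker[i]) {
			*found_idle = true;
			break;
		}
		out[i] = blocker[i];
	}

	return i;
}

/* "Signal on any" semantics: done as soon as one member has signaled. */
static bool any_signaled(struct fence *const *fences, size_t n)
{
	size_t i;

	for (i = 0; i < n; ++i)
		if (fences[i]->signaled)
			return true;

	return false;
}
```

When found_idle is false the caller adds the composite as a dependency and
returns; the grab is retried once the composite has signaled.
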
 drivers/gpu/drm/amd/amdgpu/amdgpu.h      |  10 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c |  69 +++-----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c   | 155 +++++++++++++++++++------------
 4 files changed, 117 insertions(+), 119 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index f154d9f..52326d3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -597,10 +597,8 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
 		     struct amdgpu_sync *sync,
 		     struct reservation_object *resv,
 		     void *owner);
-bool amdgpu_sync_is_idle(struct amdgpu_sync *sync,
-			 struct amdgpu_ring *ring);
-int amdgpu_sync_cycle_fences(struct amdgpu_sync *dst, struct amdgpu_sync *src,
-			     struct fence *fence);
+struct fence *amdgpu_sync_peek_fence(struct amdgpu_sync *sync,
+				     struct amdgpu_ring *ring);
 struct fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync);
 void amdgpu_sync_free(struct amdgpu_sync *sync);
 int amdgpu_sync_init(void);
@@ -910,6 +908,10 @@ struct amdgpu_vm_manager {
 	struct list_head			ids_lru;
 	struct amdgpu_vm_id			ids[AMDGPU_NUM_VM];
 
+	/* Handling of VM fences */
+	u64					fence_context;
+	unsigned				seqno[AMDGPU_MAX_RINGS];
+
 	uint32_t				max_pfn;
 	/* vram base address for page table entry  */
 	u64					vram_base_offset;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index e395bbe..b50a845 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -166,7 +166,7 @@ static struct fence *amdgpu_job_run(struct amd_sched_job *sched_job)
 	}
 	job = to_amdgpu_job(sched_job);
 
-	BUG_ON(!amdgpu_sync_is_idle(&job->sync, NULL));
+	BUG_ON(amdgpu_sync_peek_fence(&job->sync, NULL));
 
 	trace_amdgpu_sched_run_job(job);
 	r = amdgpu_ib_schedule(job->ring, job->num_ibs, job->ibs,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index a2766d7..5c8d302 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -223,16 +223,16 @@ int amdgpu_sync_resv(struct amdgpu_device *adev,
 }
 
 /**
- * amdgpu_sync_is_idle - test if all fences are signaled
+ * amdgpu_sync_peek_fence - get the next fence not signaled yet
  *
  * @sync: the sync object
  * @ring: optional ring to use for test
  *
- * Returns true if all fences in the sync object are signaled or scheduled to
- * the ring (if provided).
+ * Returns the next fence not signaled yet without removing it from the sync
+ * object.
  */
-bool amdgpu_sync_is_idle(struct amdgpu_sync *sync,
-			 struct amdgpu_ring *ring)
+struct fence *amdgpu_sync_peek_fence(struct amdgpu_sync *sync,
+				     struct amdgpu_ring *ring)
 {
 	struct amdgpu_sync_entry *e;
 	struct hlist_node *tmp;
@@ -246,68 +246,25 @@ bool amdgpu_sync_is_idle(struct amdgpu_sync *sync,
 			/* For fences from the same ring it is sufficient
 			 * when they are scheduled.
 			 */
-			if (s_fence->sched == &ring->sched &&
-			    fence_is_signaled(&s_fence->scheduled))
-				continue;
-		}
+			if (s_fence->sched == &ring->sched) {
+				if (fence_is_signaled(&s_fence->scheduled))
+					continue;
 
-		if (fence_is_signaled(f)) {
-			hash_del(&e->node);
-			fence_put(f);
-			kmem_cache_free(amdgpu_sync_slab, e);
-			continue;
+				return &s_fence->scheduled;
+			}
 		}
 
-		return false;
-	}
-
-	return true;
-}
-
-/**
- * amdgpu_sync_cycle_fences - move fences from one sync object into another
- *
- * @dst: the destination sync object
- * @src: the source sync object
- * @fence: fence to add to source
- *
- * Remove all fences from source and put them into destination and add
- * fence as new one into source.
- */
-int amdgpu_sync_cycle_fences(struct amdgpu_sync *dst, struct amdgpu_sync *src,
-			     struct fence *fence)
-{
-	struct amdgpu_sync_entry *e, *newone;
-	struct hlist_node *tmp;
-	int i;
-
-	/* Allocate the new entry before moving the old ones */
-	newone = kmem_cache_alloc(amdgpu_sync_slab, GFP_KERNEL);
-	if (!newone)
-		return -ENOMEM;
-
-	hash_for_each_safe(src->fences, i, tmp, e, node) {
-		struct fence *f = e->fence;
-
-		hash_del(&e->node);
 		if (fence_is_signaled(f)) {
+			hash_del(&e->node);
 			fence_put(f);
 			kmem_cache_free(amdgpu_sync_slab, e);
 			continue;
 		}
 
-		if (amdgpu_sync_add_later(dst, f)) {
-			kmem_cache_free(amdgpu_sync_slab, e);
-			continue;
-		}
-
-		hash_add(dst->fences, &e->node, f->context);
+		return f;
 	}
 
-	hash_add(src->fences, &newone->node, fence->context);
-	newone->fence = fence_get(fence);
-
-	return 0;
+	return NULL;
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index f206820..8ea1c73 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -25,6 +25,7 @@
  *          Alex Deucher
  *          Jerome Glisse
  */
+#include <linux/fence-array.h>
 #include <drm/drmP.h>
 #include <drm/amdgpu_drm.h>
 #include "amdgpu.h"
@@ -180,82 +181,116 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 	struct amdgpu_device *adev = ring->adev;
 	struct fence *updates = sync->last_vm_update;
 	struct amdgpu_vm_id *id, *idle;
-	unsigned i = ring->idx;
-	int r;
+	struct fence **fences;
+	unsigned i;
+	int r = 0;
+
+	fences = kmalloc_array(adev->vm_manager.num_ids, sizeof(void *),
+			       GFP_KERNEL);
+	if (!fences)
+		return -ENOMEM;
 
 	mutex_lock(&adev->vm_manager.lock);
 
 	/* Check if we have an idle VMID */
+	i = 0;
 	list_for_each_entry(idle, &adev->vm_manager.ids_lru, list) {
-		if (amdgpu_sync_is_idle(&idle->active, ring))
+		fences[i] = amdgpu_sync_peek_fence(&idle->active, ring);
+		if (!fences[i])
 			break;
-
+		++i;
 	}
 
-	/* If we can't find a idle VMID to use, just wait for the oldest */
+	/* If we can't find an idle VMID to use, wait till one becomes available */
 	if (&idle->list == &adev->vm_manager.ids_lru) {
-		id = list_first_entry(&adev->vm_manager.ids_lru,
-				      struct amdgpu_vm_id,
-				      list);
-	} else {
-		/* Check if we can use a VMID already assigned to this VM */
-		do {
-			struct fence *flushed;
-
-			id = vm->ids[i++];
-			if (i == AMDGPU_MAX_RINGS)
-				i = 0;
-
-			/* Check all the prerequisites to using this VMID */
-			if (!id)
-				continue;
-
-			if (atomic64_read(&id->owner) != vm->client_id)
-				continue;
-
-			if (pd_addr != id->pd_gpu_addr)
-				continue;
-
-			if (id->last_user != ring && (!id->last_flush ||
-			    !fence_is_signaled(id->last_flush)))
-				continue;
-
-			flushed  = id->flushed_updates;
-			if (updates && (!flushed ||
-			    fence_is_later(updates, flushed)))
-				continue;
-
-			/* Good we can use this VMID */
-			if (id->last_user == ring) {
-				r = amdgpu_sync_fence(ring->adev, sync,
-						      id->first);
-				if (r)
-					goto error;
-			}
+		u64 fence_context = adev->vm_manager.fence_context + ring->idx;
+		unsigned seqno = ++adev->vm_manager.seqno[ring->idx];
+		struct fence_array *array;
+		unsigned j;
+
+		for (j = 0; j < i; ++j)
+			fence_get(fences[j]);
+
+		array = fence_array_create(i, fences, fence_context,
+					   seqno, true);
+		if (!array) {
+			for (j = 0; j < i; ++j)
+				fence_put(fences[j]);
+			kfree(fences);
+			r = -ENOMEM;
+			goto error;
+		}
+
+
+		r = amdgpu_sync_fence(ring->adev, sync, &array->base);
+		fence_put(&array->base);
+		if (r)
+			goto error;
+
+		mutex_unlock(&adev->vm_manager.lock);
+		return 0;
+
+	}
+	kfree(fences);
+
+	/* Check if we can use a VMID already assigned to this VM */
+	i = ring->idx;
+	do {
+		struct fence *flushed;
+
+		id = vm->ids[i++];
+		if (i == AMDGPU_MAX_RINGS)
+			i = 0;
 
-			/* And remember this submission as user of the VMID */
-			r = amdgpu_sync_fence(ring->adev, &id->active, fence);
+		/* Check all the prerequisites to using this VMID */
+		if (!id)
+			continue;
+
+		if (atomic64_read(&id->owner) != vm->client_id)
+			continue;
+
+		if (pd_addr != id->pd_gpu_addr)
+			continue;
+
+		if (id->last_user != ring &&
+		    (!id->last_flush || !fence_is_signaled(id->last_flush)))
+			continue;
+
+		flushed  = id->flushed_updates;
+		if (updates &&
+		    (!flushed || fence_is_later(updates, flushed)))
+			continue;
+
+		/* Good we can use this VMID */
+		if (id->last_user == ring) {
+			r = amdgpu_sync_fence(ring->adev, sync,
+					      id->first);
 			if (r)
 				goto error;
+		}
+
+		/* And remember this submission as user of the VMID */
+		r = amdgpu_sync_fence(ring->adev, &id->active, fence);
+		if (r)
+			goto error;
 
-			list_move_tail(&id->list, &adev->vm_manager.ids_lru);
-			vm->ids[ring->idx] = id;
+		list_move_tail(&id->list, &adev->vm_manager.ids_lru);
+		vm->ids[ring->idx] = id;
 
-			*vm_id = id - adev->vm_manager.ids;
-			*vm_pd_addr = AMDGPU_VM_NO_FLUSH;
-			trace_amdgpu_vm_grab_id(vm, ring->idx, *vm_id,
-						*vm_pd_addr);
+		*vm_id = id - adev->vm_manager.ids;
+		*vm_pd_addr = AMDGPU_VM_NO_FLUSH;
+		trace_amdgpu_vm_grab_id(vm, ring->idx, *vm_id, *vm_pd_addr);
 
-			mutex_unlock(&adev->vm_manager.lock);
-			return 0;
+		mutex_unlock(&adev->vm_manager.lock);
+		return 0;
 
-		} while (i != ring->idx);
+	} while (i != ring->idx);
 
-		/* Still no ID to use? Then use the idle one found earlier */
-		id = idle;
-	}
+	/* Still no ID to use? Then use the idle one found earlier */
+	id = idle;
 
-	r = amdgpu_sync_cycle_fences(sync, &id->active, fence);
+	/* Remember this submission as user of the VMID */
+	r = amdgpu_sync_fence(ring->adev, &id->active, fence);
 	if (r)
 		goto error;
 
@@ -1515,6 +1550,10 @@ void amdgpu_vm_manager_init(struct amdgpu_device *adev)
 			      &adev->vm_manager.ids_lru);
 	}
 
+	adev->vm_manager.fence_context = fence_context_alloc(AMDGPU_MAX_RINGS);
+	for (i = 0; i < AMDGPU_MAX_RINGS; ++i)
+		adev->vm_manager.seqno[i] = 0;
+
 	atomic_set(&adev->vm_manager.vm_pte_next_ring, 0);
 	atomic64_set(&adev->vm_manager.client_counter, 0);
 }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 11/11] drm/amdgpu: remove now unnecessary checks
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (9 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 10/11] drm/amdgpu: use a fence array for VMID management Christian König
@ 2016-06-01 13:10 ` Christian König
  2016-06-06 21:14   ` Alex Deucher
  2016-06-01 13:42 ` Fence array patchset Alex Deucher
  11 siblings, 1 reply; 21+ messages in thread
From: Christian König @ 2016-06-01 13:10 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel, daniel, chris, gustavo

From: Christian König <christian.koenig@amd.com>

vm_flush() now comes directly after vm_grab_id().

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 -
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 31 +++++++++++--------------------
 2 files changed, 11 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 52326d3..e054542 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -886,7 +886,6 @@ struct amdgpu_vm_id {
 	struct fence		*first;
 	struct amdgpu_sync	active;
 	struct fence		*last_flush;
-	struct amdgpu_ring      *last_user;
 	atomic64_t		owner;
 
 	uint64_t		pd_gpu_addr;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 8ea1c73..48d5ad18 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -237,6 +237,7 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 	i = ring->idx;
 	do {
 		struct fence *flushed;
+		bool same_ring = ring->idx == i;
 
 		id = vm->ids[i++];
 		if (i == AMDGPU_MAX_RINGS)
@@ -252,7 +253,7 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 		if (pd_addr != id->pd_gpu_addr)
 			continue;
 
-		if (id->last_user != ring &&
+		if (!same_ring &&
 		    (!id->last_flush || !fence_is_signaled(id->last_flush)))
 			continue;
 
@@ -261,15 +262,9 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 		    (!flushed || fence_is_later(updates, flushed)))
 			continue;
 
-		/* Good we can use this VMID */
-		if (id->last_user == ring) {
-			r = amdgpu_sync_fence(ring->adev, sync,
-					      id->first);
-			if (r)
-				goto error;
-		}
-
-		/* And remember this submission as user of the VMID */
+		/* Good we can use this VMID. Remember this submission as
+		 * user of the VMID.
+		 */
 		r = amdgpu_sync_fence(ring->adev, &id->active, fence);
 		if (r)
 			goto error;
@@ -306,7 +301,6 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 	id->pd_gpu_addr = pd_addr;
 
 	list_move_tail(&id->list, &adev->vm_manager.ids_lru);
-	id->last_user = ring;
 	atomic64_set(&id->owner, vm->client_id);
 	vm->ids[ring->idx] = id;
 
@@ -357,16 +351,13 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring,
 		trace_amdgpu_vm_flush(pd_addr, ring->idx, vm_id);
 		amdgpu_ring_emit_vm_flush(ring, vm_id, pd_addr);
 
+		r = amdgpu_fence_emit(ring, &fence);
+		if (r)
+			return r;
+
 		mutex_lock(&adev->vm_manager.lock);
-		if ((id->pd_gpu_addr == pd_addr) && (id->last_user == ring)) {
-			r = amdgpu_fence_emit(ring, &fence);
-			if (r) {
-				mutex_unlock(&adev->vm_manager.lock);
-				return r;
-			}
-			fence_put(id->last_flush);
-			id->last_flush = fence;
-		}
+		fence_put(id->last_flush);
+		id->last_flush = fence;
 		mutex_unlock(&adev->vm_manager.lock);
 	}
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: Fence array patchset
  2016-06-01 13:10 Fence array patchset Christian König
                   ` (10 preceding siblings ...)
  2016-06-01 13:10 ` [PATCH 11/11] drm/amdgpu: remove now unnecessary checks Christian König
@ 2016-06-01 13:42 ` Alex Deucher
  11 siblings, 0 replies; 21+ messages in thread
From: Alex Deucher @ 2016-06-01 13:42 UTC (permalink / raw)
  To: Christian König; +Cc: Mailing list - DRI developers, LKML

On Wed, Jun 1, 2016 at 9:10 AM, Christian König <deathsimple@vodafone.de> wrote:
> Hi guys,
>
> this is the next iteration of the fence array patch set.
>
> Daniel suggested that I provide an example on how this functionality might be
> used by a driver. So I added a few additional patches in this series to show
> what I want to do with this in the amdgpu driver.
>
> The main idea is that for each VMID we have a set of hardware fences which are
> currently using this VMID. Now when a new command submission needs a VMID we
> construct a fence array which should signal when any of the VMIDs becomes
> available and gives that back to our the scheduler.

For those that are not familiar, a VMID = Virtual Memory ID.  AMD GPUs
have multiple virtual address space contexts which can be in flight on
the GPU at any given time.  The VMID is used to select which VM
context you want to use for a specific GPU operation.

Alex
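
To make the VMID idea a bit more concrete, here is a minimal user-space
sketch of a pool of VMIDs where a submission either grabs an idle one or
learns that it has to wait until any of them becomes available. This is
not the amdgpu code: MODEL_NUM_VMIDS, struct model_vmid and
model_grab_vmid() are hypothetical names made up for illustration, and the
real driver tracks fences per VMID rather than a plain counter.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NUM_VMIDS 4	/* hypothetical pool size, illustration only */

/* Hypothetical model of one VMID slot: submissions still using it. */
struct model_vmid {
	unsigned int active_jobs;
};

/*
 * Return the index of an idle VMID and mark it as used, or -1 if every
 * VMID is still in use and the caller would have to wait until any one
 * of them becomes available.
 */
static int model_grab_vmid(struct model_vmid *ids, size_t count)
{
	size_t i;

	assert(ids && count > 0);
	for (i = 0; i < count; i++) {
		if (ids[i].active_jobs == 0) {
			ids[i].active_jobs++;
			return (int)i;
		}
	}
	return -1;
}
```

The -1 case is where the series would hand a "signal on any" fence array
back to the scheduler instead of blocking.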

>
> This effort and my testing also found a rather stupid typo in the code and I
> also tried to incorporate the comments from Chris and Daniel as well.
>
> I think it's ready to land now, but as usual feel free to take it apart.
>
> Cheers,
> Christian.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 02/11] dma-buf/fence: add fence_array fences v6
  2016-06-01 13:10 ` [PATCH 02/11] dma-buf/fence: add fence_array fences v6 Christian König
@ 2016-06-01 14:01   ` Gustavo Padovan
  2016-06-01 15:25   ` Gustavo Padovan
  1 sibling, 0 replies; 21+ messages in thread
From: Gustavo Padovan @ 2016-06-01 14:01 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel, linux-kernel, daniel, chris

Hi Christian,

2016-06-01 Christian König <deathsimple@vodafone.de>:

> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> 
> struct fence_collection inherits from struct fence and carries a
> collection of fences that needs to be waited together.
> 
> It is useful to translate a sync_file to a fence to remove the complexity
> of dealing with sync_files on DRM drivers. So even if there are many
> fences in the sync_file that needs to waited for a commit to happen,
> they all get added to the fence_collection and passed for DRM use as
> a standard struct fence.
> 
> That means that no changes needed to any driver besides supporting fences.
> 
> fence_collection's fence doesn't belong to any timeline context, so
> fence_is_later() and fence_later() are not meant to be called with
> fence_collections fences.

The commit message needs to be fixed to mention fence_array instead of
fence_collection, and we do create fence contexts for fence_arrays now.

	Gustavo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 01/11] dma-buf/fence: make fence context 64 bit v2
  2016-06-01 13:10 ` [PATCH 01/11] dma-buf/fence: make fence context 64 bit v2 Christian König
@ 2016-06-01 15:23   ` Gustavo Padovan
  0 siblings, 0 replies; 21+ messages in thread
From: Gustavo Padovan @ 2016-06-01 15:23 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel, linux-kernel, daniel, chris

2016-06-01 Christian König <deathsimple@vodafone.de>:

> From: Christian König <christian.koenig@amd.com>
> 
> Fence contexts are created on the fly (for example) by the GPU scheduler used
> in the amdgpu driver as a result of an userspace request. Because of this
> userspace could in theory force a wrap around of the 32bit context number
> if it doesn't behave well.
> 
> Avoid this by increasing the context number to 64bits. This way even when
> userspace manages to allocate a billion contexts per second it takes more
> than 500 years for the context number to wrap around.
> 
> v2: fix printf formats as well.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/fence.c                 |  8 ++++----
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h     |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c  |  2 +-
>  drivers/gpu/drm/etnaviv/etnaviv_gpu.h   |  2 +-
>  drivers/gpu/drm/nouveau/nouveau_fence.h |  3 ++-
>  drivers/gpu/drm/qxl/qxl_release.c       |  2 +-
>  drivers/gpu/drm/radeon/radeon.h         |  2 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_fence.c   |  2 +-
>  drivers/staging/android/sync.h          |  3 ++-
>  include/linux/fence.h                   | 13 +++++++------
>  10 files changed, 21 insertions(+), 18 deletions(-)

Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>

	Gustavo
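
The wrap-around estimate in the quoted commit message can be checked with
a few lines of arithmetic. The sketch below is only a back-of-the-envelope
helper, not kernel code; seconds_to_wrap() and years_to_wrap() are
hypothetical names, and a year is assumed to be 365.25 days.

```c
#include <assert.h>

#define SECONDS_PER_YEAR (365.25 * 24.0 * 60.0 * 60.0)

/* Seconds until a counter of `bits` width wraps at `rate` allocations/s. */
static double seconds_to_wrap(unsigned int bits, double rate)
{
	double values = 1.0;
	unsigned int i;

	assert(bits <= 64 && rate > 0.0);
	for (i = 0; i < bits; i++)
		values *= 2.0;	/* powers of two are exact in a double */
	return values / rate;
}

/* Same quantity expressed in (365.25-day) years. */
static double years_to_wrap(unsigned int bits, double rate)
{
	return seconds_to_wrap(bits, rate) / SECONDS_PER_YEAR;
}
```

At a billion allocations per second a 32-bit counter wraps after a bit more
than four seconds, while a 64-bit counter lasts well over the 500 years
mentioned in the commit message.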

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 02/11] dma-buf/fence: add fence_array fences v6
  2016-06-01 13:10 ` [PATCH 02/11] dma-buf/fence: add fence_array fences v6 Christian König
  2016-06-01 14:01   ` Gustavo Padovan
@ 2016-06-01 15:25   ` Gustavo Padovan
  2016-06-01 16:24     ` Sumit Semwal
  1 sibling, 1 reply; 21+ messages in thread
From: Gustavo Padovan @ 2016-06-01 15:25 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel, linux-kernel, daniel, chris

2016-06-01 Christian König <deathsimple@vodafone.de>:

> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> 
> struct fence_collection inherits from struct fence and carries a
> collection of fences that needs to be waited together.
> 
> It is useful to translate a sync_file to a fence to remove the complexity
> of dealing with sync_files on DRM drivers. So even if there are many
> fences in the sync_file that needs to waited for a commit to happen,
> they all get added to the fence_collection and passed for DRM use as
> a standard struct fence.
> 
> That means that no changes needed to any driver besides supporting fences.
> 
> fence_collection's fence doesn't belong to any timeline context, so
> fence_is_later() and fence_later() are not meant to be called with
> fence_collections fences.
> 
> v2: Comments by Daniel Vetter:
> 	- merge fence_collection_init() and fence_collection_add()
> 	- only add callbacks at ->enable_signalling()
> 	- remove fence_collection_put()
> 	- check for type on to_fence_collection()
> 	- adjust fence_is_later() and fence_later() to WARN_ON() if they
> 	are used with collection fences.
> 
> v3: - Initialize fence_cb.node at fence init.
> 
>     Comments by Chris Wilson:
> 	- return "unbound" on fence_collection_get_timeline_name()
> 	- don't stop adding callbacks if one fails
> 	- remove redundant !! on fence_collection_enable_signaling()
> 	- remove redundant () on fence_collection_signaled
> 	- use fence_default_wait() instead
> 
> v4 (chk): Rework, simplification and cleanup:
> 	- Drop FENCE_NO_CONTEXT handling, always allocate a context.
> 	- Rename to fence_array.
> 	- Return fixed driver name.
> 	- Register only one callback at a time.
> 	- Document that create function takes ownership of array.
> 
> v5 (chk): More work and fixes:
> 	- Avoid deadlocks by adding all callbacks at once again.
> 	- Stop trying to remove the callbacks.
> 	- Provide context and sequence number for the array fence.
> 
> v6 (chk): Fixes found during testing
> 	- Fix stupid typo in _enable_signaling().
> 
> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/Makefile      |   2 +-
>  drivers/dma-buf/fence-array.c | 127 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fence-array.h   |  72 ++++++++++++++++++++++++
>  3 files changed, 200 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/dma-buf/fence-array.c
>  create mode 100644 include/linux/fence-array.h

This is working fine for me. Once the commit message is fixed:

Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>

You need to Cc Sumit here to decide who is picking these patches.
It would be better to get them through the drm trees so we would need
his Ack at least.

	Gustavo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 03/11] dma-buf/fence: add signal_on_any to the fence array v2
  2016-06-01 13:10 ` [PATCH 03/11] dma-buf/fence: add signal_on_any to the fence array v2 Christian König
@ 2016-06-01 15:25   ` Gustavo Padovan
  0 siblings, 0 replies; 21+ messages in thread
From: Gustavo Padovan @ 2016-06-01 15:25 UTC (permalink / raw)
  To: Christian König; +Cc: dri-devel, linux-kernel, daniel, chris

2016-06-01 Christian König <deathsimple@vodafone.de>:

> From: Christian König <christian.koenig@amd.com>
> 
> If @signal_on_any is true the fence array signals if any fence in the array
> signals, otherwise it signals when all fences in the array signal.
> 
> v2: fix signaled test and add comment suggested by Chris Wilson.
> 
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/fence-array.c | 33 +++++++++++++++++++++++++--------
>  include/linux/fence-array.h   |  3 ++-
>  2 files changed, 27 insertions(+), 9 deletions(-)

Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>

	Gustavo
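
For readers skimming the thread, the @signal_on_any semantics described in
the quoted commit message can be modelled with a single pending counter.
This is a user-space sketch under that assumption, not the kernel's
struct fence_array; struct model_fence_array, model_array_init() and
model_member_signaled() are hypothetical names, and locking and callbacks
are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's struct fence_array. */
struct model_fence_array {
	unsigned int num_pending;	/* member signals still needed */
	bool signaled;
};

static void model_array_init(struct model_fence_array *a,
			     unsigned int num_fences, bool signal_on_any)
{
	assert(a && num_fences > 0);
	/* "any" needs one member to signal, "all" needs every member. */
	a->num_pending = signal_on_any ? 1 : num_fences;
	a->signaled = false;
}

/* Call once for every member fence that signals. */
static void model_member_signaled(struct model_fence_array *a)
{
	assert(a);
	if (a->signaled)
		return;	/* later member signals are ignored */
	if (--a->num_pending == 0)
		a->signaled = true;
}
```

With three members, the "any" variant signals after the first member and
the "all" variant only after the third.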

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 02/11] dma-buf/fence: add fence_array fences v6
  2016-06-01 15:25   ` Gustavo Padovan
@ 2016-06-01 16:24     ` Sumit Semwal
  2016-06-01 22:44       ` Daniel Vetter
  0 siblings, 1 reply; 21+ messages in thread
From: Sumit Semwal @ 2016-06-01 16:24 UTC (permalink / raw)
  To: Gustavo Padovan, Christian König, DRI mailing list, LKML,
	Daniel Vetter, Chris Wilson

Hi Christian, Gustavo,

Thanks for these patches.

On 1 June 2016 at 20:55, Gustavo Padovan <gustavo@padovan.org> wrote:
> 2016-06-01 Christian König <deathsimple@vodafone.de>:
>
>> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>>
>> struct fence_collection inherits from struct fence and carries a
>> collection of fences that needs to be waited together.
>>
>> It is useful to translate a sync_file to a fence to remove the complexity
>> of dealing with sync_files on DRM drivers. So even if there are many
>> fences in the sync_file that needs to waited for a commit to happen,
>> they all get added to the fence_collection and passed for DRM use as
>> a standard struct fence.
>>
>> That means that no changes needed to any driver besides supporting fences.
>>
>> fence_collection's fence doesn't belong to any timeline context, so
>> fence_is_later() and fence_later() are not meant to be called with
>> fence_collections fences.
>>
>> v2: Comments by Daniel Vetter:
>>       - merge fence_collection_init() and fence_collection_add()
>>       - only add callbacks at ->enable_signalling()
>>       - remove fence_collection_put()
>>       - check for type on to_fence_collection()
>>       - adjust fence_is_later() and fence_later() to WARN_ON() if they
>>       are used with collection fences.
>>
>> v3: - Initialize fence_cb.node at fence init.
>>
>>     Comments by Chris Wilson:
>>       - return "unbound" on fence_collection_get_timeline_name()
>>       - don't stop adding callbacks if one fails
>>       - remove redundant !! on fence_collection_enable_signaling()
>>       - remove redundant () on fence_collection_signaled
>>       - use fence_default_wait() instead
>>
>> v4 (chk): Rework, simplification and cleanup:
>>       - Drop FENCE_NO_CONTEXT handling, always allocate a context.
>>       - Rename to fence_array.
>>       - Return fixed driver name.
>>       - Register only one callback at a time.
>>       - Document that create function takes ownership of array.
>>
>> v5 (chk): More work and fixes:
>>       - Avoid deadlocks by adding all callbacks at once again.
>>       - Stop trying to remove the callbacks.
>>       - Provide context and sequence number for the array fence.
>>
>> v6 (chk): Fixes found during testing
>>       - Fix stupid typo in _enable_signaling().
>>
>> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>  drivers/dma-buf/Makefile      |   2 +-
>>  drivers/dma-buf/fence-array.c | 127 ++++++++++++++++++++++++++++++++++++++++++
>>  include/linux/fence-array.h   |  72 ++++++++++++++++++++++++
>>  3 files changed, 200 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/dma-buf/fence-array.c
>>  create mode 100644 include/linux/fence-array.h
>
> This is working fine to me. Once the commit message is fixed:
>
> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>
> You need to Cc Sumit here to decide who is picking these patches.
> It would be better to get them through the drm trees so we would need
> his Ack at least.
>
+1 on the commit message consistency change; post that, please feel
free to add my
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>

for the 3 dma-buf/fences patches and request to take them via DRM tree
as rightly suggested by Gustavo.

>         Gustavo

BR,
Sumit.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 02/11] dma-buf/fence: add fence_array fences v6
  2016-06-01 16:24     ` Sumit Semwal
@ 2016-06-01 22:44       ` Daniel Vetter
  2016-06-02  7:20         ` Christian König
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Vetter @ 2016-06-01 22:44 UTC (permalink / raw)
  To: Sumit Semwal
  Cc: Gustavo Padovan, Christian König, DRI mailing list, LKML,
	Daniel Vetter, Chris Wilson

On Wed, Jun 01, 2016 at 09:54:04PM +0530, Sumit Semwal wrote:
> Hi Christian, Gustavo,
> 
> Thanks for these patches.
> 
> On 1 June 2016 at 20:55, Gustavo Padovan <gustavo@padovan.org> wrote:
> > 2016-06-01 Christian König <deathsimple@vodafone.de>:
> >
> >> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> >>
> >> struct fence_collection inherits from struct fence and carries a
> >> collection of fences that needs to be waited together.
> >>
> >> It is useful to translate a sync_file to a fence to remove the complexity
> >> of dealing with sync_files on DRM drivers. So even if there are many
> >> fences in the sync_file that needs to waited for a commit to happen,
> >> they all get added to the fence_collection and passed for DRM use as
> >> a standard struct fence.
> >>
> >> That means that no changes needed to any driver besides supporting fences.
> >>
> >> fence_collection's fence doesn't belong to any timeline context, so
> >> fence_is_later() and fence_later() are not meant to be called with
> >> fence_collections fences.
> >>
> >> v2: Comments by Daniel Vetter:
> >>       - merge fence_collection_init() and fence_collection_add()
> >>       - only add callbacks at ->enable_signalling()
> >>       - remove fence_collection_put()
> >>       - check for type on to_fence_collection()
> >>       - adjust fence_is_later() and fence_later() to WARN_ON() if they
> >>       are used with collection fences.
> >>
> >> v3: - Initialize fence_cb.node at fence init.
> >>
> >>     Comments by Chris Wilson:
> >>       - return "unbound" on fence_collection_get_timeline_name()
> >>       - don't stop adding callbacks if one fails
> >>       - remove redundant !! on fence_collection_enable_signaling()
> >>       - remove redundant () on fence_collection_signaled
> >>       - use fence_default_wait() instead
> >>
> >> v4 (chk): Rework, simplification and cleanup:
> >>       - Drop FENCE_NO_CONTEXT handling, always allocate a context.
> >>       - Rename to fence_array.
> >>       - Return fixed driver name.
> >>       - Register only one callback at a time.
> >>       - Document that create function takes ownership of array.
> >>
> >> v5 (chk): More work and fixes:
> >>       - Avoid deadlocks by adding all callbacks at once again.
> >>       - Stop trying to remove the callbacks.
> >>       - Provide context and sequence number for the array fence.
> >>
> >> v6 (chk): Fixes found during testing
> >>       - Fix stupid typo in _enable_signaling().
> >>
> >> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> >> Signed-off-by: Christian König <christian.koenig@amd.com>
> >> ---
> >>  drivers/dma-buf/Makefile      |   2 +-
> >>  drivers/dma-buf/fence-array.c | 127 ++++++++++++++++++++++++++++++++++++++++++
> >>  include/linux/fence-array.h   |  72 ++++++++++++++++++++++++
> >>  3 files changed, 200 insertions(+), 1 deletion(-)
> >>  create mode 100644 drivers/dma-buf/fence-array.c
> >>  create mode 100644 include/linux/fence-array.h
> >
> > This is working fine to me. Once the commit message is fixed:
> >
> > Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
> >
> > You need to Cc Sumit here to decide who is picking these patches.
> > It would be better to get them through the drm trees so we would need
> > his Ack at least.
> >
> +1 on the commit message consistency change; post that, please feel
> free to add my
> Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
> 
> for the 3 dma-buf/fences patches and request to take them via DRM tree
> as rightly suggested by Gustavo.

Commit message improved and all 3 queued up for drm-misc. I haven't pushed
out the new branch yet though, since I'm waiting for Dave to open drm-next
and pull in a few other bits first so that I can rebase.

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 02/11] dma-buf/fence: add fence_array fences v6
  2016-06-01 22:44       ` Daniel Vetter
@ 2016-06-02  7:20         ` Christian König
  0 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2016-06-02  7:20 UTC (permalink / raw)
  To: Sumit Semwal, Gustavo Padovan, DRI mailing list, LKML, Chris Wilson

Am 02.06.2016 um 00:44 schrieb Daniel Vetter:
> On Wed, Jun 01, 2016 at 09:54:04PM +0530, Sumit Semwal wrote:
>> Hi Christian, Gustavo,
>>
>> Thanks for these patches.
>>
>> On 1 June 2016 at 20:55, Gustavo Padovan <gustavo@padovan.org> wrote:
>>> 2016-06-01 Christian König <deathsimple@vodafone.de>:
>>>
>>>> From: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>>>>
>>>> struct fence_collection inherits from struct fence and carries a
>>>> collection of fences that needs to be waited together.
>>>>
>>>> It is useful to translate a sync_file to a fence to remove the complexity
>>>> of dealing with sync_files on DRM drivers. So even if there are many
>>>> fences in the sync_file that needs to waited for a commit to happen,
>>>> they all get added to the fence_collection and passed for DRM use as
>>>> a standard struct fence.
>>>>
>>>> That means that no changes needed to any driver besides supporting fences.
>>>>
>>>> fence_collection's fence doesn't belong to any timeline context, so
>>>> fence_is_later() and fence_later() are not meant to be called with
>>>> fence_collections fences.
>>>>
>>>> v2: Comments by Daniel Vetter:
>>>>        - merge fence_collection_init() and fence_collection_add()
>>>>        - only add callbacks at ->enable_signalling()
>>>>        - remove fence_collection_put()
>>>>        - check for type on to_fence_collection()
>>>>        - adjust fence_is_later() and fence_later() to WARN_ON() if they
>>>>        are used with collection fences.
>>>>
>>>> v3: - Initialize fence_cb.node at fence init.
>>>>
>>>>      Comments by Chris Wilson:
>>>>        - return "unbound" on fence_collection_get_timeline_name()
>>>>        - don't stop adding callbacks if one fails
>>>>        - remove redundant !! on fence_collection_enable_signaling()
>>>>        - remove redundant () on fence_collection_signaled
>>>>        - use fence_default_wait() instead
>>>>
>>>> v4 (chk): Rework, simplification and cleanup:
>>>>        - Drop FENCE_NO_CONTEXT handling, always allocate a context.
>>>>        - Rename to fence_array.
>>>>        - Return fixed driver name.
>>>>        - Register only one callback at a time.
>>>>        - Document that create function takes ownership of array.
>>>>
>>>> v5 (chk): More work and fixes:
>>>>        - Avoid deadlocks by adding all callbacks at once again.
>>>>        - Stop trying to remove the callbacks.
>>>>        - Provide context and sequence number for the array fence.
>>>>
>>>> v6 (chk): Fixes found during testing
>>>>        - Fix stupid typo in _enable_signaling().
>>>>
>>>> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>   drivers/dma-buf/Makefile      |   2 +-
>>>>   drivers/dma-buf/fence-array.c | 127 ++++++++++++++++++++++++++++++++++++++++++
>>>>   include/linux/fence-array.h   |  72 ++++++++++++++++++++++++
>>>>   3 files changed, 200 insertions(+), 1 deletion(-)
>>>>   create mode 100644 drivers/dma-buf/fence-array.c
>>>>   create mode 100644 include/linux/fence-array.h
>>> This is working fine to me. Once the commit message is fixed:
>>>
>>> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
>>>
>>> You need to Cc Sumit here to decide who is picking these patches.
>>> It would be better to get them through the drm trees so we would need
>>> his Ack at least.
>>>
>> +1 on the commit message consistency change; post that, please feel
>> free to add my
>> Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
>>
>> for the 3 dma-buf/fences patches and request to take them via DRM tree
>> as rightly suggested by Gustavo.
> Commit message improved and all 3 queued up for drm-misc. I haven't pushed
> out the new branch yet though, since I'm waiting for Dave to open drm-next
> and pull in a few other bits first so that I can rebase.

Thanks and sorry for missing the commit message. I was a bit too
concentrated on fixing the code.

Leave me a note when the branch is publicly available, because I then want
to rebase the amdgpu changes on top and send a pull request to Alex.

Christian.

>
> Thanks, Daniel

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 11/11] drm/amdgpu: remove now unnecessary checks
  2016-06-01 13:10 ` [PATCH 11/11] drm/amdgpu: remove now unnecessary checks Christian König
@ 2016-06-06 21:14   ` Alex Deucher
  0 siblings, 0 replies; 21+ messages in thread
From: Alex Deucher @ 2016-06-06 21:14 UTC (permalink / raw)
  To: Christian König; +Cc: Mailing list - DRI developers, LKML

On Wed, Jun 1, 2016 at 9:10 AM, Christian König <deathsimple@vodafone.de> wrote:
> From: Christian König <christian.koenig@amd.com>
>
> vm_flush() now comes directly after vm_grab_id().
>
> Signed-off-by: Christian König <christian.koenig@amd.com>

For the series:
Acked-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu.h    |  1 -
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 31 +++++++++++--------------------
>  2 files changed, 11 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> index 52326d3..e054542 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> @@ -886,7 +886,6 @@ struct amdgpu_vm_id {
>         struct fence            *first;
>         struct amdgpu_sync      active;
>         struct fence            *last_flush;
> -       struct amdgpu_ring      *last_user;
>         atomic64_t              owner;
>
>         uint64_t                pd_gpu_addr;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 8ea1c73..48d5ad18 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -237,6 +237,7 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
>         i = ring->idx;
>         do {
>                 struct fence *flushed;
> +               bool same_ring = ring->idx == i;
>
>                 id = vm->ids[i++];
>                 if (i == AMDGPU_MAX_RINGS)
> @@ -252,7 +253,7 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
>                 if (pd_addr != id->pd_gpu_addr)
>                         continue;
>
> -               if (id->last_user != ring &&
> +               if (!same_ring &&
>                     (!id->last_flush || !fence_is_signaled(id->last_flush)))
>                         continue;
>
> @@ -261,15 +262,9 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
>                     (!flushed || fence_is_later(updates, flushed)))
>                         continue;
>
> -               /* Good we can use this VMID */
> -               if (id->last_user == ring) {
> -                       r = amdgpu_sync_fence(ring->adev, sync,
> -                                             id->first);
> -                       if (r)
> -                               goto error;
> -               }
> -
> -               /* And remember this submission as user of the VMID */
> +               /* Good we can use this VMID. Remember this submission as
> +                * user of the VMID.
> +                */
>                 r = amdgpu_sync_fence(ring->adev, &id->active, fence);
>                 if (r)
>                         goto error;
> @@ -306,7 +301,6 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
>         id->pd_gpu_addr = pd_addr;
>
>         list_move_tail(&id->list, &adev->vm_manager.ids_lru);
> -       id->last_user = ring;
>         atomic64_set(&id->owner, vm->client_id);
>         vm->ids[ring->idx] = id;
>
> @@ -357,16 +351,13 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring,
>                 trace_amdgpu_vm_flush(pd_addr, ring->idx, vm_id);
>                 amdgpu_ring_emit_vm_flush(ring, vm_id, pd_addr);
>
> +               r = amdgpu_fence_emit(ring, &fence);
> +               if (r)
> +                       return r;
> +
>                 mutex_lock(&adev->vm_manager.lock);
> -               if ((id->pd_gpu_addr == pd_addr) && (id->last_user == ring)) {
> -                       r = amdgpu_fence_emit(ring, &fence);
> -                       if (r) {
> -                               mutex_unlock(&adev->vm_manager.lock);
> -                               return r;
> -                       }
> -                       fence_put(id->last_flush);
> -                       id->last_flush = fence;
> -               }
> +               fence_put(id->last_flush);
> +               id->last_flush = fence;
>                 mutex_unlock(&adev->vm_manager.lock);
>         }
>
> --
> 2.5.0
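
As a side note on the quoted amdgpu_vm_grab_id() hunk: the do/while loop
starts at the submitting ring's own slot and wraps around, so same_ring can
only be true on the first iteration. The sketch below just reproduces that
visiting order in user space; MODEL_MAX_RINGS and model_visit_order() are
hypothetical names, and the per-VMID checks from the patch are omitted.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_RINGS 4	/* hypothetical; the driver uses AMDGPU_MAX_RINGS */

/*
 * Fill `order` with the ring indices in the sequence the loop visits them,
 * starting at `start` and wrapping around. Returns the number of visits.
 */
static size_t model_visit_order(unsigned int start, unsigned int *order)
{
	unsigned int i = start;
	size_t n = 0;

	assert(order && start < MODEL_MAX_RINGS);
	do {
		order[n++] = i++;	/* visit slot, then advance */
		if (i == MODEL_MAX_RINGS)
			i = 0;		/* wrap around to the first ring */
	} while (i != start);
	return n;
}
```

Starting from ring 2 with four rings, the slots are visited in the order
2, 3, 0, 1.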

^ permalink raw reply	[flat|nested] 21+ messages in thread

Thread overview: 21+ messages
2016-06-01 13:10 Fence array patchset Christian König
2016-06-01 13:10 ` [PATCH 01/11] dma-buf/fence: make fence context 64 bit v2 Christian König
2016-06-01 15:23   ` Gustavo Padovan
2016-06-01 13:10 ` [PATCH 02/11] dma-buf/fence: add fence_array fences v6 Christian König
2016-06-01 14:01   ` Gustavo Padovan
2016-06-01 15:25   ` Gustavo Padovan
2016-06-01 16:24     ` Sumit Semwal
2016-06-01 22:44       ` Daniel Vetter
2016-06-02  7:20         ` Christian König
2016-06-01 13:10 ` [PATCH 03/11] dma-buf/fence: add signal_on_any to the fence array v2 Christian König
2016-06-01 15:25   ` Gustavo Padovan
2016-06-01 13:10 ` [PATCH 04/11] drm/amdgpu: document amdgpu_sync_get_fence Christian König
2016-06-01 13:10 ` [PATCH 05/11] drm/amdgpu: generalize the scheduler fence Christian König
2016-06-01 13:10 ` [PATCH 06/11] drm/amdgpu: remove amdgpu_sync_wait Christian König
2016-06-01 13:10 ` [PATCH 07/11] drm/amdgpu: add optional ring to amdgpu_sync_is_idle Christian König
2016-06-01 13:10 ` [PATCH 08/11] drm/amdgpu: prefer VMIDs idle on the current ring Christian König
2016-06-01 13:10 ` [PATCH 09/11] drm/amdgpu: reuse VMIDs assigned to a VM only if there is also a free one Christian König
2016-06-01 13:10 ` [PATCH 10/11] drm/amdgpu: use a fence array for VMID management Christian König
2016-06-01 13:10 ` [PATCH 11/11] drm/amdgpu: remove now unnecessary checks Christian König
2016-06-06 21:14   ` Alex Deucher
2016-06-01 13:42 ` Fence array patchset Alex Deucher
