All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
@ 2020-05-25  7:53 Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 02/12] drm/i915/gt: Cancel the flush worker more thoroughly Chris Wilson
                   ` (14 more replies)
  0 siblings, 15 replies; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

In order to keep userptr distinct from ggtt mmaps in the eyes of
lockdep, we need to avoid marking those userptr vma as PIN_GLOBAL. (So
long as we comply with only using them as local PIN_USER!)

References: https://gitlab.freedesktop.org/drm/intel/-/issues/1880
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_ggtt.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
index 66165b10256e..8c275f8588c3 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
@@ -424,22 +424,17 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 	struct drm_i915_gem_object *obj = vma->obj;
 	u32 pte_flags;
 
+	if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
+		return 0;
+
 	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
 	pte_flags = 0;
 	if (i915_gem_object_is_readonly(obj))
 		pte_flags |= PTE_READ_ONLY;
 
 	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
-
 	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
 
-	/*
-	 * Without aliasing PPGTT there's no difference between
-	 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
-	 * upgrade to both bound if we bind either to avoid double-binding.
-	 */
-	atomic_or(I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND, &vma->flags);
-
 	return 0;
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 02/12] drm/i915/gt: Cancel the flush worker more thoroughly
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-25 13:51   ` Mika Kuoppala
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 03/12] drm/i915/gem: Suppress some random warnings Chris Wilson
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Since the worker may rearm, we currently are only guaranteed to flush
the work if we cancel the timer. If the work was running at the time we
try and cancel it, we will wait for it to complete, but it may leave
items in the pool and requeue the work. If we rearrange the immediate
discard of the pool then cancel the work, we know that the work cannot
rearm and so our flush will be final.

<0> [314.146044] i915_mod-1321    2.... 299799443us : intel_gt_fini_buffer_pool: intel_gt_fini_buffer_pool:227 GEM_BUG_ON(!list_empty(&pool->cache_list[n]))

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
index 1495054a4305..418ae184cecf 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
@@ -212,8 +212,9 @@ void intel_gt_flush_buffer_pool(struct intel_gt *gt)
 {
 	struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
 
-	if (cancel_delayed_work_sync(&pool->work))
+	do {
 		pool_free_imm(pool);
+	} while (cancel_delayed_work_sync(&pool->work));
 }
 
 void intel_gt_fini_buffer_pool(struct intel_gt *gt)
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 03/12] drm/i915/gem: Suppress some random warnings
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 02/12] drm/i915/gt: Cancel the flush worker more thoroughly Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-25 13:55   ` Mika Kuoppala
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 04/12] drm/i915/execlists: Shortcircuit queue_prio() for no internal levels Chris Wilson
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Leave the error propagation in place, but limit the warnings to only
show up in CI if the unlikely errors are hit.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 3 +--
 drivers/gpu/drm/i915/gem/i915_gem_phys.c       | 3 +--
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c      | 3 +--
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c    | 2 +-
 4 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e4fb6c372537..219a36995b96 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1626,8 +1626,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 			err = i915_vma_bind(target->vma,
 					    target->vma->obj->cache_level,
 					    PIN_GLOBAL, NULL);
-			if (drm_WARN_ONCE(&i915->drm, err,
-				      "Unexpected failure to bind target VMA!"))
+			if (err)
 				return err;
 		}
 	}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 4c1c7232b024..12245a47e5fb 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -27,8 +27,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 	void *dst;
 	int i;
 
-	if (drm_WARN_ON(obj->base.dev,
-			i915_gem_object_needs_bit17_swizzle(obj)))
+	if (GEM_WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
 		return -EINVAL;
 
 	/*
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 7aff3514d97a..7cf8548ff708 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -147,8 +147,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
 		last_pfn = page_to_pfn(page);
 
 		/* Check that the i965g/gm workaround works. */
-		drm_WARN_ON(&i915->drm,
-			    (gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
+		GEM_BUG_ON(gfp & __GFP_DMA32 && last_pfn >= 0x00100000UL);
 	}
 	if (sg) { /* loop terminated early; short sg table */
 		sg_page_sizes |= sg->length;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index 8b0708708671..2226146b01c9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -235,7 +235,7 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
 	if (flags & I915_USERPTR_UNSYNCHRONIZED)
 		return capable(CAP_SYS_ADMIN) ? 0 : -EPERM;
 
-	if (drm_WARN_ON(obj->base.dev, obj->userptr.mm == NULL))
+	if (GEM_WARN_ON(!obj->userptr.mm))
 		return -EINVAL;
 
 	mn = i915_mmu_notifier_find(obj->userptr.mm);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 04/12] drm/i915/execlists: Shortcircuit queue_prio() for no internal levels
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 02/12] drm/i915/gt: Cancel the flush worker more thoroughly Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 03/12] drm/i915/gem: Suppress some random warnings Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-26 11:17   ` Mika Kuoppala
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 05/12] drm/i915: Improve execute_cb struct packing Chris Wilson
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

If there are no internal levels and the user priority-shift is zero, we
can help the compiler eliminate some dead code:

Function                                     old     new   delta
start_timeslice                              169     154     -15
__execlists_submission_tasklet              4696    4659     -37

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index de5be57ed6d2..3214a4ecc31a 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -446,6 +446,9 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 	 * we have to flip the index value to become priority.
 	 */
 	p = to_priolist(rb);
+	if (!I915_USER_PRIORITY_SHIFT)
+		return p->priority;
+
 	return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 05/12] drm/i915: Improve execute_cb struct packing
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (2 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 04/12] drm/i915/execlists: Shortcircuit queue_prio() for no internal levels Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-26 11:17   ` Mika Kuoppala
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 06/12] dma-buf: Proxy fence, an unsignaled fence placeholder Chris Wilson
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Reduce the irq_work llist for attaching the callbacks to the signal for
both smaller structs (two fewer pointers!) and simpler [debug] code:

Function                                     old     new   delta
irq_execute_cb                                35      34      -1
__igt_breadcrumbs_smoketest                 1684    1682      -2
i915_request_retire                         2003    1996      -7
__i915_request_create                       1047    1040      -7
__notify_execute_cb                          135     126      -9
__i915_request_ctor                          188     178     -10
__await_execution.part.constprop             451     440     -11
igt_wait_request                             924     714    -210

One minor artifact is that the order of cb exection is reversed. No
current use cases are affected by that change.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_request.c | 18 +++++++++---------
 drivers/gpu/drm/i915/i915_request.h |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index c282719ad3ac..22df5b229aed 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -42,7 +42,6 @@
 #include "intel_pm.h"
 
 struct execute_cb {
-	struct list_head link;
 	struct irq_work work;
 	struct i915_sw_fence *fence;
 	void (*hook)(struct i915_request *rq, struct dma_fence *signal);
@@ -189,14 +188,14 @@ static void irq_execute_cb_hook(struct irq_work *wrk)
 
 static void __notify_execute_cb(struct i915_request *rq)
 {
-	struct execute_cb *cb;
+	struct execute_cb *cb, *cn;
 
 	lockdep_assert_held(&rq->lock);
 
-	if (list_empty(&rq->execute_cb))
+	if (llist_empty(&rq->execute_cb))
 		return;
 
-	list_for_each_entry(cb, &rq->execute_cb, link)
+	llist_for_each_entry_safe(cb, cn, rq->execute_cb.first, work.llnode)
 		irq_work_queue(&cb->work);
 
 	/*
@@ -209,7 +208,7 @@ static void __notify_execute_cb(struct i915_request *rq)
 	 * preempt-to-idle cycle on the target engine, all the while the
 	 * master execute_cb may refire.
 	 */
-	INIT_LIST_HEAD(&rq->execute_cb);
+	rq->execute_cb.first = NULL;
 }
 
 static inline void
@@ -327,7 +326,7 @@ bool i915_request_retire(struct i915_request *rq)
 		set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
 		__notify_execute_cb(rq);
 	}
-	GEM_BUG_ON(!list_empty(&rq->execute_cb));
+	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
 	spin_unlock_irq(&rq->lock);
 
 	remove_from_client(rq);
@@ -395,7 +394,8 @@ __await_execution(struct i915_request *rq,
 		i915_sw_fence_complete(cb->fence);
 		kmem_cache_free(global.slab_execute_cbs, cb);
 	} else {
-		list_add_tail(&cb->link, &signal->execute_cb);
+		cb->work.llnode.next = signal->execute_cb.first;
+		signal->execute_cb.first = &cb->work.llnode;
 	}
 	spin_unlock_irq(&signal->lock);
 
@@ -704,7 +704,7 @@ static void __i915_request_ctor(void *arg)
 	rq->file_priv = NULL;
 	rq->capture_list = NULL;
 
-	INIT_LIST_HEAD(&rq->execute_cb);
+	init_llist_head(&rq->execute_cb);
 }
 
 struct i915_request *
@@ -794,7 +794,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	rq->batch = NULL;
 	GEM_BUG_ON(rq->file_priv);
 	GEM_BUG_ON(rq->capture_list);
-	GEM_BUG_ON(!list_empty(&rq->execute_cb));
+	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 8ec7ee4dbadc..5d4709a3dace 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -214,7 +214,7 @@ struct i915_request {
 			ktime_t emitted;
 		} duration;
 	};
-	struct list_head execute_cb;
+	struct llist_head execute_cb;
 	struct i915_sw_fence semaphore;
 
 	/*
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 06/12] dma-buf: Proxy fence, an unsignaled fence placeholder
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (3 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 05/12] drm/i915: Improve execute_cb struct packing Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-30  5:53     ` kbuild test robot
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 07/12] drm/i915: Unpeel awaits on a proxy fence Chris Wilson
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Often we need to create a fence for a future event that has not yet been
associated with a fence. We can store a proxy fence, a placeholder, in
the timeline and replace it later when the real fence is known. Any
listeners that attach to the proxy fence will automatically be signaled
when the real fence completes, and any future listeners will instead be
attach directly to the real fence avoiding any indirection overhead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/dma-buf/Makefile             |  13 +-
 drivers/dma-buf/dma-fence-private.h  |  20 +
 drivers/dma-buf/dma-fence-proxy.c    | 295 +++++++++++
 drivers/dma-buf/dma-fence.c          |   4 +-
 drivers/dma-buf/selftests.h          |   1 +
 drivers/dma-buf/st-dma-fence-proxy.c | 752 +++++++++++++++++++++++++++
 include/linux/dma-fence-proxy.h      |  38 ++
 7 files changed, 1119 insertions(+), 4 deletions(-)
 create mode 100644 drivers/dma-buf/dma-fence-private.h
 create mode 100644 drivers/dma-buf/dma-fence-proxy.c
 create mode 100644 drivers/dma-buf/st-dma-fence-proxy.c
 create mode 100644 include/linux/dma-fence-proxy.h

diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 995e05f609ff..afaf6dadd9a3 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,6 +1,12 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
-	 dma-resv.o seqno-fence.o
+obj-y := \
+	dma-buf.o \
+	dma-fence.o \
+	dma-fence-array.o \
+	dma-fence-chain.o \
+	dma-fence-proxy.o \
+	dma-resv.o \
+	seqno-fence.o
 obj-$(CONFIG_DMABUF_HEAPS)	+= dma-heap.o
 obj-$(CONFIG_DMABUF_HEAPS)	+= heaps/
 obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
@@ -10,6 +16,7 @@ obj-$(CONFIG_UDMABUF)		+= udmabuf.o
 dmabuf_selftests-y := \
 	selftest.o \
 	st-dma-fence.o \
-	st-dma-fence-chain.o
+	st-dma-fence-chain.o \
+	st-dma-fence-proxy.o
 
 obj-$(CONFIG_DMABUF_SELFTESTS)	+= dmabuf_selftests.o
diff --git a/drivers/dma-buf/dma-fence-private.h b/drivers/dma-buf/dma-fence-private.h
new file mode 100644
index 000000000000..6924d28af0fa
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-private.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Fence mechanism for dma-buf and to allow for asynchronous dma access
+ *
+ * Copyright (C) 2012 Canonical Ltd
+ * Copyright (C) 2012 Texas Instruments
+ *
+ * Authors:
+ * Rob Clark <robdclark@gmail.com>
+ * Maarten Lankhorst <maarten.lankhorst@canonical.com>
+ */
+
+#ifndef DMA_FENCE_PRIVATE_H
+#define DMA_FENCE_PRIAVTE_H
+
+struct dma_fence;
+
+bool __dma_fence_enable_signaling(struct dma_fence *fence);
+
+#endif /* DMA_FENCE_PRIAVTE_H */
diff --git a/drivers/dma-buf/dma-fence-proxy.c b/drivers/dma-buf/dma-fence-proxy.c
new file mode 100644
index 000000000000..6689cc89c6ad
--- /dev/null
+++ b/drivers/dma-buf/dma-fence-proxy.c
@@ -0,0 +1,295 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * dma-fence-proxy: placeholder unsignaled fence
+ *
+ * Copyright (C) 2017-2019 Intel Corporation
+ */
+
+#include <linux/dma-fence.h>
+#include <linux/dma-fence-proxy.h>
+#include <linux/export.h>
+#include <linux/irq_work.h>
+#include <linux/slab.h>
+
+#include "dma-fence-private.h"
+
+struct dma_fence_proxy {
+	struct dma_fence base;
+
+	struct dma_fence *real;
+	struct dma_fence_cb cb;
+	struct irq_work work;
+
+	wait_queue_head_t wq;
+};
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+#define same_lockclass(A, B) (A)->dep_map.key == (B)->dep_map.key
+#else
+#define same_lockclass(A, B) 0
+#endif
+
+static const char *proxy_get_driver_name(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+	struct dma_fence *real = READ_ONCE(p->real);
+
+	return real ? real->ops->get_driver_name(real) : "proxy";
+}
+
+static const char *proxy_get_timeline_name(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+	struct dma_fence *real = READ_ONCE(p->real);
+
+	return real ? real->ops->get_timeline_name(real) : "unset";
+}
+
+static void proxy_irq_work(struct irq_work *work)
+{
+	struct dma_fence_proxy *p = container_of(work, typeof(*p), work);
+
+	dma_fence_signal(&p->base);
+	dma_fence_put(&p->base);
+}
+
+static void proxy_callback(struct dma_fence *real, struct dma_fence_cb *cb)
+{
+	struct dma_fence_proxy *p = container_of(cb, typeof(*p), cb);
+
+	if (real->error)
+		dma_fence_set_error(&p->base, real->error);
+
+	/* Lower the height of the proxy chain -> single stack frame */
+	irq_work_queue(&p->work);
+}
+
+static bool proxy_enable_signaling(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+	struct dma_fence *real = READ_ONCE(p->real);
+	bool ret = true;
+
+	if (real) {
+		spin_lock_nested(real->lock,
+				 same_lockclass(&p->wq.lock, real->lock));
+		ret = __dma_fence_enable_signaling(real);
+		spin_unlock(real->lock);
+	}
+
+	return ret;
+}
+
+static void proxy_release(struct dma_fence *fence)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+
+	dma_fence_put(p->real);
+	dma_fence_free(&p->base);
+}
+
+const struct dma_fence_ops dma_fence_proxy_ops = {
+	.get_driver_name = proxy_get_driver_name,
+	.get_timeline_name = proxy_get_timeline_name,
+	.enable_signaling = proxy_enable_signaling,
+	.wait = dma_fence_default_wait,
+	.release = proxy_release,
+};
+EXPORT_SYMBOL_GPL(dma_fence_proxy_ops);
+
+/**
+ * __dma_fence_create_proxy - Create an unset dma-fence
+ * @context: context number to use for proxy fence
+ * @seqno: sequence number to use for proxy fence
+ *
+ * __dma_fence_create_proxy() creates a new dma_fence stub that is initially
+ * unsignaled and may later be replaced with a real fence. Any listeners
+ * to the proxy fence will be signaled when the target fence signals its
+ * completion.
+ */
+struct dma_fence *__dma_fence_create_proxy(u64 context, u64 seqno)
+{
+	struct dma_fence_proxy *p;
+
+	p = kzalloc(sizeof(*p), GFP_KERNEL);
+	if (!p)
+		return NULL;
+
+	init_waitqueue_head(&p->wq);
+	dma_fence_init(&p->base, &dma_fence_proxy_ops, &p->wq.lock,
+		       context, seqno);
+	init_irq_work(&p->work, proxy_irq_work);
+
+	return &p->base;
+}
+EXPORT_SYMBOL(__dma_fence_create_proxy);
+
+/**
+ * dma_fence_create_proxy - Create an unset dma-fence
+ *
+ * Wraps __dma_fence_create_proxy() to create a new proxy fence with the
+ * next available (unique) context id.
+ */
+struct dma_fence *dma_fence_create_proxy(void)
+{
+	return __dma_fence_create_proxy(dma_fence_context_alloc(1), 0);
+}
+EXPORT_SYMBOL(dma_fence_create_proxy);
+
+static void __wake_up_listeners(struct dma_fence_proxy *p)
+{
+	struct wait_queue_entry *wait, *next;
+
+	list_for_each_entry_safe(wait, next, &p->wq.head, entry) {
+		INIT_LIST_HEAD(&wait->entry);
+		wait->func(wait, TASK_NORMAL, 0, p->real);
+	}
+}
+
+static void proxy_assign(struct dma_fence *fence, struct dma_fence *real)
+{
+	struct dma_fence_proxy *p = container_of(fence, typeof(*p), base);
+	unsigned long flags;
+
+	if (WARN_ON(fence == real))
+		return;
+
+	if (WARN_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)))
+		return;
+
+	if (WARN_ON(p->real))
+		return;
+
+	spin_lock_irqsave(&p->wq.lock, flags);
+
+	if (unlikely(!real)) {
+		dma_fence_signal_locked(&p->base);
+		goto unlock;
+	}
+
+	p->real = dma_fence_get(real);
+
+	dma_fence_get(&p->base);
+	spin_lock_nested(real->lock, same_lockclass(&p->wq.lock, real->lock));
+	if (dma_fence_is_signaled_locked(real)) {
+		proxy_callback(real, &p->cb);
+	} else if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+			    &p->base.flags) &&
+		   !__dma_fence_enable_signaling(real)) {
+		proxy_callback(real, &p->cb);
+	} else {
+		p->cb.func = proxy_callback;
+		list_add_tail(&p->cb.node, &real->cb_list);
+	}
+	spin_unlock(real->lock);
+
+unlock:
+	__wake_up_listeners(p);
+	spin_unlock_irqrestore(&p->wq.lock, flags);
+}
+
+/**
+ * dma_fence_replace_proxy - Replace the proxy fence with the real target
+ * @slot: pointer to location of fence to update
+ * @fence: the new fence to store in @slot
+ *
+ * Once the real dma_fence is known, we can replace the proxy fence holder
+ * with a pointer to the real dma fence. Future listeners will attach to
+ * the real fence, avoiding any indirection overhead. Previous listeners
+ * will remain attached to the proxy fence, and be signaled in turn when
+ * the target fence completes.
+ */
+struct dma_fence *
+dma_fence_replace_proxy(struct dma_fence __rcu **slot, struct dma_fence *fence)
+{
+	struct dma_fence *old;
+
+	if (fence)
+		dma_fence_get(fence);
+
+	old = rcu_replace_pointer(*slot, fence, true);
+	if (old && dma_fence_is_proxy(old))
+		proxy_assign(old, fence);
+
+	return old;
+}
+EXPORT_SYMBOL(dma_fence_replace_proxy);
+
+/**
+ * dma_fence_proxy_set_real - Set the target of a proxy fence
+ * @fence: the proxy fence
+ * @real: the target fence.
+ *
+ */
+void dma_fence_proxy_set_real(struct dma_fence *fence, struct dma_fence *real)
+{
+	if (dma_fence_is_proxy(fence))
+		proxy_assign(fence, real);
+}
+EXPORT_SYMBOL(dma_fence_proxy_set_real);
+
+/**
+ * dma_fence_proxy_get_real - Query the target of a proxy fence
+ * @fence: the proxy fence
+ *
+ * Unpeel the proxy fence to see if it has been replaced with a real fence.
+ */
+struct dma_fence *dma_fence_proxy_get_real(struct dma_fence *fence)
+{
+	if (dma_fence_is_proxy(fence)) {
+		struct dma_fence_proxy *p =
+			container_of(fence, typeof(*p), base);
+
+		if (p->real)
+			fence = p->real;
+	}
+
+	return fence;
+}
+EXPORT_SYMBOL(dma_fence_proxy_get_real);
+
+void dma_fence_add_proxy_listener(struct dma_fence *fence,
+				  struct wait_queue_entry *wait)
+{
+	if (dma_fence_is_proxy(fence)) {
+		struct dma_fence_proxy *p =
+			container_of(fence, typeof(*p), base);
+		unsigned long flags;
+
+		spin_lock_irqsave(&p->wq.lock, flags);
+		if (!p->real) {
+			list_add_tail(&wait->entry, &p->wq.head);
+			wait = NULL;
+		}
+		fence = p->real;
+		spin_unlock_irqrestore(&p->wq.lock, flags);
+	}
+
+	if (wait) {
+		INIT_LIST_HEAD(&wait->entry);
+		wait->func(wait, TASK_NORMAL, 0, fence);
+	}
+}
+EXPORT_SYMBOL(dma_fence_add_proxy_listener);
+
+bool dma_fence_remove_proxy_listener(struct dma_fence *fence,
+				     struct wait_queue_entry *wait)
+{
+	bool ret = false;
+
+	if (dma_fence_is_proxy(fence)) {
+		struct dma_fence_proxy *p =
+			container_of(fence, typeof(*p), base);
+		unsigned long flags;
+
+		spin_lock_irqsave(&p->wq.lock, flags);
+		if (!list_empty(&wait->entry)) {
+			list_del_init(&wait->entry);
+			ret = true;
+		}
+		spin_unlock_irqrestore(&p->wq.lock, flags);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(dma_fence_remove_proxy_listener);
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 656e9ac2d028..329bd033059f 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -19,6 +19,8 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/dma_fence.h>
 
+#include "dma-fence-private.h"
+
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
@@ -275,7 +277,7 @@ void dma_fence_free(struct dma_fence *fence)
 }
 EXPORT_SYMBOL(dma_fence_free);
 
-static bool __dma_fence_enable_signaling(struct dma_fence *fence)
+bool __dma_fence_enable_signaling(struct dma_fence *fence)
 {
 	bool was_set;
 
diff --git a/drivers/dma-buf/selftests.h b/drivers/dma-buf/selftests.h
index 55918ef9adab..616eca70e2d8 100644
--- a/drivers/dma-buf/selftests.h
+++ b/drivers/dma-buf/selftests.h
@@ -12,3 +12,4 @@
 selftest(sanitycheck, __sanitycheck__) /* keep first (igt selfcheck) */
 selftest(dma_fence, dma_fence)
 selftest(dma_fence_chain, dma_fence_chain)
+selftest(dma_fence_proxy, dma_fence_proxy)
diff --git a/drivers/dma-buf/st-dma-fence-proxy.c b/drivers/dma-buf/st-dma-fence-proxy.c
new file mode 100644
index 000000000000..c3f210bc4e60
--- /dev/null
+++ b/drivers/dma-buf/st-dma-fence-proxy.c
@@ -0,0 +1,752 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include <linux/delay.h>
+#include <linux/dma-fence.h>
+#include <linux/dma-fence-proxy.h>
+#include <linux/kernel.h>
+#include <linux/sched/signal.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "selftest.h"
+
+static struct kmem_cache *slab_fences;
+
+static struct mock_fence {
+	struct dma_fence base;
+	spinlock_t lock;
+} *to_mock_fence(struct dma_fence *f) {
+	return container_of(f, struct mock_fence, base);
+}
+
+static const char *mock_name(struct dma_fence *f)
+{
+	return "mock";
+}
+
+static void mock_fence_release(struct dma_fence *f)
+{
+	kmem_cache_free(slab_fences, to_mock_fence(f));
+}
+
+static const struct dma_fence_ops mock_ops = {
+	.get_driver_name = mock_name,
+	.get_timeline_name = mock_name,
+	.release = mock_fence_release,
+};
+
+static struct dma_fence *mock_fence(void)
+{
+	struct mock_fence *f;
+
+	f = kmem_cache_alloc(slab_fences, GFP_KERNEL);
+	if (!f)
+		return NULL;
+
+	spin_lock_init(&f->lock);
+	dma_fence_init(&f->base, &mock_ops, &f->lock, 0, 0);
+
+	return &f->base;
+}
+
+static int sanitycheck(void *arg)
+{
+	struct dma_fence *f;
+
+	f = dma_fence_create_proxy();
+	if (!f)
+		return -ENOMEM;
+
+	dma_fence_signal(f);
+	dma_fence_put(f);
+
+	return 0;
+}
+
+struct fences {
+	struct dma_fence *real;
+	struct dma_fence *proxy;
+	struct dma_fence __rcu *slot;
+};
+
+static int create_fences(struct fences *f, bool attach)
+{
+	f->proxy = dma_fence_create_proxy();
+	if (!f->proxy)
+		return -ENOMEM;
+
+	RCU_INIT_POINTER(f->slot, f->proxy);
+
+	f->real = mock_fence();
+	if (!f->real) {
+		dma_fence_put(f->proxy);
+		return -ENOMEM;
+	}
+
+	if (attach)
+		dma_fence_replace_proxy(&f->slot, f->real);
+
+	return 0;
+}
+
+static void free_fences(struct fences *f)
+{
+	dma_fence_put(dma_fence_replace_proxy(&f->slot, NULL));
+
+	dma_fence_signal(f->real);
+	dma_fence_put(f->real);
+
+	dma_fence_signal(f->proxy);
+	dma_fence_put(f->proxy);
+}
+
+static int wrap_target(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	if (dma_fence_proxy_get_real(f.proxy) != f.proxy) {
+		pr_err("Unwrapped proxy fenced reported a target fence!\n");
+		goto err_free;
+	}
+
+	dma_fence_proxy_set_real(f.proxy, f.real);
+	rcu_assign_pointer(f.slot, dma_fence_get(f.real)); /* free_fences() */
+
+	if (dma_fence_proxy_get_real(f.proxy) != f.real) {
+		pr_err("Wrapped proxy fenced did not report the target fence!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_proxy(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_proxy_get_real(f.proxy) != f.real) {
+		pr_err("Wrapped proxy fenced did not report the target fence!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_signaling(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_is_signaled(f.proxy)) {
+		pr_err("Fence unexpectedly signaled on creation\n");
+		goto err_free;
+	}
+
+	if (dma_fence_signal(f.real)) {
+		pr_err("Fence reported being already signaled\n");
+		goto err_free;
+	}
+
+	if (!dma_fence_is_signaled(f.proxy)) {
+		pr_err("Fence not reporting signaled\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_signaling_recurse(void *arg)
+{
+	struct fences f;
+	struct dma_fence *chain;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	chain = dma_fence_create_proxy();
+	if (!chain) {
+		err = -ENOMEM;
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, chain);
+	dma_fence_put(dma_fence_replace_proxy(&f.slot, f.real));
+	dma_fence_put(chain);
+
+	/* f.real <- chain <- f.proxy */
+
+	if (dma_fence_is_signaled(f.proxy)) {
+		pr_err("Fence unexpectedly signaled on creation\n");
+		goto err_free;
+	}
+
+	if (dma_fence_signal(f.real)) {
+		pr_err("Fence reported being already signaled\n");
+		goto err_free;
+	}
+
+	if (!dma_fence_is_signaled(f.proxy)) {
+		pr_err("Fence not reporting signaled\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+struct simple_cb {
+	struct dma_fence_cb cb;
+	bool seen;
+};
+
+static void simple_callback(struct dma_fence *f, struct dma_fence_cb *cb)
+{
+	/* Ensure the callback marker is visible, no excuses for missing it! */
+	smp_store_mb(container_of(cb, struct simple_cb, cb)->seen, true);
+}
+
+static int wrap_add_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_add_callback_recurse(void *arg)
+{
+	struct simple_cb cb = {};
+	struct dma_fence *chain;
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	chain = dma_fence_create_proxy();
+	if (!chain) {
+		err = -ENOMEM;
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, chain);
+	dma_fence_put(dma_fence_replace_proxy(&f.slot, f.real));
+	dma_fence_put(chain);
+
+	/* f.real <- chain <- f.proxy */
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_late_add_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	dma_fence_signal(f.real);
+
+	if (!dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Added callback, but fence was already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (cb.seen) {
+		pr_err("Callback called after failed attachment!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_early_add_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, f.real);
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_early_add_callback_late(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	dma_fence_signal(f.real);
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, f.real);
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_early_add_callback_early(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_replace_proxy(&f.slot, f.real);
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_rm_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	if (!dma_fence_remove_callback(f.proxy, &cb.cb)) {
+		pr_err("Failed to remove callback!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (cb.seen) {
+		pr_err("Callback still signaled after removal!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_late_rm_callback(void *arg)
+{
+	struct simple_cb cb = {};
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_add_callback(f.proxy, &cb.cb, simple_callback)) {
+		pr_err("Failed to add callback, fence already signaled!\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!cb.seen) {
+		pr_err("Callback failed!\n");
+		goto err_free;
+	}
+
+	if (dma_fence_remove_callback(f.proxy, &cb.cb)) {
+		pr_err("Callback removal succeed after being executed!\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_status(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_get_status(f.proxy)) {
+		pr_err("Fence unexpectedly has signaled status on creation\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (!dma_fence_get_status(f.proxy)) {
+		pr_err("Fence not reporting signaled status\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_error(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	dma_fence_set_error(f.real, -EIO);
+
+	if (dma_fence_get_status(f.proxy)) {
+		pr_err("Fence unexpectedly has error status before signal\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+	if (dma_fence_get_status(f.proxy) != -EIO) {
+		pr_err("Fence not reporting error status, got %d\n",
+		       dma_fence_get_status(f.proxy));
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_wait(void *arg)
+{
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, true))
+		return -ENOMEM;
+
+	if (dma_fence_wait_timeout(f.proxy, false, 0) != 0) {
+		pr_err("Wait reported complete before being signaled\n");
+		goto err_free;
+	}
+
+	dma_fence_signal(f.real);
+
+	if (dma_fence_wait_timeout(f.proxy, false, 0) == 0) {
+		pr_err("Wait reported incomplete after being signaled\n");
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+struct wait_timer {
+	struct timer_list timer;
+	struct fences f;
+};
+
+static void wait_timer(struct timer_list *timer)
+{
+	struct wait_timer *wt = from_timer(wt, timer, timer);
+
+	dma_fence_signal(wt->f.real);
+}
+
+static int wrap_wait_timeout(void *arg)
+{
+	struct wait_timer wt;
+	int err = -EINVAL;
+
+	if (create_fences(&wt.f, true))
+		return -ENOMEM;
+
+	timer_setup_on_stack(&wt.timer, wait_timer, 0);
+
+	if (dma_fence_wait_timeout(wt.f.proxy, false, 1) != 0) {
+		pr_err("Wait reported complete before being signaled\n");
+		goto err_free;
+	}
+
+	mod_timer(&wt.timer, jiffies + 1);
+
+	if (dma_fence_wait_timeout(wt.f.proxy, false, 2) != 0) {
+		if (timer_pending(&wt.timer)) {
+			pr_notice("Timer did not fire within the jiffie!\n");
+			err = 0; /* not our fault! */
+		} else {
+			pr_err("Wait reported incomplete after timeout\n");
+		}
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	del_timer_sync(&wt.timer);
+	destroy_timer_on_stack(&wt.timer);
+	dma_fence_signal(wt.f.real);
+	free_fences(&wt.f);
+	return err;
+}
+
+struct proxy_wait {
+	struct wait_queue_entry base;
+	struct dma_fence *fence;
+	bool seen;
+};
+
+static int proxy_wait_cb(struct wait_queue_entry *entry,
+			 unsigned int mode, int flags, void *key)
+{
+	struct proxy_wait *p = container_of(entry, typeof(*p), base);
+
+	p->fence = key;
+	p->seen = true;
+
+	return 0;
+}
+
+static int wrap_listen_early(void *arg)
+{
+	struct proxy_wait wait = { .base.func = proxy_wait_cb };
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	dma_fence_replace_proxy(&f.slot, f.real);
+	dma_fence_add_proxy_listener(f.proxy, &wait.base);
+
+	if (!wait.seen) {
+		pr_err("Proxy listener was not called after replace!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	if (wait.fence != f.real) {
+		pr_err("Proxy listener was not passed the real fence!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_listen_late(void *arg)
+{
+	struct proxy_wait wait = { .base.func = proxy_wait_cb };
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	dma_fence_add_proxy_listener(f.proxy, &wait.base);
+	dma_fence_replace_proxy(&f.slot, f.real);
+
+	if (!wait.seen) {
+		pr_err("Proxy listener was not called on replace!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	if (wait.fence != f.real) {
+		pr_err("Proxy listener was not passed the real fence!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+static int wrap_listen_cancel(void *arg)
+{
+	struct proxy_wait wait = { .base.func = proxy_wait_cb };
+	struct fences f;
+	int err = -EINVAL;
+
+	if (create_fences(&f, false))
+		return -ENOMEM;
+
+	dma_fence_add_proxy_listener(f.proxy, &wait.base);
+	if (!dma_fence_remove_proxy_listener(f.proxy, &wait.base)) {
+		pr_err("Cancelling listener, already detached?\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+	dma_fence_replace_proxy(&f.slot, f.real);
+
+	if (wait.seen) {
+		pr_err("Proxy listener was called after being removed!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	if (dma_fence_remove_proxy_listener(f.proxy, &wait.base)) {
+		pr_err("Double listener cancellation!\n");
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	err = 0;
+err_free:
+	dma_fence_signal(f.real);
+	free_fences(&f);
+	return err;
+}
+
+int dma_fence_proxy(void)
+{
+	static const struct subtest tests[] = {
+		SUBTEST(sanitycheck),
+		SUBTEST(wrap_target),
+		SUBTEST(wrap_proxy),
+		SUBTEST(wrap_signaling),
+		SUBTEST(wrap_signaling_recurse),
+		SUBTEST(wrap_add_callback),
+		SUBTEST(wrap_add_callback_recurse),
+		SUBTEST(wrap_late_add_callback),
+		SUBTEST(wrap_early_add_callback),
+		SUBTEST(wrap_early_add_callback_late),
+		SUBTEST(wrap_early_add_callback_early),
+		SUBTEST(wrap_rm_callback),
+		SUBTEST(wrap_late_rm_callback),
+		SUBTEST(wrap_status),
+		SUBTEST(wrap_error),
+		SUBTEST(wrap_wait),
+		SUBTEST(wrap_wait_timeout),
+		SUBTEST(wrap_listen_early),
+		SUBTEST(wrap_listen_late),
+		SUBTEST(wrap_listen_cancel),
+	};
+	int ret;
+
+	slab_fences = KMEM_CACHE(mock_fence,
+				 SLAB_TYPESAFE_BY_RCU |
+				 SLAB_HWCACHE_ALIGN);
+	if (!slab_fences)
+		return -ENOMEM;
+
+	ret = subtests(tests, NULL);
+
+	kmem_cache_destroy(slab_fences);
+
+	return ret;
+}
diff --git a/include/linux/dma-fence-proxy.h b/include/linux/dma-fence-proxy.h
new file mode 100644
index 000000000000..6a986b5bb009
--- /dev/null
+++ b/include/linux/dma-fence-proxy.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * dma-fence-proxy: allows waiting upon unset and future fences
+ *
+ * Copyright (C) 2017 Intel Corporation
+ */
+
+#ifndef __LINUX_DMA_FENCE_PROXY_H
+#define __LINUX_DMA_FENCE_PROXY_H
+
+#include <linux/kernel.h>
+#include <linux/dma-fence.h>
+
+struct wait_queue_entry;
+
+extern const struct dma_fence_ops dma_fence_proxy_ops;
+
+struct dma_fence *__dma_fence_create_proxy(u64 context, u64 seqno);
+struct dma_fence *dma_fence_create_proxy(void);
+
+static inline bool dma_fence_is_proxy(struct dma_fence *fence)
+{
+	return fence->ops == &dma_fence_proxy_ops;
+}
+
+void dma_fence_proxy_set_real(struct dma_fence *fence, struct dma_fence *real);
+struct dma_fence *dma_fence_proxy_get_real(struct dma_fence *fence);
+
+struct dma_fence *
+dma_fence_replace_proxy(struct dma_fence __rcu **slot,
+			struct dma_fence *fence);
+
+void dma_fence_add_proxy_listener(struct dma_fence *fence,
+				  struct wait_queue_entry *wait);
+bool dma_fence_remove_proxy_listener(struct dma_fence *fence,
+				     struct wait_queue_entry *wait);
+
+#endif /* __LINUX_DMA_FENCE_PROXY_H */
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 07/12] drm/i915: Unpeel awaits on a proxy fence
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (4 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 06/12] dma-buf: Proxy fence, an unsignaled fence placeholder Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 08/12] drm/i915: Add list_for_each_entry_safe_continue_reverse Chris Wilson
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

If the real target for a proxy fence is known at the time we are
attaching our awaits, use the real target in preference to hooking up to
the proxy. If use the real target instead, we can optimize the awaits,
e.g. if it along the same engine, we can order the submission and avoid
the wait-for-completion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_request.c   | 157 ++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.c |  41 +++++++
 drivers/gpu/drm/i915/i915_scheduler.h |   3 +
 3 files changed, 201 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 22df5b229aed..945f47b216f7 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -24,6 +24,7 @@
 
 #include <linux/dma-fence-array.h>
 #include <linux/dma-fence-chain.h>
+#include <linux/dma-fence-proxy.h>
 #include <linux/irq_work.h>
 #include <linux/prefetch.h>
 #include <linux/sched.h>
@@ -408,6 +409,7 @@ static bool fatal_error(int error)
 	case 0: /* not an error! */
 	case -EAGAIN: /* innocent victim of a GT reset (__i915_request_reset) */
 	case -ETIMEDOUT: /* waiting for Godot (timer_i915_sw_fence_wake) */
+	case -EDEADLK: /* cyclic fence lockup (await_proxy)  */
 		return false;
 	default:
 		return true;
@@ -1135,6 +1137,156 @@ i915_request_await_external(struct i915_request *rq, struct dma_fence *fence)
 	return err;
 }
 
+struct await_proxy {
+	struct wait_queue_entry base;
+	struct i915_request *request;
+	struct dma_fence *fence;
+	struct timer_list timer;
+	struct work_struct work;
+	int (*attach)(struct await_proxy *ap);
+	void *data;
+};
+
+static void await_proxy_work(struct work_struct *work)
+{
+	struct await_proxy *ap = container_of(work, typeof(*ap), work);
+	struct i915_request *rq = ap->request;
+
+	del_timer_sync(&ap->timer);
+
+	if (ap->fence) {
+		int err = 0;
+
+		/*
+		 * If the fence is external, we impose a 10s timeout.
+		 * However, if the fence is internal, we skip a timeout in
+		 * the belief that all fences are in-order (DAG, no cycles)
+		 * and we can enforce forward progress by reset the GPU if
+		 * necessary. A future fence, provided userspace, can trivially
+		 * generate a cycle in the dependency graph, and so cause
+		 * that entire cycle to become deadlocked and for no forward
+		 * progress to either be made, and the driver being kept
+		 * eternally awake.
+		 */
+		if (dma_fence_is_i915(ap->fence) &&
+		    !i915_sched_node_verify_dag(&rq->sched,
+						&to_request(ap->fence)->sched))
+			err = -EDEADLK;
+
+		if (!err) {
+			mutex_lock(&rq->context->timeline->mutex);
+			err = ap->attach(ap);
+			mutex_unlock(&rq->context->timeline->mutex);
+		}
+
+		/* Don't flag an error for co-dependent scheduling */
+		if (err == -EDEADLK) {
+			struct i915_sched_node *waiter =
+				&to_request(ap->fence)->sched;
+			struct i915_dependency *p;
+
+			list_for_each_entry_lockless(p,
+						     &rq->sched.waiters_list,
+						     wait_link) {
+				if (p->waiter == waiter &&
+				    p->flags & I915_DEPENDENCY_WEAK) {
+					err = 0;
+					break;
+				}
+			}
+		}
+
+		if (err < 0)
+			i915_sw_fence_set_error_once(&rq->submit, err);
+	}
+
+	i915_sw_fence_complete(&rq->submit);
+
+	dma_fence_put(ap->fence);
+	kfree(ap);
+}
+
+static int
+await_proxy_wake(struct wait_queue_entry *entry,
+		 unsigned int mode,
+		 int flags,
+		 void *fence)
+{
+	struct await_proxy *ap = container_of(entry, typeof(*ap), base);
+
+	ap->fence = dma_fence_get(fence);
+	schedule_work(&ap->work);
+
+	return 0;
+}
+
+static void
+await_proxy_timer(struct timer_list *t)
+{
+	struct await_proxy *ap = container_of(t, typeof(*ap), timer);
+
+	if (dma_fence_remove_proxy_listener(ap->base.private, &ap->base)) {
+		struct i915_request *rq = ap->request;
+
+		pr_notice("Asynchronous wait on unset proxy fence by %s:%s:%llx timed out\n",
+			  rq->fence.ops->get_driver_name(&rq->fence),
+			  rq->fence.ops->get_timeline_name(&rq->fence),
+			  rq->fence.seqno);
+		i915_sw_fence_set_error_once(&rq->submit, -ETIMEDOUT);
+
+		schedule_work(&ap->work);
+	}
+}
+
+static int
+__i915_request_await_proxy(struct i915_request *rq,
+			   struct dma_fence *fence,
+			   unsigned long timeout,
+			   int (*attach)(struct await_proxy *ap),
+			   void *data)
+{
+	struct await_proxy *ap;
+
+	ap = kzalloc(sizeof(*ap), I915_FENCE_GFP);
+	if (!ap)
+		return -ENOMEM;
+
+	i915_sw_fence_await(&rq->submit);
+	mark_external(rq);
+
+	ap->base.private = fence;
+	ap->base.func = await_proxy_wake;
+	ap->request = rq;
+	INIT_WORK(&ap->work, await_proxy_work);
+	ap->attach = attach;
+	ap->data = data;
+
+	timer_setup(&ap->timer, await_proxy_timer, 0);
+	if (timeout)
+		mod_timer(&ap->timer, round_jiffies_up(jiffies + timeout));
+
+	dma_fence_add_proxy_listener(fence, &ap->base);
+	return 0;
+}
+
+static int await_proxy(struct await_proxy *ap)
+{
+	return i915_request_await_dma_fence(ap->request, ap->fence);
+}
+
+static int
+i915_request_await_proxy(struct i915_request *rq, struct dma_fence *fence)
+{
+	/*
+	 * Wait until we know the real fence so that can optimise the
+	 * inter-fence synchronisation.
+	 */
+	return __i915_request_await_proxy(rq, fence,
+					  i915_fence_context_timeout(rq->i915,
+								     fence->context),
+					  await_proxy, NULL);
+}
+
 int
 i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 {
@@ -1142,6 +1294,9 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 	unsigned int nchild = 1;
 	int ret;
 
+	/* Unpeel the proxy fence if the real target is already known */
+	fence = dma_fence_proxy_get_real(fence);
+
 	/*
 	 * Note that if the fence-array was created in signal-on-any mode,
 	 * we should *not* decompose it into its individual fences. However,
@@ -1181,6 +1336,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 
 		if (dma_fence_is_i915(fence))
 			ret = i915_request_await_request(rq, to_request(fence));
+		else if (dma_fence_is_proxy(fence))
+			ret = i915_request_await_proxy(rq, fence);
 		else
 			ret = i915_request_await_external(rq, fence);
 		if (ret < 0)
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index cbb880b10c65..250832768279 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -469,6 +469,47 @@ int i915_sched_node_add_dependency(struct i915_sched_node *node,
 	return 0;
 }
 
+bool i915_sched_node_verify_dag(struct i915_sched_node *waiter,
+				struct i915_sched_node *signaler)
+{
+	struct i915_dependency *dep, *p;
+	struct i915_dependency stack;
+	bool result = false;
+	LIST_HEAD(dfs);
+
+	if (list_empty(&waiter->waiters_list))
+		return true;
+
+	spin_lock_irq(&schedule_lock);
+
+	stack.signaler = signaler;
+	list_add(&stack.dfs_link, &dfs);
+
+	list_for_each_entry(dep, &dfs, dfs_link) {
+		struct i915_sched_node *node = dep->signaler;
+
+		if (node_signaled(node))
+			continue;
+
+		list_for_each_entry(p, &node->signalers_list, signal_link) {
+			if (p->signaler == waiter)
+				goto out;
+
+			if (list_empty(&p->dfs_link))
+				list_add_tail(&p->dfs_link, &dfs);
+		}
+	}
+
+	result = true;
+out:
+	list_for_each_entry_safe(dep, p, &dfs, dfs_link)
+		INIT_LIST_HEAD(&dep->dfs_link);
+
+	spin_unlock_irq(&schedule_lock);
+
+	return result;
+}
+
 void i915_sched_node_fini(struct i915_sched_node *node)
 {
 	struct i915_dependency *dep, *tmp;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 6f0bf00fc569..13432add8929 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -28,6 +28,9 @@
 void i915_sched_node_init(struct i915_sched_node *node);
 void i915_sched_node_reinit(struct i915_sched_node *node);
 
+bool i915_sched_node_verify_dag(struct i915_sched_node *waiter,
+				struct i915_sched_node *signal);
+
 bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
 				      struct i915_sched_node *signal,
 				      struct i915_dependency *dep,
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 08/12] drm/i915: Add list_for_each_entry_safe_continue_reverse
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (5 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 07/12] drm/i915: Unpeel awaits on a proxy fence Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 09/12] drm/i915/gem: Async GPU relocations only Chris Wilson
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

One more list iterator variant, for when we want to unwind from inside
one list iterator with the intention of restarting from the current
entry as the new head of the list.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_utils.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 03a73d2bd50d..6ebccdd12d4c 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -266,6 +266,12 @@ static inline int list_is_last_rcu(const struct list_head *list,
 	return READ_ONCE(list->next) == head;
 }
 
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)	\
+	for (pos = list_prev_entry(pos, member),			\
+	     n = list_prev_entry(pos, member);				\
+	     &pos->member != (head);					\
+	     pos = n, n = list_prev_entry(n, member))
+
 /*
  * Wait until the work is finally complete, even if it tries to postpone
  * by requeueing itself. Note, that if the worker never cancels itself,
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 09/12] drm/i915/gem: Async GPU relocations only
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (6 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 08/12] drm/i915: Add list_for_each_entry_safe_continue_reverse Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 10/12] drm/i915/gem: Lift GPU relocation allocation Chris Wilson
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Reduce the 3 relocation patches down to the single path that accommodates
all. The primary motivation for this is to guard the relocations with a
natural fence (derived from the i915_request used to write the
relocation from the GPU).

The tradeoff in using async gpu relocations is that it increases latency
over using direct CPU relocations, for the cases where the target is
idle and accessible by the CPU. The benefit is greatly reduced lock
contention and improved concurrency by pipelining.

Note that forcing the async gpu relocations does reveal a few issues
they have. Firstly, is that they are visible as writes to gem_busy,
causing to mark some buffers are being to written to by the GPU even
though userspace only reads. Secondly is that, in combination with the
cmdparser, they can cause priority inversions. This should be the case
where the work is being put into a common workqueue losing our priority
information and so being executed in FIFO from the worker, denying us
the opportunity to reorder the requests afterwards.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 295 ++----------------
 .../i915/gem/selftests/i915_gem_execbuffer.c  |  21 +-
 2 files changed, 27 insertions(+), 289 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 219a36995b96..e840b2a85284 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -45,13 +45,6 @@ struct eb_vma_array {
 	struct eb_vma vma[];
 };
 
-enum {
-	FORCE_CPU_RELOC = 1,
-	FORCE_GTT_RELOC,
-	FORCE_GPU_RELOC,
-#define DBG_FORCE_RELOC 0 /* choose one of the above! */
-};
-
 #define __EXEC_OBJECT_HAS_PIN		BIT(31)
 #define __EXEC_OBJECT_HAS_FENCE		BIT(30)
 #define __EXEC_OBJECT_NEEDS_MAP		BIT(29)
@@ -260,8 +253,6 @@ struct i915_execbuffer {
 	 */
 	struct reloc_cache {
 		struct drm_mm_node node; /** temporary GTT binding */
-		unsigned long vaddr; /** Current kmap address */
-		unsigned long page; /** Currently mapped page index */
 		unsigned int gen; /** Cached value of INTEL_GEN */
 		bool use_64bit_reloc : 1;
 		bool has_llc : 1;
@@ -605,23 +596,6 @@ eb_add_vma(struct i915_execbuffer *eb,
 	}
 }
 
-static inline int use_cpu_reloc(const struct reloc_cache *cache,
-				const struct drm_i915_gem_object *obj)
-{
-	if (!i915_gem_object_has_struct_page(obj))
-		return false;
-
-	if (DBG_FORCE_RELOC == FORCE_CPU_RELOC)
-		return true;
-
-	if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)
-		return false;
-
-	return (cache->has_llc ||
-		obj->cache_dirty ||
-		obj->cache_level != I915_CACHE_NONE);
-}
-
 static int eb_reserve_vma(const struct i915_execbuffer *eb,
 			  struct eb_vma *ev,
 			  u64 pin_flags)
@@ -945,8 +919,6 @@ relocation_target(const struct drm_i915_gem_relocation_entry *reloc,
 static void reloc_cache_init(struct reloc_cache *cache,
 			     struct drm_i915_private *i915)
 {
-	cache->page = -1;
-	cache->vaddr = 0;
 	/* Must be a variable in the struct to allow GCC to unroll. */
 	cache->gen = INTEL_GEN(i915);
 	cache->has_llc = HAS_LLC(i915);
@@ -1089,181 +1061,6 @@ static int reloc_gpu_flush(struct reloc_cache *cache)
 	return err;
 }
 
-static void reloc_cache_reset(struct reloc_cache *cache)
-{
-	void *vaddr;
-
-	if (!cache->vaddr)
-		return;
-
-	vaddr = unmask_page(cache->vaddr);
-	if (cache->vaddr & KMAP) {
-		if (cache->vaddr & CLFLUSH_AFTER)
-			mb();
-
-		kunmap_atomic(vaddr);
-		i915_gem_object_finish_access((struct drm_i915_gem_object *)cache->node.mm);
-	} else {
-		struct i915_ggtt *ggtt = cache_to_ggtt(cache);
-
-		intel_gt_flush_ggtt_writes(ggtt->vm.gt);
-		io_mapping_unmap_atomic((void __iomem *)vaddr);
-
-		if (drm_mm_node_allocated(&cache->node)) {
-			ggtt->vm.clear_range(&ggtt->vm,
-					     cache->node.start,
-					     cache->node.size);
-			mutex_lock(&ggtt->vm.mutex);
-			drm_mm_remove_node(&cache->node);
-			mutex_unlock(&ggtt->vm.mutex);
-		} else {
-			i915_vma_unpin((struct i915_vma *)cache->node.mm);
-		}
-	}
-
-	cache->vaddr = 0;
-	cache->page = -1;
-}
-
-static void *reloc_kmap(struct drm_i915_gem_object *obj,
-			struct reloc_cache *cache,
-			unsigned long page)
-{
-	void *vaddr;
-
-	if (cache->vaddr) {
-		kunmap_atomic(unmask_page(cache->vaddr));
-	} else {
-		unsigned int flushes;
-		int err;
-
-		err = i915_gem_object_prepare_write(obj, &flushes);
-		if (err)
-			return ERR_PTR(err);
-
-		BUILD_BUG_ON(KMAP & CLFLUSH_FLAGS);
-		BUILD_BUG_ON((KMAP | CLFLUSH_FLAGS) & PAGE_MASK);
-
-		cache->vaddr = flushes | KMAP;
-		cache->node.mm = (void *)obj;
-		if (flushes)
-			mb();
-	}
-
-	vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, page));
-	cache->vaddr = unmask_flags(cache->vaddr) | (unsigned long)vaddr;
-	cache->page = page;
-
-	return vaddr;
-}
-
-static void *reloc_iomap(struct drm_i915_gem_object *obj,
-			 struct reloc_cache *cache,
-			 unsigned long page)
-{
-	struct i915_ggtt *ggtt = cache_to_ggtt(cache);
-	unsigned long offset;
-	void *vaddr;
-
-	if (cache->vaddr) {
-		intel_gt_flush_ggtt_writes(ggtt->vm.gt);
-		io_mapping_unmap_atomic((void __force __iomem *) unmask_page(cache->vaddr));
-	} else {
-		struct i915_vma *vma;
-		int err;
-
-		if (i915_gem_object_is_tiled(obj))
-			return ERR_PTR(-EINVAL);
-
-		if (use_cpu_reloc(cache, obj))
-			return NULL;
-
-		i915_gem_object_lock(obj);
-		err = i915_gem_object_set_to_gtt_domain(obj, true);
-		i915_gem_object_unlock(obj);
-		if (err)
-			return ERR_PTR(err);
-
-		vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
-					       PIN_MAPPABLE |
-					       PIN_NONBLOCK /* NOWARN */ |
-					       PIN_NOEVICT);
-		if (IS_ERR(vma)) {
-			memset(&cache->node, 0, sizeof(cache->node));
-			mutex_lock(&ggtt->vm.mutex);
-			err = drm_mm_insert_node_in_range
-				(&ggtt->vm.mm, &cache->node,
-				 PAGE_SIZE, 0, I915_COLOR_UNEVICTABLE,
-				 0, ggtt->mappable_end,
-				 DRM_MM_INSERT_LOW);
-			mutex_unlock(&ggtt->vm.mutex);
-			if (err) /* no inactive aperture space, use cpu reloc */
-				return NULL;
-		} else {
-			cache->node.start = vma->node.start;
-			cache->node.mm = (void *)vma;
-		}
-	}
-
-	offset = cache->node.start;
-	if (drm_mm_node_allocated(&cache->node)) {
-		ggtt->vm.insert_page(&ggtt->vm,
-				     i915_gem_object_get_dma_address(obj, page),
-				     offset, I915_CACHE_NONE, 0);
-	} else {
-		offset += page << PAGE_SHIFT;
-	}
-
-	vaddr = (void __force *)io_mapping_map_atomic_wc(&ggtt->iomap,
-							 offset);
-	cache->page = page;
-	cache->vaddr = (unsigned long)vaddr;
-
-	return vaddr;
-}
-
-static void *reloc_vaddr(struct drm_i915_gem_object *obj,
-			 struct reloc_cache *cache,
-			 unsigned long page)
-{
-	void *vaddr;
-
-	if (cache->page == page) {
-		vaddr = unmask_page(cache->vaddr);
-	} else {
-		vaddr = NULL;
-		if ((cache->vaddr & KMAP) == 0)
-			vaddr = reloc_iomap(obj, cache, page);
-		if (!vaddr)
-			vaddr = reloc_kmap(obj, cache, page);
-	}
-
-	return vaddr;
-}
-
-static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
-{
-	if (unlikely(flushes & (CLFLUSH_BEFORE | CLFLUSH_AFTER))) {
-		if (flushes & CLFLUSH_BEFORE) {
-			clflushopt(addr);
-			mb();
-		}
-
-		*addr = value;
-
-		/*
-		 * Writes to the same cacheline are serialised by the CPU
-		 * (including clflush). On the write path, we only require
-		 * that it hits memory in an orderly fashion and place
-		 * mb barriers at the start and end of the relocation phase
-		 * to ensure ordering of clflush wrt to the system.
-		 */
-		if (flushes & CLFLUSH_AFTER)
-			clflushopt(addr);
-	} else
-		*addr = value;
-}
-
 static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
@@ -1429,17 +1226,6 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
 	return cmd;
 }
 
-static inline bool use_reloc_gpu(struct i915_vma *vma)
-{
-	if (DBG_FORCE_RELOC == FORCE_GPU_RELOC)
-		return true;
-
-	if (DBG_FORCE_RELOC)
-		return false;
-
-	return !dma_resv_test_signaled_rcu(vma->resv, true);
-}
-
 static unsigned long vma_phys_addr(struct i915_vma *vma, u32 offset)
 {
 	struct page *page;
@@ -1454,10 +1240,10 @@ static unsigned long vma_phys_addr(struct i915_vma *vma, u32 offset)
 	return addr + offset_in_page(offset);
 }
 
-static bool __reloc_entry_gpu(struct i915_execbuffer *eb,
-			      struct i915_vma *vma,
-			      u64 offset,
-			      u64 target_addr)
+static int __reloc_entry_gpu(struct i915_execbuffer *eb,
+			     struct i915_vma *vma,
+			     u64 offset,
+			     u64 target_addr)
 {
 	const unsigned int gen = eb->reloc_cache.gen;
 	unsigned int len;
@@ -1473,7 +1259,7 @@ static bool __reloc_entry_gpu(struct i915_execbuffer *eb,
 
 	batch = reloc_gpu(eb, vma, len);
 	if (IS_ERR(batch))
-		return false;
+		return PTR_ERR(batch);
 
 	addr = gen8_canonical_addr(vma->node.start + offset);
 	if (gen >= 8) {
@@ -1522,55 +1308,21 @@ static bool __reloc_entry_gpu(struct i915_execbuffer *eb,
 		*batch++ = target_addr;
 	}
 
-	return true;
-}
-
-static bool reloc_entry_gpu(struct i915_execbuffer *eb,
-			    struct i915_vma *vma,
-			    u64 offset,
-			    u64 target_addr)
-{
-	if (eb->reloc_cache.vaddr)
-		return false;
-
-	if (!use_reloc_gpu(vma))
-		return false;
-
-	return __reloc_entry_gpu(eb, vma, offset, target_addr);
+	return 0;
 }
 
 static u64
-relocate_entry(struct i915_vma *vma,
+relocate_entry(struct i915_execbuffer *eb,
+	       struct i915_vma *vma,
 	       const struct drm_i915_gem_relocation_entry *reloc,
-	       struct i915_execbuffer *eb,
 	       const struct i915_vma *target)
 {
 	u64 target_addr = relocation_target(reloc, target);
-	u64 offset = reloc->offset;
-
-	if (!reloc_entry_gpu(eb, vma, offset, target_addr)) {
-		bool wide = eb->reloc_cache.use_64bit_reloc;
-		void *vaddr;
-
-repeat:
-		vaddr = reloc_vaddr(vma->obj,
-				    &eb->reloc_cache,
-				    offset >> PAGE_SHIFT);
-		if (IS_ERR(vaddr))
-			return PTR_ERR(vaddr);
-
-		GEM_BUG_ON(!IS_ALIGNED(offset, sizeof(u32)));
-		clflush_write32(vaddr + offset_in_page(offset),
-				lower_32_bits(target_addr),
-				eb->reloc_cache.vaddr);
-
-		if (wide) {
-			offset += sizeof(u32);
-			target_addr >>= 32;
-			wide = false;
-			goto repeat;
-		}
-	}
+	int err;
+
+	err = __reloc_entry_gpu(eb, vma, reloc->offset, target_addr);
+	if (err)
+		return err;
 
 	return target->node.start | UPDATE;
 }
@@ -1635,8 +1387,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 	 * If the relocation already has the right value in it, no
 	 * more work needs to be done.
 	 */
-	if (!DBG_FORCE_RELOC &&
-	    gen8_canonical_addr(target->vma->node.start) == reloc->presumed_offset)
+	if (gen8_canonical_addr(target->vma->node.start) == reloc->presumed_offset)
 		return 0;
 
 	/* Check that the relocation address is valid... */
@@ -1668,7 +1419,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 	ev->flags &= ~EXEC_OBJECT_ASYNC;
 
 	/* and update the user's relocation entry */
-	return relocate_entry(ev->vma, reloc, eb, target->vma);
+	return relocate_entry(eb, ev->vma, reloc, target->vma);
 }
 
 static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
@@ -1706,10 +1457,8 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
 		 * this is bad and so lockdep complains vehemently.
 		 */
 		copied = __copy_from_user(r, urelocs, count * sizeof(r[0]));
-		if (unlikely(copied)) {
-			remain = -EFAULT;
-			goto out;
-		}
+		if (unlikely(copied))
+			return -EFAULT;
 
 		remain -= count;
 		do {
@@ -1717,8 +1466,7 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
 
 			if (likely(offset == 0)) {
 			} else if ((s64)offset < 0) {
-				remain = (int)offset;
-				goto out;
+				return (int)offset;
 			} else {
 				/*
 				 * Note that reporting an error now
@@ -1748,9 +1496,8 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
 		} while (r++, --count);
 		urelocs += ARRAY_SIZE(stack);
 	} while (remain);
-out:
-	reloc_cache_reset(&eb->reloc_cache);
-	return remain;
+
+	return 0;
 }
 
 static int eb_relocate(struct i915_execbuffer *eb)
@@ -2618,7 +2365,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.i915 = i915;
 	eb.file = file;
 	eb.args = args;
-	if (DBG_FORCE_RELOC || !(args->flags & I915_EXEC_NO_RELOC))
+	if (!(args->flags & I915_EXEC_NO_RELOC))
 		args->flags |= __EXEC_HAS_RELOC;
 
 	eb.exec = exec;
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
index a49016f8ee0d..57c14d3340cd 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
@@ -37,20 +37,14 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 		return err;
 
 	/* 8-Byte aligned */
-	if (!__reloc_entry_gpu(eb, vma,
-			       offsets[0] * sizeof(u32),
-			       0)) {
-		err = -EIO;
+	err = __reloc_entry_gpu(eb, vma, offsets[0] * sizeof(u32), 0);
+	if (err)
 		goto unpin_vma;
-	}
 
 	/* !8-Byte aligned */
-	if (!__reloc_entry_gpu(eb, vma,
-			       offsets[1] * sizeof(u32),
-			       1)) {
-		err = -EIO;
+	err = __reloc_entry_gpu(eb, vma, offsets[1] * sizeof(u32), 1);
+	if (err)
 		goto unpin_vma;
-	}
 
 	/* Skip to the end of the cmd page */
 	i = PAGE_SIZE / sizeof(u32) - RELOC_TAIL - 1;
@@ -60,12 +54,9 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 	eb->reloc_cache.rq_size += i;
 
 	/* Force batch chaining */
-	if (!__reloc_entry_gpu(eb, vma,
-			       offsets[2] * sizeof(u32),
-			       2)) {
-		err = -EIO;
+	err = __reloc_entry_gpu(eb, vma, offsets[2] * sizeof(u32), 2);
+	if (err)
 		goto unpin_vma;
-	}
 
 	GEM_BUG_ON(!eb->reloc_cache.rq);
 	rq = i915_request_get(eb->reloc_cache.rq);
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 10/12] drm/i915/gem: Lift GPU relocation allocation
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (7 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 09/12] drm/i915/gem: Async GPU relocations only Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 11/12] drm/i915/gem: Add all GPU reloc awaits/signals en masse Chris Wilson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Since we have reduced the relocations paths to just use the async GPU,
we can lift the request allocation to the start of the relocations.
Knowing that we use one request for all relocations will simplify
tracking the relocation fence.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 53 +++++++++----------
 .../i915/gem/selftests/i915_gem_execbuffer.c  |  4 ++
 2 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index e840b2a85284..903805455af1 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -900,8 +900,6 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle)
 
 static void eb_destroy(const struct i915_execbuffer *eb)
 {
-	GEM_BUG_ON(eb->reloc_cache.rq);
-
 	if (eb->array)
 		eb_vma_array_put(eb->array);
 
@@ -926,7 +924,6 @@ static void reloc_cache_init(struct reloc_cache *cache,
 	cache->has_fence = cache->gen < 4;
 	cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
 	cache->node.flags = 0;
-	cache->rq = NULL;
 	cache->target = NULL;
 }
 
@@ -1081,9 +1078,8 @@ static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
 	return err;
 }
 
-static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
-			     struct intel_engine_cs *engine,
-			     unsigned int len)
+static int
+__reloc_gpu_alloc(struct i915_execbuffer *eb, struct intel_engine_cs *engine)
 {
 	struct reloc_cache *cache = &eb->reloc_cache;
 	struct intel_gt_buffer_pool_node *pool;
@@ -1173,11 +1169,6 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 	return err;
 }
 
-static bool reloc_can_use_engine(const struct intel_engine_cs *engine)
-{
-	return engine->class != VIDEO_DECODE_CLASS || !IS_GEN(engine->i915, 6);
-}
-
 static u32 *reloc_gpu(struct i915_execbuffer *eb,
 		      struct i915_vma *vma,
 		      unsigned int len)
@@ -1186,20 +1177,6 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
 	u32 *cmd;
 	int err;
 
-	if (unlikely(!cache->rq)) {
-		struct intel_engine_cs *engine = eb->engine;
-
-		if (!reloc_can_use_engine(engine)) {
-			engine = engine->gt->engine_class[COPY_ENGINE_CLASS][0];
-			if (!engine)
-				return ERR_PTR(-ENODEV);
-		}
-
-		err = __reloc_gpu_alloc(eb, engine, len);
-		if (unlikely(err))
-			return ERR_PTR(err);
-	}
-
 	if (vma != cache->target) {
 		err = reloc_move_to_gpu(cache->rq, vma);
 		if (unlikely(err)) {
@@ -1500,6 +1477,24 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
 	return 0;
 }
 
+static bool reloc_can_use_engine(const struct intel_engine_cs *engine)
+{
+	return engine->class != VIDEO_DECODE_CLASS || !IS_GEN(engine->i915, 6);
+}
+
+static int reloc_gpu_alloc(struct i915_execbuffer *eb)
+{
+	struct intel_engine_cs *engine = eb->engine;
+
+	if (!reloc_can_use_engine(engine)) {
+		engine = engine->gt->engine_class[COPY_ENGINE_CLASS][0];
+		if (!engine)
+			return -ENODEV;
+	}
+
+	return __reloc_gpu_alloc(eb, engine);
+}
+
 static int eb_relocate(struct i915_execbuffer *eb)
 {
 	int err;
@@ -1519,6 +1514,11 @@ static int eb_relocate(struct i915_execbuffer *eb)
 		struct eb_vma *ev;
 		int flush;
 
+		err = reloc_gpu_alloc(eb);
+		if (err)
+			return err;
+		GEM_BUG_ON(!eb->reloc_cache.rq);
+
 		list_for_each_entry(ev, &eb->relocs, reloc_link) {
 			err = eb_relocate_vma(eb, ev);
 			if (err)
@@ -2490,9 +2490,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		batch = vma;
 	}
 
-	/* All GPU relocation batches must be submitted prior to the user rq */
-	GEM_BUG_ON(eb.reloc_cache.rq);
-
 	/* Allocate a request for this batch buffer nice and early. */
 	eb.request = i915_request_create(eb.context);
 	if (IS_ERR(eb.request)) {
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
index 57c14d3340cd..f0ec23632a78 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
@@ -36,6 +36,10 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 	if (err)
 		return err;
 
+	err = reloc_gpu_alloc(eb);
+	if (err)
+		goto unpin_vma;
+
 	/* 8-Byte aligned */
 	err = __reloc_entry_gpu(eb, vma, offsets[0] * sizeof(u32), 0);
 	if (err)
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 11/12] drm/i915/gem: Add all GPU reloc awaits/signals en masse
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (8 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 10/12] drm/i915/gem: Lift GPU relocation allocation Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 12/12] drm/i915/gem: Make relocations atomic within execbuf Chris Wilson
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Asynchronous waits and signaling form a traditional semaphore with all
the usual ordering problems with taking multiple locks. If we want to
add more than one wait on a shared resource by the GPU, we must ensure
that all the associated timelines are advanced atomically, ergo we must
lock all the timelines en masse.

Testcase: igt/gem_exec_reloc/basic-concurrent16
Fixes: 0e97fbb08055 ("drm/i915/gem: Use a single chained reloc batches for a single execbuf")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1889
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 106 ++++++++++++------
 .../i915/gem/selftests/i915_gem_execbuffer.c  |  24 ++--
 2 files changed, 86 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 903805455af1..be6fb2ebf8cc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -259,7 +259,6 @@ struct i915_execbuffer {
 		bool has_fence : 1;
 		bool needs_unfenced : 1;
 
-		struct i915_vma *target;
 		struct i915_request *rq;
 		struct i915_vma *rq_vma;
 		u32 *rq_cmd;
@@ -924,7 +923,6 @@ static void reloc_cache_init(struct reloc_cache *cache,
 	cache->has_fence = cache->gen < 4;
 	cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
 	cache->node.flags = 0;
-	cache->target = NULL;
 }
 
 static inline void *unmask_page(unsigned long p)
@@ -1058,26 +1056,6 @@ static int reloc_gpu_flush(struct reloc_cache *cache)
 	return err;
 }
 
-static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
-{
-	struct drm_i915_gem_object *obj = vma->obj;
-	int err;
-
-	i915_vma_lock(vma);
-
-	if (obj->cache_dirty & ~obj->cache_coherent)
-		i915_gem_clflush_object(obj, 0);
-	obj->write_domain = 0;
-
-	err = i915_request_await_object(rq, vma->obj, true);
-	if (err == 0)
-		err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
-
-	i915_vma_unlock(vma);
-
-	return err;
-}
-
 static int
 __reloc_gpu_alloc(struct i915_execbuffer *eb, struct intel_engine_cs *engine)
 {
@@ -1177,16 +1155,6 @@ static u32 *reloc_gpu(struct i915_execbuffer *eb,
 	u32 *cmd;
 	int err;
 
-	if (vma != cache->target) {
-		err = reloc_move_to_gpu(cache->rq, vma);
-		if (unlikely(err)) {
-			i915_request_set_error_once(cache->rq, err);
-			return ERR_PTR(err);
-		}
-
-		cache->target = vma;
-	}
-
 	if (unlikely(cache->rq_size + len >
 		     PAGE_SIZE / sizeof(u32) - RELOC_TAIL)) {
 		err = reloc_gpu_chain(cache);
@@ -1477,6 +1445,74 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct eb_vma *ev)
 	return 0;
 }
 
+static int reloc_move_to_gpu(struct reloc_cache *cache, struct eb_vma *ev)
+{
+	struct i915_request *rq = cache->rq;
+	struct i915_vma *vma = ev->vma;
+	struct drm_i915_gem_object *obj = vma->obj;
+	int err;
+
+	if (obj->cache_dirty & ~obj->cache_coherent)
+		i915_gem_clflush_object(obj, 0);
+
+	obj->write_domain = I915_GEM_DOMAIN_RENDER;
+	obj->read_domains = I915_GEM_DOMAIN_RENDER;
+
+	err = i915_request_await_object(rq, obj, true);
+	if (err == 0) {
+		dma_resv_add_excl_fence(vma->resv, &rq->fence);
+		err = __i915_vma_move_to_active(vma, rq);
+	}
+
+	return err;
+}
+
+static int
+lock_relocs(struct i915_execbuffer *eb)
+{
+	struct ww_acquire_ctx acquire;
+	struct eb_vma *ev;
+	int err = 0;
+
+	ww_acquire_init(&acquire, &reservation_ww_class);
+
+	list_for_each_entry(ev, &eb->relocs, reloc_link) {
+		struct i915_vma *vma = ev->vma;
+
+		err = ww_mutex_lock_interruptible(&vma->resv->lock, &acquire);
+		if (err == -EDEADLK) {
+			struct eb_vma *unlock = ev, *en;
+
+			list_for_each_entry_safe_continue_reverse(unlock, en,
+								  &eb->relocs,
+								  reloc_link) {
+				ww_mutex_unlock(&unlock->vma->resv->lock);
+				list_move_tail(&unlock->reloc_link,
+					       &eb->relocs);
+			}
+
+			GEM_BUG_ON(!list_is_first(&ev->reloc_link,
+						  &eb->relocs));
+			err = ww_mutex_lock_slow_interruptible(&vma->resv->lock,
+							       &acquire);
+		}
+		if (err)
+			break;
+	}
+
+	ww_acquire_done(&acquire);
+
+	list_for_each_entry_continue_reverse(ev, &eb->relocs, reloc_link) {
+		if (err == 0)
+			err = reloc_move_to_gpu(&eb->reloc_cache, ev);
+		ww_mutex_unlock(&ev->vma->resv->lock);
+	}
+
+	ww_acquire_fini(&acquire);
+
+	return err;
+}
+
 static bool reloc_can_use_engine(const struct intel_engine_cs *engine)
 {
 	return engine->class != VIDEO_DECODE_CLASS || !IS_GEN(engine->i915, 6);
@@ -1519,10 +1555,10 @@ static int eb_relocate(struct i915_execbuffer *eb)
 			return err;
 		GEM_BUG_ON(!eb->reloc_cache.rq);
 
+		err = lock_relocs(eb);
 		list_for_each_entry(ev, &eb->relocs, reloc_link) {
-			err = eb_relocate_vma(eb, ev);
-			if (err)
-				break;
+			if (err == 0)
+				err = eb_relocate_vma(eb, ev);
 		}
 
 		flush = reloc_gpu_flush(&eb->reloc_cache);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
index f0ec23632a78..8b06609e260e 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
@@ -24,15 +24,15 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 		GENMASK_ULL(eb->reloc_cache.use_64bit_reloc ? 63 : 31, 0);
 	const u32 *map = page_mask_bits(obj->mm.mapping);
 	struct i915_request *rq;
-	struct i915_vma *vma;
+	struct eb_vma ev;
 	int err;
 	int i;
 
-	vma = i915_vma_instance(obj, eb->context->vm, NULL);
-	if (IS_ERR(vma))
-		return PTR_ERR(vma);
+	ev.vma = i915_vma_instance(obj, eb->context->vm, NULL);
+	if (IS_ERR(ev.vma))
+		return PTR_ERR(ev.vma);
 
-	err = i915_vma_pin(vma, 0, 0, PIN_USER | PIN_HIGH);
+	err = i915_vma_pin(ev.vma, 0, 0, PIN_USER | PIN_HIGH);
 	if (err)
 		return err;
 
@@ -40,13 +40,19 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 	if (err)
 		goto unpin_vma;
 
+	i915_vma_lock(ev.vma);
+	err = reloc_move_to_gpu(&eb->reloc_cache, &ev);
+	i915_vma_unlock(ev.vma);
+	if (err)
+		goto unpin_vma;
+
 	/* 8-Byte aligned */
-	err = __reloc_entry_gpu(eb, vma, offsets[0] * sizeof(u32), 0);
+	err = __reloc_entry_gpu(eb, ev.vma, offsets[0] * sizeof(u32), 0);
 	if (err)
 		goto unpin_vma;
 
 	/* !8-Byte aligned */
-	err = __reloc_entry_gpu(eb, vma, offsets[1] * sizeof(u32), 1);
+	err = __reloc_entry_gpu(eb, ev.vma, offsets[1] * sizeof(u32), 1);
 	if (err)
 		goto unpin_vma;
 
@@ -58,7 +64,7 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 	eb->reloc_cache.rq_size += i;
 
 	/* Force batch chaining */
-	err = __reloc_entry_gpu(eb, vma, offsets[2] * sizeof(u32), 2);
+	err = __reloc_entry_gpu(eb, ev.vma, offsets[2] * sizeof(u32), 2);
 	if (err)
 		goto unpin_vma;
 
@@ -96,7 +102,7 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 put_rq:
 	i915_request_put(rq);
 unpin_vma:
-	i915_vma_unpin(vma);
+	i915_vma_unpin(ev.vma);
 	return err;
 }
 
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] [PATCH 12/12] drm/i915/gem: Make relocations atomic within execbuf
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (9 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 11/12] drm/i915/gem: Add all GPU reloc awaits/signals en masse Chris Wilson
@ 2020-05-25  7:53 ` Chris Wilson
  2020-05-25  8:03 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Patchwork
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-05-25  7:53 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Although we may chide userspace for reusing the same batches
concurrently from multiple threads, at the same time we must be very
careful to only execute the batch and its relocations as supplied by the
user. If we are not careful, we may allow another thread to rewrite the
current batch with its own relocations. We must order the relocations
and their batch such that they are an atomic pair on the GPU, and that
the ioctl itself appears atomic to userspace. The order of execution may
be undetermined, but it will not be subverted.

We could do this by moving the relocations into the main request, if it
were not for the situation where we need a second engine to perform the
relocations for us. Instead, we use the dependency tracking to only
publish the write fence on the main request and not on the relocation
request, so that concurrent updates are queued after the batch has
consumed its relocations.

Testcase: igt/gem_exec_reloc/basic-concurrent
Fixes: ef398881d27d ("drm/i915/gem: Limit struct_mutex to eb_reserve")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 103 ++++++++++++------
 .../i915/gem/selftests/i915_gem_execbuffer.c  |  11 +-
 2 files changed, 74 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index be6fb2ebf8cc..854edc67a82b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -5,6 +5,7 @@
  */
 
 #include <linux/intel-iommu.h>
+#include <linux/dma-fence-proxy.h>
 #include <linux/dma-resv.h>
 #include <linux/sync_file.h>
 #include <linux/uaccess.h>
@@ -259,6 +260,8 @@ struct i915_execbuffer {
 		bool has_fence : 1;
 		bool needs_unfenced : 1;
 
+		struct dma_fence *fence;
+
 		struct i915_request *rq;
 		struct i915_vma *rq_vma;
 		u32 *rq_cmd;
@@ -555,16 +558,6 @@ eb_add_vma(struct i915_execbuffer *eb,
 	ev->exec = entry;
 	ev->flags = entry->flags;
 
-	if (eb->lut_size > 0) {
-		ev->handle = entry->handle;
-		hlist_add_head(&ev->node,
-			       &eb->buckets[hash_32(entry->handle,
-						    eb->lut_size)]);
-	}
-
-	if (entry->relocation_count)
-		list_add_tail(&ev->reloc_link, &eb->relocs);
-
 	/*
 	 * SNA is doing fancy tricks with compressing batch buffers, which leads
 	 * to negative relocation deltas. Usually that works out ok since the
@@ -581,9 +574,21 @@ eb_add_vma(struct i915_execbuffer *eb,
 		if (eb->reloc_cache.has_fence)
 			ev->flags |= EXEC_OBJECT_NEEDS_FENCE;
 
+		INIT_LIST_HEAD(&ev->reloc_link);
+
 		eb->batch = ev;
 	}
 
+	if (entry->relocation_count)
+		list_add_tail(&ev->reloc_link, &eb->relocs);
+
+	if (eb->lut_size > 0) {
+		ev->handle = entry->handle;
+		hlist_add_head(&ev->node,
+			       &eb->buckets[hash_32(entry->handle,
+						    eb->lut_size)]);
+	}
+
 	if (eb_pin_vma(eb, entry, ev)) {
 		if (entry->offset != vma->node.start) {
 			entry->offset = vma->node.start | UPDATE;
@@ -923,6 +928,7 @@ static void reloc_cache_init(struct reloc_cache *cache,
 	cache->has_fence = cache->gen < 4;
 	cache->needs_unfenced = INTEL_INFO(i915)->unfenced_needs_alignment;
 	cache->node.flags = 0;
+	cache->fence = NULL;
 }
 
 static inline void *unmask_page(unsigned long p)
@@ -1021,13 +1027,9 @@ static unsigned int reloc_bb_flags(const struct reloc_cache *cache)
 
 static int reloc_gpu_flush(struct reloc_cache *cache)
 {
-	struct i915_request *rq;
+	struct i915_request *rq = cache->rq;
 	int err;
 
-	rq = fetch_and_zero(&cache->rq);
-	if (!rq)
-		return 0;
-
 	if (cache->rq_vma) {
 		struct drm_i915_gem_object *obj = cache->rq_vma->obj;
 
@@ -1051,6 +1053,7 @@ static int reloc_gpu_flush(struct reloc_cache *cache)
 		i915_request_set_error_once(rq, err);
 
 	intel_gt_chipset_flush(rq->engine->gt);
+	i915_request_get(rq);
 	i915_request_add(rq);
 
 	return err;
@@ -1353,16 +1356,6 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 		return -EINVAL;
 	}
 
-	/*
-	 * If we write into the object, we need to force the synchronisation
-	 * barrier, either with an asynchronous clflush or if we executed the
-	 * patching using the GPU (though that should be serialised by the
-	 * timeline). To be completely sure, and since we are required to
-	 * do relocations we are already stalling, disable the user's opt
-	 * out of our synchronisation.
-	 */
-	ev->flags &= ~EXEC_OBJECT_ASYNC;
-
 	/* and update the user's relocation entry */
 	return relocate_entry(eb, ev->vma, reloc, target->vma);
 }
@@ -1457,10 +1450,11 @@ static int reloc_move_to_gpu(struct reloc_cache *cache, struct eb_vma *ev)
 
 	obj->write_domain = I915_GEM_DOMAIN_RENDER;
 	obj->read_domains = I915_GEM_DOMAIN_RENDER;
+	ev->flags |= EXEC_OBJECT_ASYNC;
 
 	err = i915_request_await_object(rq, obj, true);
 	if (err == 0) {
-		dma_resv_add_excl_fence(vma->resv, &rq->fence);
+		dma_resv_add_excl_fence(vma->resv, cache->fence);
 		err = __i915_vma_move_to_active(vma, rq);
 	}
 
@@ -1531,7 +1525,15 @@ static int reloc_gpu_alloc(struct i915_execbuffer *eb)
 	return __reloc_gpu_alloc(eb, engine);
 }
 
-static int eb_relocate(struct i915_execbuffer *eb)
+static void free_reloc_fence(struct i915_execbuffer *eb)
+{
+	struct dma_fence *f = fetch_and_zero(&eb->reloc_cache.fence);
+
+	dma_fence_signal(f);
+	dma_fence_put(f);
+}
+
+static int eb_reloc(struct i915_execbuffer *eb)
 {
 	int err;
 
@@ -1550,9 +1552,15 @@ static int eb_relocate(struct i915_execbuffer *eb)
 		struct eb_vma *ev;
 		int flush;
 
+		eb->reloc_cache.fence = __dma_fence_create_proxy(0, 0);
+		if (!eb->reloc_cache.fence)
+			return -ENOMEM;
+
 		err = reloc_gpu_alloc(eb);
-		if (err)
+		if (err) {
+			free_reloc_fence(eb);
 			return err;
+		}
 		GEM_BUG_ON(!eb->reloc_cache.rq);
 
 		err = lock_relocs(eb);
@@ -1560,6 +1568,7 @@ static int eb_relocate(struct i915_execbuffer *eb)
 			if (err == 0)
 				err = eb_relocate_vma(eb, ev);
 		}
+		GEM_BUG_ON(dma_fence_is_signaled(eb->reloc_cache.fence));
 
 		flush = reloc_gpu_flush(&eb->reloc_cache);
 		if (!err)
@@ -1569,6 +1578,15 @@ static int eb_relocate(struct i915_execbuffer *eb)
 	return err;
 }
 
+static void eb_reloc_signal(struct i915_execbuffer *eb, struct i915_request *rq)
+{
+	dma_fence_proxy_set_real(eb->reloc_cache.fence, &rq->fence);
+	i915_request_put(eb->reloc_cache.rq);
+
+	dma_fence_put(eb->reloc_cache.fence);
+	eb->reloc_cache.fence = NULL;
+}
+
 static int eb_move_to_gpu(struct i915_execbuffer *eb)
 {
 	const unsigned int count = eb->buffer_count;
@@ -1812,10 +1830,15 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
 	if (err)
 		goto err_batch_unlock;
 
-	/* Wait for all writes (and relocs) into the batch to complete */
-	err = i915_sw_fence_await_reservation(&pw->base.chain,
-					      pw->batch->resv, NULL, false,
-					      0, I915_FENCE_GFP);
+	/* Wait for all writes (or relocs) into the batch to complete */
+	if (!eb->reloc_cache.fence || list_empty(&eb->batch->reloc_link))
+		err = i915_sw_fence_await_reservation(&pw->base.chain,
+						      pw->batch->resv, NULL,
+						      false, 0, I915_FENCE_GFP);
+	else
+		err = i915_sw_fence_await_dma_fence(&pw->base.chain,
+						    &eb->reloc_cache.rq->fence,
+						    0, I915_FENCE_GFP);
 	if (err < 0)
 		goto err_batch_unlock;
 
@@ -1940,6 +1963,15 @@ static int eb_submit(struct i915_execbuffer *eb, struct i915_vma *batch)
 {
 	int err;
 
+	if (eb->reloc_cache.fence) {
+		err = i915_request_await_dma_fence(eb->request,
+						   &eb->reloc_cache.rq->fence);
+		if (err)
+			return err;
+
+		eb_reloc_signal(eb, eb->request);
+	}
+
 	err = eb_move_to_gpu(eb);
 	if (err)
 		return err;
@@ -2464,7 +2496,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (unlikely(err))
 		goto err_context;
 
-	err = eb_relocate(&eb);
+	err = eb_reloc(&eb);
 	if (err) {
 		/*
 		 * If the user expects the execobject.offset and
@@ -2497,7 +2529,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
 	err = eb_parse(&eb);
 	if (err)
-		goto err_vma;
+		goto err_reloc;
 
 	/*
 	 * snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
@@ -2598,6 +2630,9 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_parse:
 	if (batch->private)
 		intel_gt_buffer_pool_put(batch->private);
+err_reloc:
+	if (eb.reloc_cache.fence)
+		eb_reloc_signal(&eb, eb.reloc_cache.rq);
 err_vma:
 	if (eb.trampoline)
 		i915_vma_unpin(eb.trampoline);
diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
index 8b06609e260e..e0900846de38 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_execbuffer.c
@@ -23,7 +23,6 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 	const u64 mask =
 		GENMASK_ULL(eb->reloc_cache.use_64bit_reloc ? 63 : 31, 0);
 	const u32 *map = page_mask_bits(obj->mm.mapping);
-	struct i915_request *rq;
 	struct eb_vma ev;
 	int err;
 	int i;
@@ -40,6 +39,8 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 	if (err)
 		goto unpin_vma;
 
+	eb->reloc_cache.fence = &eb->reloc_cache.rq->fence;
+
 	i915_vma_lock(ev.vma);
 	err = reloc_move_to_gpu(&eb->reloc_cache, &ev);
 	i915_vma_unlock(ev.vma);
@@ -68,12 +69,9 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 	if (err)
 		goto unpin_vma;
 
-	GEM_BUG_ON(!eb->reloc_cache.rq);
-	rq = i915_request_get(eb->reloc_cache.rq);
 	err = reloc_gpu_flush(&eb->reloc_cache);
 	if (err)
 		goto put_rq;
-	GEM_BUG_ON(eb->reloc_cache.rq);
 
 	err = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE, HZ / 2);
 	if (err) {
@@ -81,7 +79,7 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 		goto put_rq;
 	}
 
-	if (!i915_request_completed(rq)) {
+	if (!i915_request_completed(eb->reloc_cache.rq)) {
 		pr_err("%s: did not wait for relocations!\n", eb->engine->name);
 		err = -EINVAL;
 		goto put_rq;
@@ -100,7 +98,8 @@ static int __igt_gpu_reloc(struct i915_execbuffer *eb,
 		igt_hexdump(map, 4096);
 
 put_rq:
-	i915_request_put(rq);
+	i915_request_put(eb->reloc_cache.rq);
+	eb->reloc_cache.rq = NULL;
 unpin_vma:
 	i915_vma_unpin(ev.vma);
 	return err;
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (10 preceding siblings ...)
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 12/12] drm/i915/gem: Make relocations atomic within execbuf Chris Wilson
@ 2020-05-25  8:03 ` Patchwork
  2020-05-25  8:04 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2020-05-25  8:03 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
URL   : https://patchwork.freedesktop.org/series/77627/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
e73ba84bc40f drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
b10c80198f71 drm/i915/gt: Cancel the flush worker more thoroughly
-:13: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#13: 
<0> [314.146044] i915_mod-1321    2.... 299799443us : intel_gt_fini_buffer_pool: intel_gt_fini_buffer_pool:227 GEM_BUG_ON(!list_empty(&pool->cache_list[n]))

total: 0 errors, 1 warnings, 0 checks, 10 lines checked
176bca8e32c4 drm/i915/gem: Suppress some random warnings
f248457f5a24 drm/i915/execlists: Shortcircuit queue_prio() for no internal levels
95ae398ab244 drm/i915: Improve execute_cb struct packing
fa5d2c8489c4 dma-buf: Proxy fence, an unsignaled fence placeholder
-:45: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#45: 
new file mode 100644

-:427: CHECK:UNCOMMENTED_DEFINITION: spinlock_t definition without comment
#427: FILE: drivers/dma-buf/st-dma-fence-proxy.c:20:
+	spinlock_t lock;

total: 0 errors, 1 warnings, 1 checks, 1147 lines checked
b8b1cf238ad3 drm/i915: Unpeel awaits on a proxy fence
01d3dc2a8e96 drm/i915: Add list_for_each_entry_safe_continue_reverse
-:20: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'pos' - possible side-effects?
#20: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)	\
+	for (pos = list_prev_entry(pos, member),			\
+	     n = list_prev_entry(pos, member);				\
+	     &pos->member != (head);					\
+	     pos = n, n = list_prev_entry(n, member))

-:20: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'n' - possible side-effects?
#20: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)	\
+	for (pos = list_prev_entry(pos, member),			\
+	     n = list_prev_entry(pos, member);				\
+	     &pos->member != (head);					\
+	     pos = n, n = list_prev_entry(n, member))

-:20: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible side-effects?
#20: FILE: drivers/gpu/drm/i915/i915_utils.h:269:
+#define list_for_each_entry_safe_continue_reverse(pos, n, head, member)	\
+	for (pos = list_prev_entry(pos, member),			\
+	     n = list_prev_entry(pos, member);				\
+	     &pos->member != (head);					\
+	     pos = n, n = list_prev_entry(n, member))

total: 0 errors, 0 warnings, 3 checks, 12 lines checked
aeb35c7b979b drm/i915/gem: Async GPU relocations only
df5c0d3ea02f drm/i915/gem: Lift GPU relocation allocation
c3e6e121ab28 drm/i915/gem: Add all GPU reloc awaits/signals en masse
a2f70c5c2f5c drm/i915/gem: Make relocations atomic within execbuf

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (11 preceding siblings ...)
  2020-05-25  8:03 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Patchwork
@ 2020-05-25  8:04 ` Patchwork
  2020-05-25  8:24 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
  2020-05-25 15:38 ` [Intel-gfx] [PATCH 01/12] " Mika Kuoppala
  14 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2020-05-25  8:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
URL   : https://patchwork.freedesktop.org/series/77627/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.0
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1222:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1225:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1228:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/display/intel_display.c:1231:22: error: Expected constant expression in case statement
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2274:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2275:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2276:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2277:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2278:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gem/i915_gem_context.c:2279:17: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/intel_reset.c:1310:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/sysfs_engines.c:61:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/sysfs_engines.c:62:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gt/sysfs_engines.c:66:10: error: bad integer constant expression
+drivers/gpu/drm/i915/gvt/mmio.c:287:23: warning: memcpy with byte count of 279040
+drivers/gpu/drm/i915/i915_perf.c:1425:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1479:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/intel_wakeref.c:137:19: warning: context imbalance in 'wakeref_auto_timeout' - unexpected unlock
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y
+./include/linux/compiler.h:199:9: warning: context imbalance in 'engines_sample' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen6_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:408:9: warning: context imbalance in 'gen8_write8' - different lock contexts for basic block

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (12 preceding siblings ...)
  2020-05-25  8:04 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
@ 2020-05-25  8:24 ` Patchwork
  2020-05-25 15:38 ` [Intel-gfx] [PATCH 01/12] " Mika Kuoppala
  14 siblings, 0 replies; 23+ messages in thread
From: Patchwork @ 2020-05-25  8:24 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
URL   : https://patchwork.freedesktop.org/series/77627/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8529 -> Patchwork_17767
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_17767 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17767, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17767/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_17767:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_busy@basic@flip:
    - fi-bwr-2160:        [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8529/fi-bwr-2160/igt@kms_busy@basic@flip.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17767/fi-bwr-2160/igt@kms_busy@basic@flip.html

  
New tests
---------

  New tests have been introduced between CI_DRM_8529 and Patchwork_17767:

### New IGT tests (1) ###

  * igt@dmabuf@all@dma_fence_proxy:
    - Statuses : 40 pass(s)
    - Exec time: [0.03, 0.10] s

  

Known issues
------------

  Here are the changes found in Patchwork_17767 that come from known issues:

### IGT changes ###

#### Possible fixes ####

  * igt@i915_selftest@live@coherency:
    - fi-bwr-2160:        [INCOMPLETE][3] ([i915#489]) -> [PASS][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8529/fi-bwr-2160/igt@i915_selftest@live@coherency.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17767/fi-bwr-2160/igt@i915_selftest@live@coherency.html

  
  [i915#489]: https://gitlab.freedesktop.org/drm/intel/issues/489


Participating hosts (45 -> 43)
------------------------------

  Additional (3): fi-hsw-4770 fi-kbl-7560u fi-cfl-guc 
  Missing    (5): fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_8529 -> Patchwork_17767

  CI-20190529: 20190529
  CI_DRM_8529: 9ae23ae1b437ee0d75ed2153eca05ecbd8c417bd @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5676: ff03d458f708583c8f9296f97c38df312055651a @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17767: a2f70c5c2f5c3d04890dad3d0b682597963e4622 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

a2f70c5c2f5c drm/i915/gem: Make relocations atomic within execbuf
c3e6e121ab28 drm/i915/gem: Add all GPU reloc awaits/signals en masse
df5c0d3ea02f drm/i915/gem: Lift GPU relocation allocation
aeb35c7b979b drm/i915/gem: Async GPU relocations only
01d3dc2a8e96 drm/i915: Add list_for_each_entry_safe_continue_reverse
b8b1cf238ad3 drm/i915: Unpeel awaits on a proxy fence
fa5d2c8489c4 dma-buf: Proxy fence, an unsignaled fence placeholder
95ae398ab244 drm/i915: Improve execute_cb struct packing
f248457f5a24 drm/i915/execlists: Shortcircuit queue_prio() for no internal levels
176bca8e32c4 drm/i915/gem: Suppress some random warnings
b10c80198f71 drm/i915/gt: Cancel the flush worker more thoroughly
e73ba84bc40f drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17767/index.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 02/12] drm/i915/gt: Cancel the flush worker more thoroughly
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 02/12] drm/i915/gt: Cancel the flush worker more thoroughly Chris Wilson
@ 2020-05-25 13:51   ` Mika Kuoppala
  2020-05-25 14:08     ` Chris Wilson
  0 siblings, 1 reply; 23+ messages in thread
From: Mika Kuoppala @ 2020-05-25 13:51 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Since the worker may rearm, we currently are only guaranteed to flush
> the work if we cancel the timer. If the work was running at the time we
> try and cancel it, we will wait for it to complete, but it may leave
> items in the pool and requeue the work. If we rearrange the immediate
> discard of the pool then cancel the work, we know that the work cannot
> rearm and so our flush will be final.
>
> <0> [314.146044] i915_mod-1321    2.... 299799443us : intel_gt_fini_buffer_pool: intel_gt_fini_buffer_pool:227 GEM_BUG_ON(!list_empty(&pool->cache_list[n]))
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> index 1495054a4305..418ae184cecf 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> @@ -212,8 +212,9 @@ void intel_gt_flush_buffer_pool(struct intel_gt *gt)
>  {
>  	struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
>  
> -	if (cancel_delayed_work_sync(&pool->work))
> +	do {
>  		pool_free_imm(pool);
> +	} while (cancel_delayed_work_sync(&pool->work));

Yeah changing of order makes sense. If you want
a guarantee that the finit goes as you expect, you
could add two cancel_delayed_work_sync and assert
that the final one return false.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

>  }
>  
>  void intel_gt_fini_buffer_pool(struct intel_gt *gt)
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 03/12] drm/i915/gem: Suppress some random warnings
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 03/12] drm/i915/gem: Suppress some random warnings Chris Wilson
@ 2020-05-25 13:55   ` Mika Kuoppala
  0 siblings, 0 replies; 23+ messages in thread
From: Mika Kuoppala @ 2020-05-25 13:55 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Leave the error propagation in place, but limit the warnings to only
> show up in CI if the unlikely errors are hit.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 3 +--
>  drivers/gpu/drm/i915/gem/i915_gem_phys.c       | 3 +--
>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c      | 3 +--
>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c    | 2 +-
>  4 files changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index e4fb6c372537..219a36995b96 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -1626,8 +1626,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
>  			err = i915_vma_bind(target->vma,
>  					    target->vma->obj->cache_level,
>  					    PIN_GLOBAL, NULL);
> -			if (drm_WARN_ONCE(&i915->drm, err,
> -				      "Unexpected failure to bind target VMA!"))
> +			if (err)
>  				return err;
>  		}
>  	}
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> index 4c1c7232b024..12245a47e5fb 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
> @@ -27,8 +27,7 @@ static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
>  	void *dst;
>  	int i;
>  
> -	if (drm_WARN_ON(obj->base.dev,
> -			i915_gem_object_needs_bit17_swizzle(obj)))
> +	if (GEM_WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
>  		return -EINVAL;
>  
>  	/*
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> index 7aff3514d97a..7cf8548ff708 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
> @@ -147,8 +147,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj)
>  		last_pfn = page_to_pfn(page);
>  
>  		/* Check that the i965g/gm workaround works. */
> -		drm_WARN_ON(&i915->drm,
> -			    (gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
> +		GEM_BUG_ON(gfp & __GFP_DMA32 && last_pfn >= 0x00100000UL);
>  	}
>  	if (sg) { /* loop terminated early; short sg table */
>  		sg_page_sizes |= sg->length;
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> index 8b0708708671..2226146b01c9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
> @@ -235,7 +235,7 @@ i915_gem_userptr_init__mmu_notifier(struct drm_i915_gem_object *obj,
>  	if (flags & I915_USERPTR_UNSYNCHRONIZED)
>  		return capable(CAP_SYS_ADMIN) ? 0 : -EPERM;
>  
> -	if (drm_WARN_ON(obj->base.dev, obj->userptr.mm == NULL))
> +	if (GEM_WARN_ON(!obj->userptr.mm))
>  		return -EINVAL;
>  
>  	mn = i915_mmu_notifier_find(obj->userptr.mm);
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 02/12] drm/i915/gt: Cancel the flush worker more thoroughly
  2020-05-25 13:51   ` Mika Kuoppala
@ 2020-05-25 14:08     ` Chris Wilson
  0 siblings, 0 replies; 23+ messages in thread
From: Chris Wilson @ 2020-05-25 14:08 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2020-05-25 14:51:18)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Since the worker may rearm, we currently are only guaranteed to flush
> > the work if we cancel the timer. If the work was running at the time we
> > try and cancel it, we will wait for it to complete, but it may leave
> > items in the pool and requeue the work. If we rearrange the immediate
> > discard of the pool then cancel the work, we know that the work cannot
> > rearm and so our flush will be final.
> >
> > <0> [314.146044] i915_mod-1321    2.... 299799443us : intel_gt_fini_buffer_pool: intel_gt_fini_buffer_pool:227 GEM_BUG_ON(!list_empty(&pool->cache_list[n]))
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > index 1495054a4305..418ae184cecf 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_gt_buffer_pool.c
> > @@ -212,8 +212,9 @@ void intel_gt_flush_buffer_pool(struct intel_gt *gt)
> >  {
> >       struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
> >  
> > -     if (cancel_delayed_work_sync(&pool->work))
> > +     do {
> >               pool_free_imm(pool);
> > +     } while (cancel_delayed_work_sync(&pool->work));
> 
> Yeah changing of order makes sense. If you want
> a guarantee that the finit goes as you expect, you
> could add two cancel_delayed_work_sync and assert
> that the final one return false.

We have an assert that the lists are empty after this function returns.
That's enough to keep me [un]happy :)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt
  2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
                   ` (13 preceding siblings ...)
  2020-05-25  8:24 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
@ 2020-05-25 15:38 ` Mika Kuoppala
  14 siblings, 0 replies; 23+ messages in thread
From: Mika Kuoppala @ 2020-05-25 15:38 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> In order to keep userptr distinct from ggtt mmaps in the eyes of
> lockdep, we need to avoid marking those userptr vma as PIN_GLOBAL. (So
> long as we comply with only using them as local PIN_USER!)
>
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/1880
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_ggtt.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> index 66165b10256e..8c275f8588c3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c
> +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c
> @@ -424,22 +424,17 @@ static int ggtt_bind_vma(struct i915_vma *vma,
>  	struct drm_i915_gem_object *obj = vma->obj;
>  	u32 pte_flags;
>  
> +	if (i915_vma_is_bound(vma, ~flags & I915_VMA_BIND_MASK))
> +		return 0;
> +
>  	/* Applicable to VLV (gen8+ do not support RO in the GGTT) */
>  	pte_flags = 0;
>  	if (i915_gem_object_is_readonly(obj))
>  		pte_flags |= PTE_READ_ONLY;
>  
>  	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
> -
>  	vma->page_sizes.gtt = I915_GTT_PAGE_SIZE;
>  
> -	/*
> -	 * Without aliasing PPGTT there's no difference between
> -	 * GLOBAL/LOCAL_BIND, it's all the same ptes. Hence unconditionally
> -	 * upgrade to both bound if we bind either to avoid double-binding.
> -	 */
> -	atomic_or(I915_VMA_GLOBAL_BIND | I915_VMA_LOCAL_BIND, &vma->flags);
> -
>  	return 0;
>  }
>  
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 05/12] drm/i915: Improve execute_cb struct packing
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 05/12] drm/i915: Improve execute_cb struct packing Chris Wilson
@ 2020-05-26 11:17   ` Mika Kuoppala
  0 siblings, 0 replies; 23+ messages in thread
From: Mika Kuoppala @ 2020-05-26 11:17 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Reduce the irq_work llist for attaching the callbacks to the signal for
> both smaller structs (two fewer pointers!) and simpler [debug] code:
>
> Function                                     old     new   delta
> irq_execute_cb                                35      34      -1
> __igt_breadcrumbs_smoketest                 1684    1682      -2
> i915_request_retire                         2003    1996      -7
> __i915_request_create                       1047    1040      -7
> __notify_execute_cb                          135     126      -9
> __i915_request_ctor                          188     178     -10
> __await_execution.part.constprop             451     440     -11
> igt_wait_request                             924     714    -210
>
> One minor artifact is that the order of cb exection is reversed. No
> current use cases are affected by that change.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_request.c | 18 +++++++++---------
>  drivers/gpu/drm/i915/i915_request.h |  2 +-
>  2 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index c282719ad3ac..22df5b229aed 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -42,7 +42,6 @@
>  #include "intel_pm.h"
>  
>  struct execute_cb {
> -	struct list_head link;
>  	struct irq_work work;
>  	struct i915_sw_fence *fence;
>  	void (*hook)(struct i915_request *rq, struct dma_fence *signal);
> @@ -189,14 +188,14 @@ static void irq_execute_cb_hook(struct irq_work *wrk)
>  
>  static void __notify_execute_cb(struct i915_request *rq)
>  {
> -	struct execute_cb *cb;
> +	struct execute_cb *cb, *cn;
>  
>  	lockdep_assert_held(&rq->lock);
>  
> -	if (list_empty(&rq->execute_cb))
> +	if (llist_empty(&rq->execute_cb))
>  		return;
>  
> -	list_for_each_entry(cb, &rq->execute_cb, link)
> +	llist_for_each_entry_safe(cb, cn, rq->execute_cb.first, work.llnode)
>  		irq_work_queue(&cb->work);
>  
>  	/*
> @@ -209,7 +208,7 @@ static void __notify_execute_cb(struct i915_request *rq)
>  	 * preempt-to-idle cycle on the target engine, all the while the
>  	 * master execute_cb may refire.
>  	 */
> -	INIT_LIST_HEAD(&rq->execute_cb);
> +	rq->execute_cb.first = NULL;
>  }
>  
>  static inline void
> @@ -327,7 +326,7 @@ bool i915_request_retire(struct i915_request *rq)
>  		set_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
>  		__notify_execute_cb(rq);
>  	}
> -	GEM_BUG_ON(!list_empty(&rq->execute_cb));
> +	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
>  	spin_unlock_irq(&rq->lock);
>  
>  	remove_from_client(rq);
> @@ -395,7 +394,8 @@ __await_execution(struct i915_request *rq,
>  		i915_sw_fence_complete(cb->fence);
>  		kmem_cache_free(global.slab_execute_cbs, cb);
>  	} else {
> -		list_add_tail(&cb->link, &signal->execute_cb);
> +		cb->work.llnode.next = signal->execute_cb.first;
> +		signal->execute_cb.first = &cb->work.llnode;

With this part giving more glues as of why can we do this,
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>


>  	}
>  	spin_unlock_irq(&signal->lock);
>  
> @@ -704,7 +704,7 @@ static void __i915_request_ctor(void *arg)
>  	rq->file_priv = NULL;
>  	rq->capture_list = NULL;
>  
> -	INIT_LIST_HEAD(&rq->execute_cb);
> +	init_llist_head(&rq->execute_cb);
>  }
>  
>  struct i915_request *
> @@ -794,7 +794,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
>  	rq->batch = NULL;
>  	GEM_BUG_ON(rq->file_priv);
>  	GEM_BUG_ON(rq->capture_list);
> -	GEM_BUG_ON(!list_empty(&rq->execute_cb));
> +	GEM_BUG_ON(!llist_empty(&rq->execute_cb));
>  
>  	/*
>  	 * Reserve space in the ring buffer for all the commands required to
> diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
> index 8ec7ee4dbadc..5d4709a3dace 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -214,7 +214,7 @@ struct i915_request {
>  			ktime_t emitted;
>  		} duration;
>  	};
> -	struct list_head execute_cb;
> +	struct llist_head execute_cb;
>  	struct i915_sw_fence semaphore;
>  
>  	/*
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 04/12] drm/i915/execlists: Shortcircuit queue_prio() for no internal levels
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 04/12] drm/i915/execlists: Shortcircuit queue_prio() for no internal levels Chris Wilson
@ 2020-05-26 11:17   ` Mika Kuoppala
  0 siblings, 0 replies; 23+ messages in thread
From: Mika Kuoppala @ 2020-05-26 11:17 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Chris Wilson

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If there are no internal levels and the user priority-shift is zero, we
> can help the compiler eliminate some dead code:
>
> Function                                     old     new   delta
> start_timeslice                              169     154     -15
> __execlists_submission_tasklet              4696    4659     -37
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> ---
>  drivers/gpu/drm/i915/gt/intel_lrc.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index de5be57ed6d2..3214a4ecc31a 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -446,6 +446,9 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
>  	 * we have to flip the index value to become priority.
>  	 */
>  	p = to_priolist(rb);
> +	if (!I915_USER_PRIORITY_SHIFT)
> +		return p->priority;
> +
>  	return ((p->priority + 1) << I915_USER_PRIORITY_SHIFT) - ffs(p->used);
>  }
>  
> -- 
> 2.20.1
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 06/12] dma-buf: Proxy fence, an unsignaled fence placeholder
  2020-05-25  7:53 ` [Intel-gfx] [PATCH 06/12] dma-buf: Proxy fence, an unsignaled fence placeholder Chris Wilson
@ 2020-05-30  5:53     ` kbuild test robot
  0 siblings, 0 replies; 23+ messages in thread
From: kbuild test robot @ 2020-05-30  5:53 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: kbuild-all, Chris Wilson

Hi Chris,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-tip/drm-tip]
[cannot apply to drm-intel/for-linux-next linus/master v5.7-rc7 next-20200529]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-gt-Stop-cross-polluting-PIN_GLOBAL-with-PIN_USER-with-no-ppgtt/20200525-160038
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
compiler: gcc-9 (Debian 9.3.0-13) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>


cppcheck warnings: (new ones prefixed by >>)

>> drivers/dma-buf/st-dma-fence-proxy.c:127:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:109:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:127:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:146:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:136:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:146:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:175:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:155:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:175:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:217:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:185:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:217:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:254:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:238:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:254:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:293:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:265:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:293:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:321:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:303:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:321:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:348:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:331:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:348:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:377:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:358:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:377:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:404:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:387:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:404:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:435:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:414:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:435:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:466:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;

vim +/err +127 drivers/dma-buf/st-dma-fence-proxy.c

   105	
   106	static int wrap_target(void *arg)
   107	{
   108		struct fences f;
   109		int err = -EINVAL;
   110	
   111		if (create_fences(&f, false))
   112			return -ENOMEM;
   113	
   114		if (dma_fence_proxy_get_real(f.proxy) != f.proxy) {
   115			pr_err("Unwrapped proxy fenced reported a target fence!\n");
   116			goto err_free;
   117		}
   118	
   119		dma_fence_proxy_set_real(f.proxy, f.real);
   120		rcu_assign_pointer(f.slot, dma_fence_get(f.real)); /* free_fences() */
   121	
   122		if (dma_fence_proxy_get_real(f.proxy) != f.real) {
   123			pr_err("Wrapped proxy fenced did not report the target fence!\n");
   124			goto err_free;
   125		}
   126	
 > 127		err = 0;
   128	err_free:
   129		free_fences(&f);
   130		return err;
   131	}
   132	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Intel-gfx] [PATCH 06/12] dma-buf: Proxy fence, an unsignaled fence placeholder
@ 2020-05-30  5:53     ` kbuild test robot
  0 siblings, 0 replies; 23+ messages in thread
From: kbuild test robot @ 2020-05-30  5:53 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 7275 bytes --]

Hi Chris,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-tip/drm-tip]
[cannot apply to drm-intel/for-linux-next linus/master v5.7-rc7 next-20200529]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Chris-Wilson/drm-i915-gt-Stop-cross-polluting-PIN_GLOBAL-with-PIN_USER-with-no-ppgtt/20200525-160038
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
compiler: gcc-9 (Debian 9.3.0-13) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>


cppcheck warnings: (new ones prefixed by >>)

>> drivers/dma-buf/st-dma-fence-proxy.c:127:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:109:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:127:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:146:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:136:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:146:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:175:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:155:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:175:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:217:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:185:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:217:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:254:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:238:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:254:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:293:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:265:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:293:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:321:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:303:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:321:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:348:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:331:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:348:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:377:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:358:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:377:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:404:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:387:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:404:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:435:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:414:0: note: Variable 'err' is reassigned a value before the old one has been used.
    int err = -EINVAL;
   ^
   drivers/dma-buf/st-dma-fence-proxy.c:435:6: note: Variable 'err' is reassigned a value before the old one has been used.
    err = 0;
        ^
   drivers/dma-buf/st-dma-fence-proxy.c:466:6: warning: Variable 'err' is reassigned a value before the old one has been used. [redundantAssignment]
    err = 0;

vim +/err +127 drivers/dma-buf/st-dma-fence-proxy.c

   105	
   106	static int wrap_target(void *arg)
   107	{
   108		struct fences f;
   109		int err = -EINVAL;
   110	
   111		if (create_fences(&f, false))
   112			return -ENOMEM;
   113	
   114		if (dma_fence_proxy_get_real(f.proxy) != f.proxy) {
   115			pr_err("Unwrapped proxy fenced reported a target fence!\n");
   116			goto err_free;
   117		}
   118	
   119		dma_fence_proxy_set_real(f.proxy, f.real);
   120		rcu_assign_pointer(f.slot, dma_fence_get(f.real)); /* free_fences() */
   121	
   122		if (dma_fence_proxy_get_real(f.proxy) != f.real) {
   123			pr_err("Wrapped proxy fenced did not report the target fence!\n");
   124			goto err_free;
   125		}
   126	
 > 127		err = 0;
   128	err_free:
   129		free_fences(&f);
   130		return err;
   131	}
   132	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-05-30  5:54 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-25  7:53 [Intel-gfx] [PATCH 01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Chris Wilson
2020-05-25  7:53 ` [Intel-gfx] [PATCH 02/12] drm/i915/gt: Cancel the flush worker more thoroughly Chris Wilson
2020-05-25 13:51   ` Mika Kuoppala
2020-05-25 14:08     ` Chris Wilson
2020-05-25  7:53 ` [Intel-gfx] [PATCH 03/12] drm/i915/gem: Suppress some random warnings Chris Wilson
2020-05-25 13:55   ` Mika Kuoppala
2020-05-25  7:53 ` [Intel-gfx] [PATCH 04/12] drm/i915/execlists: Shortcircuit queue_prio() for no internal levels Chris Wilson
2020-05-26 11:17   ` Mika Kuoppala
2020-05-25  7:53 ` [Intel-gfx] [PATCH 05/12] drm/i915: Improve execute_cb struct packing Chris Wilson
2020-05-26 11:17   ` Mika Kuoppala
2020-05-25  7:53 ` [Intel-gfx] [PATCH 06/12] dma-buf: Proxy fence, an unsignaled fence placeholder Chris Wilson
2020-05-30  5:53   ` kbuild test robot
2020-05-30  5:53     ` kbuild test robot
2020-05-25  7:53 ` [Intel-gfx] [PATCH 07/12] drm/i915: Unpeel awaits on a proxy fence Chris Wilson
2020-05-25  7:53 ` [Intel-gfx] [PATCH 08/12] drm/i915: Add list_for_each_entry_safe_continue_reverse Chris Wilson
2020-05-25  7:53 ` [Intel-gfx] [PATCH 09/12] drm/i915/gem: Async GPU relocations only Chris Wilson
2020-05-25  7:53 ` [Intel-gfx] [PATCH 10/12] drm/i915/gem: Lift GPU relocation allocation Chris Wilson
2020-05-25  7:53 ` [Intel-gfx] [PATCH 11/12] drm/i915/gem: Add all GPU reloc awaits/signals en masse Chris Wilson
2020-05-25  7:53 ` [Intel-gfx] [PATCH 12/12] drm/i915/gem: Make relocations atomic within execbuf Chris Wilson
2020-05-25  8:03 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/12] drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt Patchwork
2020-05-25  8:04 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2020-05-25  8:24 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-05-25 15:38 ` [Intel-gfx] [PATCH 01/12] " Mika Kuoppala

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.