* [PATCH 01/22] drm/i915: Flush pages on acquisition
From: Chris Wilson @ 2019-03-18  9:51 UTC
  To: intel-gfx

When we return pages to the system, we ensure that they are marked as
being in the CPU domain since any external access is uncontrolled and we
must assume the worst. This means that we must always flush the pages
on acquisition if we need to use them on the GPU, and from the beginning
we have used set-domain for that flush. Set-domain is overkill for the
purpose as it is a general synchronisation barrier, when our intent is
only to flush the pages being swapped in. If we move that flush into the
page-acquisition phase, we know that whenever we have obj->mm.pages,
they are coherent with the GPU, and we need only maintain that status
without resorting to heavy-handed use of set-domain.
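
As a sketch (the functional change is in the diff below), the flush
that set-domain used to provide on first GPU use now happens once, at
the point we take ownership of the backing pages:

	/* In __i915_gem_object_set_pages(): make the pages coherent with
	 * the GPU, flushing any swapin, so that later users need no
	 * set-domain barrier. */
	if (obj->cache_dirty) {
		obj->write_domain = 0;
		if (i915_gem_object_has_struct_page(obj))
			drm_clflush_sg(pages);
		obj->cache_dirty = false;
	}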

The principal knock-on effect for userspace is through mmap-gtt
pagefaulting. Our uAPI has always implied that the GTT mmap was async
(especially as when any pagefault occurs is unpredictable to userspace)
and so userspace had to apply explicit domain control itself
(set-domain). However, swapping is transparent to the kernel, and so on
first fault we need to acquire the pages and make them coherent for
access through the GTT. Our use of set-domain here leaked into the uABI
the detail that the first pagefault was synchronous. This was
unintentional and, barring a few igt, should go unnoticed; nevertheless
we bump the uABI version for mmap-gtt to reflect the change in
behaviour.
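
Userspace can detect the new behaviour via the bumped version; a
hypothetical probe using the existing getparam uAPI (the param and
ioctl exist today, the surrounding code is illustrative only):

	int version = 0;
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_MMAP_GTT_VERSION,
		.value = &version,
	};
	ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
	/* version >= 3: the first GTT pagefault no longer implies a
	 * synchronous set-domain(GTT). */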

Another implication of the change is that gem_create() is presumed to
create an object that is coherent with the CPU and is in the CPU write
domain, so a set-domain(CPU) following a gem_create() used to be a minor
operation that merely checked whether we could allocate all pages for
the object. With this change, a set-domain(CPU) causes a clflush as we
acquire the pages. This will have a small impact on mesa as we move the
clflush on !llc from execbuf time to create time, but the performance
impact should be minimal: the same clflush still occurs, only earlier,
and because of the cost of that clflush userspace already recycles bo
rather than allocating fresh objects.
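
For illustration, the sequence whose cost moves is simply create
followed by set-domain; a hypothetical userspace sketch using the uAPI
structs from <drm/i915_drm.h> (error handling omitted):

	struct drm_i915_gem_create create = { .size = 4096 };
	ioctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create);

	struct drm_i915_gem_set_domain sd = {
		.handle = create.handle,
		.read_domains = I915_GEM_DOMAIN_CPU,
		.write_domain = I915_GEM_DOMAIN_CPU,
	};
	/* Previously near-free on a fresh object; now incurs the clflush
	 * on !llc as the pages are acquired. */
	ioctl(fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &sd);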

Internally, the presumption that objects are created in the CPU
write-domain and remain so through writes to obj->mm.mapping is more
prevalent than I expected, but it is easy enough to catch and fix with a
manual flush.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h               |  8 +++
 drivers/gpu/drm/i915/i915_gem.c               | 57 ++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_dmabuf.c        |  1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c    |  7 +--
 drivers/gpu/drm/i915/i915_gem_render_state.c  |  2 +-
 drivers/gpu/drm/i915/i915_perf.c              |  4 +-
 drivers/gpu/drm/i915/intel_engine_cs.c        |  4 +-
 drivers/gpu/drm/i915/intel_lrc.c              | 63 +++++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.c       | 62 +++++++-----------
 drivers/gpu/drm/i915/selftests/huge_pages.c   |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c | 17 ++---
 .../gpu/drm/i915/selftests/i915_gem_dmabuf.c  |  1 +
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_request.c | 14 ++---
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  2 +-
 .../gpu/drm/i915/selftests/intel_hangcheck.c  |  2 +-
 drivers/gpu/drm/i915/selftests/intel_lrc.c    |  5 +-
 .../drm/i915/selftests/intel_workarounds.c    |  3 +
 18 files changed, 127 insertions(+), 134 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c65c2e6649df..395aa9d5ba02 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2959,6 +2959,14 @@ i915_coherent_map_type(struct drm_i915_private *i915)
 void *__must_check i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 					   enum i915_map_type type);
 
+void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj,
+				 unsigned long offset,
+				 unsigned long size);
+static inline void i915_gem_object_flush_map(struct drm_i915_gem_object *obj)
+{
+	__i915_gem_object_flush_map(obj, 0, obj->base.size);
+}
+
 /**
  * i915_gem_object_unpin_map - releases an earlier mapping
  * @obj: the object to unmap
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b38c9531b5e8..f4591a143c84 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1710,6 +1710,9 @@ static unsigned int tile_row_pages(const struct drm_i915_gem_object *obj)
  * 2 - Recognise WC as a separate cache domain so that we can flush the
  *     delayed writes via GTT before performing direct access via WC.
  *
+ * 3 - Remove implicit set-domain(GTT) and synchronisation on initial
+ *     pagefault; swapin remains transparent.
+ *
  * Restrictions:
  *
  *  * snoopable objects cannot be accessed via the GTT. It can cause machine
@@ -1737,7 +1740,7 @@ static unsigned int tile_row_pages(const struct drm_i915_gem_object *obj)
  */
 int i915_gem_mmap_gtt_version(void)
 {
-	return 2;
+	return 3;
 }
 
 static inline struct i915_ggtt_view
@@ -1805,17 +1808,6 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 
 	trace_i915_gem_object_fault(obj, page_offset, true, write);
 
-	/* Try to flush the object off the GPU first without holding the lock.
-	 * Upon acquiring the lock, we will perform our sanity checks and then
-	 * repeat the flush holding the lock in the normal manner to catch cases
-	 * where we are gazumped.
-	 */
-	ret = i915_gem_object_wait(obj,
-				   I915_WAIT_INTERRUPTIBLE,
-				   MAX_SCHEDULE_TIMEOUT);
-	if (ret)
-		goto err;
-
 	ret = i915_gem_object_pin_pages(obj);
 	if (ret)
 		goto err;
@@ -1871,10 +1863,6 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
 		goto err_unlock;
 	}
 
-	ret = i915_gem_object_set_to_gtt_domain(obj, write);
-	if (ret)
-		goto err_unpin;
-
 	ret = i915_vma_pin_fence(vma);
 	if (ret)
 		goto err_unpin;
@@ -2531,6 +2519,14 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 
 	lockdep_assert_held(&obj->mm.lock);
 
+	/* Make the pages coherent with the GPU (flushing any swapin). */
+	if (obj->cache_dirty) {
+		obj->write_domain = 0;
+		if (i915_gem_object_has_struct_page(obj))
+			drm_clflush_sg(pages);
+		obj->cache_dirty = false;
+	}
+
 	obj->mm.get_page.sg_pos = pages->sgl;
 	obj->mm.get_page.sg_idx = 0;
 
@@ -2732,6 +2728,33 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
 	goto out_unlock;
 }
 
+void __i915_gem_object_flush_map(struct drm_i915_gem_object *obj,
+				 unsigned long offset,
+				 unsigned long size)
+{
+	enum i915_map_type has_type;
+	void *ptr;
+
+	GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
+	GEM_BUG_ON(range_overflows_t(typeof(obj->base.size),
+				     offset, size, obj->base.size));
+
+	obj->mm.dirty = true;
+
+	if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE)
+		return;
+
+	ptr = page_unpack_bits(obj->mm.mapping, &has_type);
+	if (has_type == I915_MAP_WC)
+		return;
+
+	drm_clflush_virt_range(ptr + offset, size);
+	if (size == obj->base.size) {
+		obj->write_domain &= ~I915_GEM_DOMAIN_CPU;
+		obj->cache_dirty = false;
+	}
+}
+
 static int
 i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj,
 			   const struct drm_i915_gem_pwrite *arg)
@@ -4689,6 +4712,8 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 			goto err_active;
 
 		engine->default_state = i915_gem_object_get(state->obj);
+		i915_gem_object_set_cache_coherency(engine->default_state,
+						    I915_CACHE_LLC);
 
 		/* Check we can acquire the image of the context state */
 		vaddr = i915_gem_object_pin_map(engine->default_state,
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 33181678990e..5a101a9462d8 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -107,6 +107,7 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
 {
 	struct drm_i915_gem_object *obj = dma_buf_to_obj(dma_buf);
 
+	i915_gem_object_flush_map(obj);
 	i915_gem_object_unpin_map(obj);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index ee6d301a9627..3d672c9edb94 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1001,7 +1001,10 @@ static void reloc_gpu_flush(struct reloc_cache *cache)
 {
 	GEM_BUG_ON(cache->rq_size >= cache->rq->batch->obj->base.size / sizeof(u32));
 	cache->rq_cmd[cache->rq_size] = MI_BATCH_BUFFER_END;
+
+	__i915_gem_object_flush_map(cache->rq->batch->obj, 0, cache->rq_size);
 	i915_gem_object_unpin_map(cache->rq->batch->obj);
+
 	i915_gem_chipset_flush(cache->rq->i915);
 
 	i915_request_add(cache->rq);
@@ -1214,10 +1217,6 @@ static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 	if (IS_ERR(cmd))
 		return PTR_ERR(cmd);
 
-	err = i915_gem_object_set_to_wc_domain(obj, false);
-	if (err)
-		goto err_unmap;
-
 	batch = i915_vma_instance(obj, vma->vm, NULL);
 	if (IS_ERR(batch)) {
 		err = PTR_ERR(batch);
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 91196348c68c..9440024c763f 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -164,7 +164,7 @@ static int render_state_setup(struct intel_render_state *so,
 		drm_clflush_virt_range(d, i * sizeof(u32));
 	kunmap_atomic(d);
 
-	ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
+	ret = 0;
 out:
 	i915_gem_obj_finish_shmem_access(so->obj);
 	return ret;
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 9b0292a38865..7f92d52579bd 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1509,9 +1509,7 @@ static int alloc_oa_buffer(struct drm_i915_private *dev_priv)
 		goto unlock;
 	}
 
-	ret = i915_gem_object_set_cache_level(bo, I915_CACHE_LLC);
-	if (ret)
-		goto err_unref;
+	i915_gem_object_set_cache_coherency(bo, I915_CACHE_LLC);
 
 	/* PreHSW required 512K alignment, HSW requires 16M */
 	vma = i915_gem_object_ggtt_pin(bo, NULL, 0, SZ_16M, 0);
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 652c1b3ba190..314b86b6f88d 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -528,9 +528,7 @@ static int init_status_page(struct intel_engine_cs *engine)
 		return PTR_ERR(obj);
 	}
 
-	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
-	if (ret)
-		goto err;
+	i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
 
 	vma = i915_vma_instance(obj, &engine->i915->ggtt.vm, NULL);
 	if (IS_ERR(vma)) {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e54e0064b2d6..aa50f03ba812 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1252,6 +1252,30 @@ static void execlists_context_destroy(struct intel_context *ce)
 	intel_context_free(ce);
 }
 
+static int __context_pin(struct i915_vma *vma)
+{
+	unsigned int flags;
+	int err;
+
+	flags = PIN_GLOBAL | PIN_HIGH;
+	flags |= PIN_OFFSET_BIAS | i915_ggtt_pin_bias(vma);
+
+	err = i915_vma_pin(vma, 0, 0, flags);
+	if (err)
+		return err;
+
+	vma->obj->pin_global++;
+	vma->obj->mm.dirty = true;
+
+	return 0;
+}
+
+static void __context_unpin(struct i915_vma *vma)
+{
+	vma->obj->pin_global--;
+	__i915_vma_unpin(vma);
+}
+
 static void execlists_context_unpin(struct intel_context *ce)
 {
 	struct intel_engine_cs *engine;
@@ -1280,36 +1304,13 @@ static void execlists_context_unpin(struct intel_context *ce)
 
 	intel_ring_unpin(ce->ring);
 
-	ce->state->obj->pin_global--;
 	i915_gem_object_unpin_map(ce->state->obj);
-	i915_vma_unpin(ce->state);
+	__context_unpin(ce->state);
 
 	list_del(&ce->active_link);
 	i915_gem_context_put(ce->gem_context);
 }
 
-static int __context_pin(struct i915_vma *vma)
-{
-	unsigned int flags;
-	int err;
-
-	/*
-	 * Clear this page out of any CPU caches for coherent swap-in/out.
-	 * We only want to do this on the first bind so that we do not stall
-	 * on an active context (which by nature is already on the GPU).
-	 */
-	if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
-		err = i915_gem_object_set_to_wc_domain(vma->obj, true);
-		if (err)
-			return err;
-	}
-
-	flags = PIN_GLOBAL | PIN_HIGH;
-	flags |= PIN_OFFSET_BIAS | i915_ggtt_pin_bias(vma);
-
-	return i915_vma_pin(vma, 0, 0, flags);
-}
-
 static void
 __execlists_update_reg_state(struct intel_context *ce,
 			     struct intel_engine_cs *engine)
@@ -1368,7 +1369,6 @@ __execlists_context_pin(struct intel_context *ce,
 	ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
 	__execlists_update_reg_state(ce, engine);
 
-	ce->state->obj->pin_global++;
 	return 0;
 
 unpin_ring:
@@ -1376,7 +1376,7 @@ __execlists_context_pin(struct intel_context *ce,
 unpin_map:
 	i915_gem_object_unpin_map(ce->state->obj);
 unpin_vma:
-	__i915_vma_unpin(ce->state);
+	__context_unpin(ce->state);
 err:
 	return ret;
 }
@@ -2755,19 +2755,12 @@ populate_lr_context(struct intel_context *ce,
 	u32 *regs;
 	int ret;
 
-	ret = i915_gem_object_set_to_cpu_domain(ctx_obj, true);
-	if (ret) {
-		DRM_DEBUG_DRIVER("Could not set to CPU domain\n");
-		return ret;
-	}
-
 	vaddr = i915_gem_object_pin_map(ctx_obj, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		ret = PTR_ERR(vaddr);
 		DRM_DEBUG_DRIVER("Could not map object pages! (%d)\n", ret);
 		return ret;
 	}
-	ctx_obj->mm.dirty = true;
 
 	if (engine->default_state) {
 		/*
@@ -2802,7 +2795,11 @@ populate_lr_context(struct intel_context *ce,
 			_MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT |
 					   CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT);
 
+	ret = 0;
 err_unpin_ctx:
+	__i915_gem_object_flush_map(ctx_obj,
+				    LRC_HEADER_PAGES * PAGE_SIZE,
+				    engine->context_size);
 	i915_gem_object_unpin_map(ctx_obj);
 	return ret;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index f26f5cc1584c..746fe570466c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1199,15 +1199,6 @@ int intel_ring_pin(struct intel_ring *ring)
 	else
 		flags |= PIN_HIGH;
 
-	if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
-		if (flags & PIN_MAPPABLE || map == I915_MAP_WC)
-			ret = i915_gem_object_set_to_gtt_domain(vma->obj, true);
-		else
-			ret = i915_gem_object_set_to_cpu_domain(vma->obj, true);
-		if (unlikely(ret))
-			goto unpin_timeline;
-	}
-
 	ret = i915_vma_pin(vma, 0, 0, flags);
 	if (unlikely(ret))
 		goto unpin_timeline;
@@ -1393,17 +1384,6 @@ static int __context_pin(struct intel_context *ce)
 	if (!vma)
 		return 0;
 
-	/*
-	 * Clear this page out of any CPU caches for coherent swap-in/out.
-	 * We only want to do this on the first bind so that we do not stall
-	 * on an active context (which by nature is already on the GPU).
-	 */
-	if (!(vma->flags & I915_VMA_GLOBAL_BIND)) {
-		err = i915_gem_object_set_to_gtt_domain(vma->obj, true);
-		if (err)
-			return err;
-	}
-
 	err = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
 	if (err)
 		return err;
@@ -1413,6 +1393,7 @@ static int __context_pin(struct intel_context *ce)
 	 * it cannot reclaim the object until we release it.
 	 */
 	vma->obj->pin_global++;
+	vma->obj->mm.dirty = true;
 
 	return 0;
 }
@@ -1450,6 +1431,24 @@ alloc_context_vma(struct intel_engine_cs *engine)
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
+	/*
+	 * Try to make the context utilize L3 as well as LLC.
+	 *
+	 * On VLV we don't have L3 controls in the PTEs so we
+	 * shouldn't touch the cache level, especially as that
+	 * would make the object snooped which might have a
+	 * negative performance impact.
+	 *
+	 * Snooping is required on non-llc platforms in execlist
+	 * mode, but since all GGTT accesses use PAT entry 0 we
+	 * get snooping anyway regardless of cache_level.
+	 *
+	 * This is only applicable for Ivy Bridge devices since
+	 * later platforms don't have L3 control bits in the PTE.
+	 */
+	if (IS_IVYBRIDGE(i915))
+		i915_gem_object_set_cache_coherency(obj, I915_CACHE_L3_LLC);
+
 	if (engine->default_state) {
 		void *defaults, *vaddr;
 
@@ -1467,29 +1466,10 @@ alloc_context_vma(struct intel_engine_cs *engine)
 		}
 
 		memcpy(vaddr, defaults, engine->context_size);
-
 		i915_gem_object_unpin_map(engine->default_state);
-		i915_gem_object_unpin_map(obj);
-	}
 
-	/*
-	 * Try to make the context utilize L3 as well as LLC.
-	 *
-	 * On VLV we don't have L3 controls in the PTEs so we
-	 * shouldn't touch the cache level, especially as that
-	 * would make the object snooped which might have a
-	 * negative performance impact.
-	 *
-	 * Snooping is required on non-llc platforms in execlist
-	 * mode, but since all GGTT accesses use PAT entry 0 we
-	 * get snooping anyway regardless of cache_level.
-	 *
-	 * This is only applicable for Ivy Bridge devices since
-	 * later platforms don't have L3 control bits in the PTE.
-	 */
-	if (IS_IVYBRIDGE(i915)) {
-		/* Ignore any error, regard it as a simple optimisation */
-		i915_gem_object_set_cache_level(obj, I915_CACHE_L3_LLC);
+		i915_gem_object_flush_map(obj);
+		i915_gem_object_unpin_map(obj);
 	}
 
 	vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index 2e1db30af477..218cfc361de3 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -908,10 +908,6 @@ gpu_write_dw(struct i915_vma *vma, u64 offset, u32 val)
 	if (IS_ERR(obj))
 		return ERR_CAST(obj);
 
-	err = i915_gem_object_set_to_wc_domain(obj, true);
-	if (err)
-		goto err;
-
 	cmd = i915_gem_object_pin_map(obj, I915_MAP_WC);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
@@ -1584,6 +1580,7 @@ static int igt_tmpfs_fallback(void *arg)
 	}
 	*vaddr = 0xdeadbeaf;
 
+	__i915_gem_object_flush_map(obj, 0, 64);
 	i915_gem_object_unpin_map(obj);
 
 	vma = i915_vma_instance(obj, vm, NULL);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 4399ef9ebf15..0759a90c0d5a 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -220,6 +220,7 @@ gpu_fill_dw(struct i915_vma *vma, u64 offset, unsigned long count, u32 value)
 		offset += PAGE_SIZE;
 	}
 	*cmd = MI_BATCH_BUFFER_END;
+	i915_gem_object_flush_map(obj);
 	i915_gem_object_unpin_map(obj);
 
 	err = i915_gem_object_set_to_gtt_domain(obj, false);
@@ -604,12 +605,9 @@ static struct i915_vma *rpcs_query_batch(struct i915_vma *vma)
 	*cmd++ = upper_32_bits(vma->node.start);
 	*cmd = MI_BATCH_BUFFER_END;
 
+	__i915_gem_object_flush_map(obj, 0, 64);
 	i915_gem_object_unpin_map(obj);
 
-	err = i915_gem_object_set_to_gtt_domain(obj, false);
-	if (err)
-		goto err;
-
 	vma = i915_vma_instance(obj, vma->vm, NULL);
 	if (IS_ERR(vma)) {
 		err = PTR_ERR(vma);
@@ -1202,12 +1200,9 @@ static int write_to_scratch(struct i915_gem_context *ctx,
 	}
 	*cmd++ = value;
 	*cmd = MI_BATCH_BUFFER_END;
+	__i915_gem_object_flush_map(obj, 0, 64);
 	i915_gem_object_unpin_map(obj);
 
-	err = i915_gem_object_set_to_gtt_domain(obj, false);
-	if (err)
-		goto err;
-
 	vma = i915_vma_instance(obj, &ctx->ppgtt->vm, NULL);
 	if (IS_ERR(vma)) {
 		err = PTR_ERR(vma);
@@ -1299,11 +1294,9 @@ static int read_from_scratch(struct i915_gem_context *ctx,
 		*cmd++ = result;
 	}
 	*cmd = MI_BATCH_BUFFER_END;
-	i915_gem_object_unpin_map(obj);
 
-	err = i915_gem_object_set_to_gtt_domain(obj, false);
-	if (err)
-		goto err;
+	i915_gem_object_flush_map(obj);
+	i915_gem_object_unpin_map(obj);
 
 	vma = i915_vma_instance(obj, &ctx->ppgtt->vm, NULL);
 	if (IS_ERR(vma)) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c
index a7055b12e53c..2b943ee246c9 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_dmabuf.c
@@ -315,6 +315,7 @@ static int igt_dmabuf_export_kmap(void *arg)
 		goto err;
 	}
 	memset(ptr + PAGE_SIZE, 0xaa, PAGE_SIZE);
+	i915_gem_object_flush_map(obj);
 	i915_gem_object_unpin_map(obj);
 
 	ptr = dma_buf_kmap(dmabuf, 1);
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index b270eab1cad1..9a9451846b33 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -274,7 +274,7 @@ static int igt_evict_for_cache_color(void *arg)
 		err = PTR_ERR(obj);
 		goto cleanup;
 	}
-	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
 	quirk_add(obj, &objects);
 
 	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
@@ -290,7 +290,7 @@ static int igt_evict_for_cache_color(void *arg)
 		err = PTR_ERR(obj);
 		goto cleanup;
 	}
-	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
 	quirk_add(obj, &objects);
 
 	/* Neighbouring; same colour - should fit */
diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c
index 3eb6a6b075ab..e6ffe2240126 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -619,13 +619,11 @@ static struct i915_vma *empty_batch(struct drm_i915_private *i915)
 	}
 
 	*cmd = MI_BATCH_BUFFER_END;
-	i915_gem_chipset_flush(i915);
 
+	__i915_gem_object_flush_map(obj, 0, 64);
 	i915_gem_object_unpin_map(obj);
 
-	err = i915_gem_object_set_to_gtt_domain(obj, false);
-	if (err)
-		goto err;
+	i915_gem_chipset_flush(i915);
 
 	vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
 	if (IS_ERR(vma)) {
@@ -777,10 +775,6 @@ static struct i915_vma *recursive_batch(struct drm_i915_private *i915)
 	if (err)
 		goto err;
 
-	err = i915_gem_object_set_to_wc_domain(obj, true);
-	if (err)
-		goto err;
-
 	cmd = i915_gem_object_pin_map(obj, I915_MAP_WC);
 	if (IS_ERR(cmd)) {
 		err = PTR_ERR(cmd);
@@ -799,10 +793,12 @@ static struct i915_vma *recursive_batch(struct drm_i915_private *i915)
 		*cmd++ = lower_32_bits(vma->node.start);
 	}
 	*cmd++ = MI_BATCH_BUFFER_END; /* terminate early in case of error */
-	i915_gem_chipset_flush(i915);
 
+	__i915_gem_object_flush_map(obj, 0, 64);
 	i915_gem_object_unpin_map(obj);
 
+	i915_gem_chipset_flush(i915);
+
 	return vma;
 
 err:
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index d0b93a3fbc54..16890dfe74c0 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -29,7 +29,7 @@ int igt_spinner_init(struct igt_spinner *spin, struct drm_i915_private *i915)
 		goto err_hws;
 	}
 
-	i915_gem_object_set_cache_level(spin->hws, I915_CACHE_LLC);
+	i915_gem_object_set_cache_coherency(spin->hws, I915_CACHE_LLC);
 	vaddr = i915_gem_object_pin_map(spin->hws, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index b5e35b2a925f..76b4fa150f2e 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -70,7 +70,7 @@ static int hang_init(struct hang *h, struct drm_i915_private *i915)
 		goto err_hws;
 	}
 
-	i915_gem_object_set_cache_level(h->hws, I915_CACHE_LLC);
+	i915_gem_object_set_cache_coherency(h->hws, I915_CACHE_LLC);
 	vaddr = i915_gem_object_pin_map(h->hws, I915_MAP_WB);
 	if (IS_ERR(vaddr)) {
 		err = PTR_ERR(vaddr);
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index d61520ea03c1..9e871eb0bfb1 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -1018,12 +1018,9 @@ static int live_preempt_smoke(void *arg)
 	for (n = 0; n < PAGE_SIZE / sizeof(*cs) - 1; n++)
 		cs[n] = MI_ARB_CHECK;
 	cs[n] = MI_BATCH_BUFFER_END;
+	i915_gem_object_flush_map(smoke.batch);
 	i915_gem_object_unpin_map(smoke.batch);
 
-	err = i915_gem_object_set_to_gtt_domain(smoke.batch, false);
-	if (err)
-		goto err_batch;
-
 	for (n = 0; n < smoke.ncontext; n++) {
 		smoke.contexts[n] = kernel_context(smoke.i915);
 		if (!smoke.contexts[n])
diff --git a/drivers/gpu/drm/i915/selftests/intel_workarounds.c b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
index f2a2b51a4662..3baed59008d7 100644
--- a/drivers/gpu/drm/i915/selftests/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/selftests/intel_workarounds.c
@@ -90,6 +90,7 @@ read_nonprivs(struct i915_gem_context *ctx, struct intel_engine_cs *engine)
 		goto err_obj;
 	}
 	memset(cs, 0xc5, PAGE_SIZE);
+	i915_gem_object_flush_map(result);
 	i915_gem_object_unpin_map(result);
 
 	vma = i915_vma_instance(result, &engine->i915->ggtt.vm, NULL);
@@ -358,6 +359,7 @@ static struct i915_vma *create_scratch(struct i915_gem_context *ctx)
 		goto err_obj;
 	}
 	memset(ptr, 0xc5, PAGE_SIZE);
+	i915_gem_object_flush_map(obj);
 	i915_gem_object_unpin_map(obj);
 
 	vma = i915_vma_instance(obj, &ctx->ppgtt->vm, NULL);
@@ -551,6 +553,7 @@ static int check_dirty_whitelist(struct i915_gem_context *ctx,
 
 		*cs++ = MI_BATCH_BUFFER_END;
 
+		i915_gem_object_flush_map(batch->obj);
 		i915_gem_object_unpin_map(batch->obj);
 		i915_gem_chipset_flush(ctx->i915);
 
-- 
2.20.1


* [PATCH 02/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
From: Chris Wilson @ 2019-03-18  9:51 UTC
  To: intel-gfx

We want to use intel_engine_mask_t inside i915_request.h, which means
extracting it from the general header file mess and placing it inside a
types.h. A knock-on effect is that the compiler wants to warn about
type-contraction of ALL_ENGINES into intel_engine_mask_t, so prepare
for the worst.
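
The contraction in question, sketched with the definitions this patch
moves into intel_engine_types.h:

	typedef u8 intel_engine_mask_t;
	#define ALL_ENGINES	(~0ul)
	#define INIT_ALL_ENGINES(x) (x) = (intel_engine_mask_t)(ALL_ENGINES)

	intel_engine_mask_t mask;

	mask = ALL_ENGINES;	/* implicit truncation to u8: may warn */
	INIT_ALL_ENGINES(mask);	/* explicit cast, silences the compiler */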

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |  1 +
 drivers/gpu/drm/i915/gvt/gvt.h                |  2 +-
 drivers/gpu/drm/i915/gvt/handlers.c           |  2 +-
 drivers/gpu/drm/i915/gvt/scheduler.c          |  2 +-
 drivers/gpu/drm/i915/gvt/vgpu.c               |  6 +-
 drivers/gpu/drm/i915/i915_drv.h               |  1 -
 drivers/gpu/drm/i915/i915_reset.c             | 30 +++---
 drivers/gpu/drm/i915/i915_reset.h             |  6 +-
 drivers/gpu/drm/i915/i915_scheduler.h         | 86 +---------------
 drivers/gpu/drm/i915/i915_scheduler_types.h   | 98 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_timeline.h          |  1 +
 drivers/gpu/drm/i915/i915_timeline_types.h    |  3 +-
 drivers/gpu/drm/i915/intel_device_info.h      |  3 +-
 drivers/gpu/drm/i915/intel_engine_types.h     |  9 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c |  4 +-
 .../gpu/drm/i915/selftests/intel_hangcheck.c  |  2 +-
 .../test_i915_scheduler_types_standalone.c    |  7 ++
 17 files changed, 147 insertions(+), 116 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_scheduler_types.h
 create mode 100644 drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 68fecf355471..197b081769b5 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -60,6 +60,7 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
 i915-$(CONFIG_DRM_I915_WERROR) += \
 	test_i915_active_types_standalone.o \
 	test_i915_gem_context_types_standalone.o \
+	test_i915_scheduler_types_standalone.o \
 	test_i915_timeline_types_standalone.o \
 	test_intel_context_types_standalone.o \
 	test_intel_engine_types_standalone.o \
diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 8bce09de4b82..c7f373566ecd 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -488,7 +488,7 @@ struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
 void intel_gvt_destroy_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_release_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
-				 unsigned int engine_mask);
+				 unsigned long engine_mask);
 void intel_gvt_reset_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_activate_vgpu(struct intel_vgpu *vgpu);
 void intel_gvt_deactivate_vgpu(struct intel_vgpu *vgpu);
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index b596cb42e24e..a0d981547c9e 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -311,7 +311,7 @@ static int mul_force_wake_write(struct intel_vgpu *vgpu,
 static int gdrst_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
 			    void *p_data, unsigned int bytes)
 {
-	unsigned int engine_mask = 0;
+	unsigned long engine_mask = 0;
 	u32 data;
 
 	write_vreg(vgpu, offset, p_data, bytes);
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 7550e09939ae..56a9530b4e06 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -1137,7 +1137,7 @@ void intel_vgpu_clean_submission(struct intel_vgpu *vgpu)
  *
  */
 void intel_vgpu_reset_submission(struct intel_vgpu *vgpu,
-		unsigned long engine_mask)
+				 unsigned long engine_mask)
 {
 	struct intel_vgpu_submission *s = &vgpu->submission;
 
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index 314e40121e47..e734c21e7d06 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -526,14 +526,14 @@ struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
  * GPU engines. For FLR, engine_mask is ignored.
  */
 void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
-				 unsigned int engine_mask)
+				 unsigned long engine_mask)
 {
 	struct intel_gvt *gvt = vgpu->gvt;
 	struct intel_gvt_workload_scheduler *scheduler = &gvt->scheduler;
-	unsigned int resetting_eng = dmlr ? ALL_ENGINES : engine_mask;
+	unsigned long resetting_eng = dmlr ? ALL_ENGINES : engine_mask;
 
 	gvt_dbg_core("------------------------------------------\n");
-	gvt_dbg_core("resseting vgpu%d, dmlr %d, engine_mask %08x\n",
+	gvt_dbg_core("resseting vgpu%d, dmlr %d, engine_mask %08lx\n",
 		     vgpu->id, dmlr, engine_mask);
 
 	vgpu->resetting_eng = resetting_eng;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 395aa9d5ba02..86080a6e0f45 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2432,7 +2432,6 @@ static inline unsigned int i915_sg_segment_size(void)
 #define IS_GEN9_LP(dev_priv)	(IS_GEN(dev_priv, 9) && IS_LP(dev_priv))
 #define IS_GEN9_BC(dev_priv)	(IS_GEN(dev_priv, 9) && !IS_LP(dev_priv))
 
-#define ALL_ENGINES	(~0u)
 #define HAS_ENGINE(dev_priv, id) (INTEL_INFO(dev_priv)->engine_mask & BIT(id))
 
 #define HAS_LLC(dev_priv)	(INTEL_INFO(dev_priv)->has_llc)
diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index 861fe083e383..b8daec7ddc06 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -144,7 +144,7 @@ static void gen3_stop_engine(struct intel_engine_cs *engine)
 }
 
 static void i915_stop_engines(struct drm_i915_private *i915,
-			      unsigned int engine_mask)
+			      unsigned long engine_mask)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
@@ -165,7 +165,7 @@ static bool i915_in_reset(struct pci_dev *pdev)
 }
 
 static int i915_do_reset(struct drm_i915_private *i915,
-			 unsigned int engine_mask,
+			 unsigned long engine_mask,
 			 unsigned int retry)
 {
 	struct pci_dev *pdev = i915->drm.pdev;
@@ -194,7 +194,7 @@ static bool g4x_reset_complete(struct pci_dev *pdev)
 }
 
 static int g33_do_reset(struct drm_i915_private *i915,
-			unsigned int engine_mask,
+			unsigned long engine_mask,
 			unsigned int retry)
 {
 	struct pci_dev *pdev = i915->drm.pdev;
@@ -204,7 +204,7 @@ static int g33_do_reset(struct drm_i915_private *i915,
 }
 
 static int g4x_do_reset(struct drm_i915_private *dev_priv,
-			unsigned int engine_mask,
+			unsigned long engine_mask,
 			unsigned int retry)
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
@@ -242,7 +242,7 @@ static int g4x_do_reset(struct drm_i915_private *dev_priv,
 }
 
 static int ironlake_do_reset(struct drm_i915_private *dev_priv,
-			     unsigned int engine_mask,
+			     unsigned long engine_mask,
 			     unsigned int retry)
 {
 	int ret;
@@ -299,7 +299,7 @@ static int gen6_hw_domain_reset(struct drm_i915_private *dev_priv,
 }
 
 static int gen6_reset_engines(struct drm_i915_private *i915,
-			      unsigned int engine_mask,
+			      unsigned long engine_mask,
 			      unsigned int retry)
 {
 	struct intel_engine_cs *engine;
@@ -425,7 +425,7 @@ static void gen11_unlock_sfc(struct drm_i915_private *dev_priv,
 }
 
 static int gen11_reset_engines(struct drm_i915_private *i915,
-			       unsigned int engine_mask,
+			       unsigned long engine_mask,
 			       unsigned int retry)
 {
 	const u32 hw_engine_mask[] = {
@@ -492,7 +492,7 @@ static void gen8_engine_reset_cancel(struct intel_engine_cs *engine)
 }
 
 static int gen8_reset_engines(struct drm_i915_private *i915,
-			      unsigned int engine_mask,
+			      unsigned long engine_mask,
 			      unsigned int retry)
 {
 	struct intel_engine_cs *engine;
@@ -533,7 +533,7 @@ static int gen8_reset_engines(struct drm_i915_private *i915,
 }
 
 typedef int (*reset_func)(struct drm_i915_private *,
-			  unsigned int engine_mask,
+			  unsigned long engine_mask,
 			  unsigned int retry);
 
 static reset_func intel_get_gpu_reset(struct drm_i915_private *i915)
@@ -554,7 +554,7 @@ static reset_func intel_get_gpu_reset(struct drm_i915_private *i915)
 		return NULL;
 }
 
-int intel_gpu_reset(struct drm_i915_private *i915, unsigned int engine_mask)
+int intel_gpu_reset(struct drm_i915_private *i915, unsigned long engine_mask)
 {
 	const int retries = engine_mask == ALL_ENGINES ? RESET_MAX_RETRIES : 1;
 	reset_func reset;
@@ -588,7 +588,7 @@ int intel_gpu_reset(struct drm_i915_private *i915, unsigned int engine_mask)
 		if (retry)
 			i915_stop_engines(i915, engine_mask);
 
-		GEM_TRACE("engine_mask=%x\n", engine_mask);
+		GEM_TRACE("engine_mask=%lx\n", engine_mask);
 		preempt_disable();
 		ret = reset(i915, engine_mask, retry);
 		preempt_enable();
@@ -688,7 +688,7 @@ static void gt_revoke(struct drm_i915_private *i915)
 	revoke_mmaps(i915);
 }
 
-static int gt_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
+static int gt_reset(struct drm_i915_private *i915, unsigned long stalled_mask)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
@@ -945,7 +945,7 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	return result;
 }
 
-static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
+static int do_reset(struct drm_i915_private *i915, unsigned long stalled_mask)
 {
 	int err, i;
 
@@ -980,7 +980,7 @@ static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
  *   - re-init display
  */
 void i915_reset(struct drm_i915_private *i915,
-		unsigned int stalled_mask,
+		unsigned long stalled_mask,
 		const char *reason)
 {
 	struct i915_gpu_error *error = &i915->gpu_error;
@@ -1222,7 +1222,7 @@ void i915_clear_error_registers(struct drm_i915_private *dev_priv)
  * of a ring dump etc.).
  */
 void i915_handle_error(struct drm_i915_private *i915,
-		       u32 engine_mask,
+		       unsigned long engine_mask,
 		       unsigned long flags,
 		       const char *fmt, ...)
 {
diff --git a/drivers/gpu/drm/i915/i915_reset.h b/drivers/gpu/drm/i915/i915_reset.h
index 16f2389f656f..6d2bf7e81ac4 100644
--- a/drivers/gpu/drm/i915/i915_reset.h
+++ b/drivers/gpu/drm/i915/i915_reset.h
@@ -17,7 +17,7 @@ struct intel_guc;
 
 __printf(4, 5)
 void i915_handle_error(struct drm_i915_private *i915,
-		       u32 engine_mask,
+		       unsigned long engine_mask,
 		       unsigned long flags,
 		       const char *fmt, ...);
 #define I915_ERROR_CAPTURE BIT(0)
@@ -25,7 +25,7 @@ void i915_handle_error(struct drm_i915_private *i915,
 void i915_clear_error_registers(struct drm_i915_private *i915);
 
 void i915_reset(struct drm_i915_private *i915,
-		unsigned int stalled_mask,
+		unsigned long stalled_mask,
 		const char *reason);
 int i915_reset_engine(struct intel_engine_cs *engine,
 		      const char *reason);
@@ -41,7 +41,7 @@ int i915_terminally_wedged(struct drm_i915_private *i915);
 bool intel_has_gpu_reset(struct drm_i915_private *i915);
 bool intel_has_reset_engine(struct drm_i915_private *i915);
 
-int intel_gpu_reset(struct drm_i915_private *i915, u32 engine_mask);
+int intel_gpu_reset(struct drm_i915_private *i915, unsigned long engine_mask);
 
 int intel_reset_guc(struct drm_i915_private *i915);
 
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 9a1d257f3d6e..07d243acf553 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -8,92 +8,10 @@
 #define _I915_SCHEDULER_H_
 
 #include <linux/bitops.h>
+#include <linux/list.h>
 #include <linux/kernel.h>
 
-#include <uapi/drm/i915_drm.h>
-
-struct drm_i915_private;
-struct i915_request;
-struct intel_engine_cs;
-
-enum {
-	I915_PRIORITY_MIN = I915_CONTEXT_MIN_USER_PRIORITY - 1,
-	I915_PRIORITY_NORMAL = I915_CONTEXT_DEFAULT_PRIORITY,
-	I915_PRIORITY_MAX = I915_CONTEXT_MAX_USER_PRIORITY + 1,
-
-	I915_PRIORITY_INVALID = INT_MIN
-};
-
-#define I915_USER_PRIORITY_SHIFT 3
-#define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
-
-#define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
-#define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
-
-#define I915_PRIORITY_WAIT		((u8)BIT(0))
-#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
-#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
-
-#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
-
-struct i915_sched_attr {
-	/**
-	 * @priority: execution and service priority
-	 *
-	 * All clients are equal, but some are more equal than others!
-	 *
-	 * Requests from a context with a greater (more positive) value of
-	 * @priority will be executed before those with a lower @priority
-	 * value, forming a simple QoS.
-	 *
-	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
-	 */
-	int priority;
-};
-
-/*
- * "People assume that time is a strict progression of cause to effect, but
- * actually, from a nonlinear, non-subjective viewpoint, it's more like a big
- * ball of wibbly-wobbly, timey-wimey ... stuff." -The Doctor, 2015
- *
- * Requests exist in a complex web of interdependencies. Each request
- * has to wait for some other request to complete before it is ready to be run
- * (e.g. we have to wait until the pixels have been rendering into a texture
- * before we can copy from it). We track the readiness of a request in terms
- * of fences, but we also need to keep the dependency tree for the lifetime
- * of the request (beyond the life of an individual fence). We use the tree
- * at various points to reorder the requests whilst keeping the requests
- * in order with respect to their various dependencies.
- *
- * There is no active component to the "scheduler". As we know the dependency
- * DAG of each request, we are able to insert it into a sorted queue when it
- * is ready, and are able to reorder its portion of the graph to accommodate
- * dynamic priority changes.
- */
-struct i915_sched_node {
-	struct list_head signalers_list; /* those before us, we depend upon */
-	struct list_head waiters_list; /* those after us, they depend upon us */
-	struct list_head link;
-	struct i915_sched_attr attr;
-	unsigned int flags;
-#define I915_SCHED_HAS_SEMAPHORE	BIT(0)
-};
-
-struct i915_dependency {
-	struct i915_sched_node *signaler;
-	struct list_head signal_link;
-	struct list_head wait_link;
-	struct list_head dfs_link;
-	unsigned long flags;
-#define I915_DEPENDENCY_ALLOC BIT(0)
-};
-
-struct i915_priolist {
-	struct list_head requests[I915_PRIORITY_COUNT];
-	struct rb_node node;
-	unsigned long used;
-	int priority;
-};
+#include "i915_scheduler_types.h"
 
 #define priolist_for_each_request(it, plist, idx) \
 	for (idx = 0; idx < ARRAY_SIZE((plist)->requests); idx++) \
diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
new file mode 100644
index 000000000000..5c94b3eb5c81
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
@@ -0,0 +1,98 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#ifndef _I915_SCHEDULER_TYPES_H_
+#define _I915_SCHEDULER_TYPES_H_
+
+#include <linux/list.h>
+#include <linux/rbtree.h>
+
+#include <uapi/drm/i915_drm.h>
+
+struct drm_i915_private;
+struct i915_request;
+struct intel_engine_cs;
+
+enum {
+	I915_PRIORITY_MIN = I915_CONTEXT_MIN_USER_PRIORITY - 1,
+	I915_PRIORITY_NORMAL = I915_CONTEXT_DEFAULT_PRIORITY,
+	I915_PRIORITY_MAX = I915_CONTEXT_MAX_USER_PRIORITY + 1,
+
+	I915_PRIORITY_INVALID = INT_MIN
+};
+
+#define I915_USER_PRIORITY_SHIFT 3
+#define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
+
+#define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
+#define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
+
+#define I915_PRIORITY_WAIT		((u8)BIT(0))
+#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
+#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
+
+#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
+
+struct i915_sched_attr {
+	/**
+	 * @priority: execution and service priority
+	 *
+	 * All clients are equal, but some are more equal than others!
+	 *
+	 * Requests from a context with a greater (more positive) value of
+	 * @priority will be executed before those with a lower @priority
+	 * value, forming a simple QoS.
+	 *
+	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
+	 */
+	int priority;
+};
+
+/*
+ * "People assume that time is a strict progression of cause to effect, but
+ * actually, from a nonlinear, non-subjective viewpoint, it's more like a big
+ * ball of wibbly-wobbly, timey-wimey ... stuff." -The Doctor, 2015
+ *
+ * Requests exist in a complex web of interdependencies. Each request
+ * has to wait for some other request to complete before it is ready to be run
+ * (e.g. we have to wait until the pixels have been rendering into a texture
+ * before we can copy from it). We track the readiness of a request in terms
+ * of fences, but we also need to keep the dependency tree for the lifetime
+ * of the request (beyond the life of an individual fence). We use the tree
+ * at various points to reorder the requests whilst keeping the requests
+ * in order with respect to their various dependencies.
+ *
+ * There is no active component to the "scheduler". As we know the dependency
+ * DAG of each request, we are able to insert it into a sorted queue when it
+ * is ready, and are able to reorder its portion of the graph to accommodate
+ * dynamic priority changes.
+ */
+struct i915_sched_node {
+	struct list_head signalers_list; /* those before us, we depend upon */
+	struct list_head waiters_list; /* those after us, they depend upon us */
+	struct list_head link;
+	struct i915_sched_attr attr;
+	unsigned int flags;
+#define I915_SCHED_HAS_SEMAPHORE	BIT(0)
+};
+
+struct i915_dependency {
+	struct i915_sched_node *signaler;
+	struct list_head signal_link;
+	struct list_head wait_link;
+	struct list_head dfs_link;
+	unsigned long flags;
+#define I915_DEPENDENCY_ALLOC BIT(0)
+};
+
+struct i915_priolist {
+	struct list_head requests[I915_PRIORITY_COUNT];
+	struct rb_node node;
+	unsigned long used;
+	int priority;
+};
+
+#endif /* _I915_SCHEDULER_TYPES_H_ */
diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h
index 9126c8206490..454aa72aee18 100644
--- a/drivers/gpu/drm/i915/i915_timeline.h
+++ b/drivers/gpu/drm/i915/i915_timeline.h
@@ -27,6 +27,7 @@
 
 #include <linux/lockdep.h>
 
+#include "i915_active.h"
 #include "i915_syncmap.h"
 #include "i915_timeline_types.h"
 
diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
index 8ff146dc05ba..d42053544d7c 100644
--- a/drivers/gpu/drm/i915/i915_timeline_types.h
+++ b/drivers/gpu/drm/i915/i915_timeline_types.h
@@ -9,9 +9,10 @@
 
 #include <linux/list.h>
 #include <linux/kref.h>
+#include <linux/mutex.h>
 #include <linux/types.h>
 
-#include "i915_active.h"
+#include "i915_active_types.h"
 
 struct drm_i915_private;
 struct i915_vma;
diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
index 6234570a9b17..d20c33a10c11 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -27,6 +27,7 @@
 
 #include <uapi/drm/i915_drm.h>
 
+#include "intel_engine_types.h"
 #include "intel_display.h"
 
 struct drm_printer;
@@ -149,8 +150,6 @@ struct sseu_dev_info {
 	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
 };
 
-typedef u8 intel_engine_mask_t;
-
 struct intel_device_info {
 	u16 gen_mask;
 
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
index b0aa1f0d4e47..79a166b9a81b 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/intel_engine_types.h
@@ -12,8 +12,10 @@
 #include <linux/list.h>
 #include <linux/types.h>
 
+#include "i915_gem.h"
+#include "i915_scheduler_types.h"
+#include "i915_selftest.h"
 #include "i915_timeline_types.h"
-#include "intel_device_info.h"
 #include "intel_workarounds_types.h"
 
 #include "i915_gem_batch_pool.h"
@@ -24,11 +26,16 @@
 
 #define I915_CMD_HASH_ORDER 9
 
+struct dma_fence;
 struct drm_i915_reg_table;
 struct i915_gem_context;
 struct i915_request;
 struct i915_sched_attr;
 
+typedef u8 intel_engine_mask_t;
+#define ALL_ENGINES	(~0ul)
+#define INIT_ALL_ENGINES(x) (x) = (intel_engine_mask_t)(ALL_ENGINES)
+
 struct intel_hw_status_page {
 	struct i915_vma *vma;
 	u32 *addr;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 0759a90c0d5a..f18c78ebff07 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -1466,7 +1466,7 @@ static int igt_vm_isolation(void *arg)
 }
 
 static __maybe_unused const char *
-__engine_name(struct drm_i915_private *i915, unsigned int engines)
+__engine_name(struct drm_i915_private *i915, unsigned long engines)
 {
 	struct intel_engine_cs *engine;
 	unsigned int tmp;
@@ -1482,7 +1482,7 @@ __engine_name(struct drm_i915_private *i915, unsigned int engines)
 
 static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
 					  struct i915_gem_context *ctx,
-					  unsigned int engines)
+					  unsigned long engines)
 {
 	struct intel_engine_cs *engine;
 	unsigned int tmp;
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index 76b4fa150f2e..05a7b9b9a1de 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -1124,7 +1124,7 @@ static int igt_reset_engines(void *arg)
 	return 0;
 }
 
-static u32 fake_hangcheck(struct drm_i915_private *i915, u32 mask)
+static u32 fake_hangcheck(struct drm_i915_private *i915, unsigned long mask)
 {
 	u32 count = i915_reset_count(&i915->gpu_error);
 
diff --git a/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c b/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c
new file mode 100644
index 000000000000..8afa2c3719fb
--- /dev/null
+++ b/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c
@@ -0,0 +1,7 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_scheduler_types.h"
-- 
2.20.1


* [PATCH 03/22] drm/i915: Sanity check mmap length against object size
From: Chris Wilson @ 2019-03-18  9:51 UTC
  To: intel-gfx
  Cc: tvrtko.ursulin, joonas.lahtinen, mika.kuoppala, Chris Wilson,
	Antonio Argenziano, stable

We assumed that vm_mmap() would reject an attempt to mmap past the end of
the filp (our object), but we were wrong.
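
The fix below checks the requested range against the object size
before calling vm_mmap(). range_overflows() comes from i915_utils.h;
assuming the usual unsigned-safe form, the test is roughly:

	/* true if [offset, offset + size) runs past the object */
	args->offset >= obj->base.size ||
		args->size > obj->base.size - args->offset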

Reported-by: Antonio Argenziano <antonio.argenziano@intel.com>
Testcase: igt/gem_mmap/bad-size
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Antonio Argenziano <antonio.argenziano@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f4591a143c84..41d96414ef18 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1639,8 +1639,13 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	 * pages from.
 	 */
 	if (!obj->base.filp) {
-		i915_gem_object_put(obj);
-		return -ENXIO;
+		addr = -ENXIO;
+		goto err;
+	}
+
+	if (range_overflows(args->offset, args->size, (u64)obj->base.size)) {
+		addr = -EINVAL;
+		goto err;
 	}
 
 	addr = vm_mmap(obj->base.filp, 0, args->size,
@@ -1654,8 +1659,8 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 		struct vm_area_struct *vma;
 
 		if (down_write_killable(&mm->mmap_sem)) {
-			i915_gem_object_put(obj);
-			return -EINTR;
+			addr = -EINTR;
+			goto err;
 		}
 		vma = find_vma(mm, addr);
 		if (vma && __vma_matches(vma, obj->base.filp, addr, args->size))
@@ -1673,12 +1678,10 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	i915_gem_object_put(obj);
 
 	args->addr_ptr = (u64)addr;
-
 	return 0;
 
 err:
 	i915_gem_object_put(obj);
-
 	return addr;
 }
 
-- 
2.20.1



* [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring
From: Chris Wilson @ 2019-03-18  9:51 UTC
  To: intel-gfx

As the final request on a ring may hold the reference to this ring (via
retiring the last pinned context), we may find ourselves chasing a
dangling pointer as we complete our walk of the list.

A quick solution is to hold a reference to the ring itself as we retire
along it so that we only free it after we stop dereferencing it.
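
A minimal sketch of that guard, using the intel_ring_get/put wrappers
introduced below around the ring's new kref:

	list_for_each_entry_safe(ring, tmp,
				 &i915->gt.active_rings, active_link) {
		intel_ring_get(ring); /* last rq holds reference! */
		ring_retire_requests(ring);
		intel_ring_put(ring); /* may free ring; we use it no more */
	}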

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_request.c          |  6 +++++-
 drivers/gpu/drm/i915/intel_engine_types.h    |  2 ++
 drivers/gpu/drm/i915/intel_lrc.c             |  4 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c      |  9 +++++----
 drivers/gpu/drm/i915/intel_ringbuffer.h      | 13 ++++++++++++-
 drivers/gpu/drm/i915/selftests/mock_engine.c |  1 +
 6 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 9533a85cb0b3..0a3d94517d0a 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915)
 	if (!i915->gt.active_requests)
 		return;
 
-	list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
+	list_for_each_entry_safe(ring, tmp,
+				 &i915->gt.active_rings, active_link) {
+		intel_ring_get(ring); /* last rq holds reference! */
 		ring_retire_requests(ring);
+		intel_ring_put(ring);
+	}
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
index 79a166b9a81b..549fdfca17aa 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/intel_engine_types.h
@@ -9,6 +9,7 @@
 
 #include <linux/hashtable.h>
 #include <linux/irq_work.h>
+#include <linux/kref.h>
 #include <linux/list.h>
 #include <linux/types.h>
 
@@ -58,6 +59,7 @@ struct intel_engine_hangcheck {
 };
 
 struct intel_ring {
+	struct kref ref;
 	struct i915_vma *vma;
 	void *vaddr;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index aa50f03ba812..d3f1fe06d013 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1236,7 +1236,7 @@ static void execlists_submit_request(struct i915_request *request)
 
 static void __execlists_context_fini(struct intel_context *ce)
 {
-	intel_ring_free(ce->ring);
+	intel_ring_put(ce->ring);
 
 	GEM_BUG_ON(i915_gem_object_is_active(ce->state->obj));
 	i915_gem_object_put(ce->state->obj);
@@ -2867,7 +2867,7 @@ static int execlists_context_deferred_alloc(struct intel_context *ce,
 	return 0;
 
 error_ring_free:
-	intel_ring_free(ring);
+	intel_ring_put(ring);
 error_deref_obj:
 	i915_gem_object_put(ctx_obj);
 	return ret;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 746fe570466c..45a54fadc482 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1302,6 +1302,7 @@ intel_engine_create_ring(struct intel_engine_cs *engine,
 	if (!ring)
 		return ERR_PTR(-ENOMEM);
 
+	kref_init(&ring->ref);
 	INIT_LIST_HEAD(&ring->request_list);
 	ring->timeline = i915_timeline_get(timeline);
 
@@ -1326,9 +1327,9 @@ intel_engine_create_ring(struct intel_engine_cs *engine,
 	return ring;
 }
 
-void
-intel_ring_free(struct intel_ring *ring)
+void intel_ring_free(struct kref *ref)
 {
+	struct intel_ring *ring = container_of(ref, typeof(*ring), ref);
 	struct drm_i915_gem_object *obj = ring->vma->obj;
 
 	i915_vma_close(ring->vma);
@@ -1571,7 +1572,7 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 err_unpin:
 	intel_ring_unpin(ring);
 err_ring:
-	intel_ring_free(ring);
+	intel_ring_put(ring);
 err:
 	intel_engine_cleanup_common(engine);
 	return err;
@@ -1585,7 +1586,7 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
 		(I915_READ_MODE(engine) & MODE_IDLE) == 0);
 
 	intel_ring_unpin(engine->buffer);
-	intel_ring_free(engine->buffer);
+	intel_ring_put(engine->buffer);
 
 	if (engine->cleanup)
 		engine->cleanup(engine);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index e612bdca9fd9..a57489fcb302 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -231,7 +231,18 @@ int intel_ring_pin(struct intel_ring *ring);
 void intel_ring_reset(struct intel_ring *ring, u32 tail);
 unsigned int intel_ring_update_space(struct intel_ring *ring);
 void intel_ring_unpin(struct intel_ring *ring);
-void intel_ring_free(struct intel_ring *ring);
+void intel_ring_free(struct kref *ref);
+
+static inline struct intel_ring *intel_ring_get(struct intel_ring *ring)
+{
+	kref_get(&ring->ref);
+	return ring;
+}
+
+static inline void intel_ring_put(struct intel_ring *ring)
+{
+	kref_put(&ring->ref, intel_ring_free);
+}
 
 void intel_engine_stop(struct intel_engine_cs *engine);
 void intel_engine_cleanup(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
index f6d120e05ee4..881450c694e9 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.c
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
@@ -57,6 +57,7 @@ static struct intel_ring *mock_ring(struct intel_engine_cs *engine)
 		return NULL;
 	}
 
+	kref_init(&ring->base.ref);
 	ring->base.size = sz;
 	ring->base.effective_size = sz;
 	ring->base.vaddr = (void *)(ring + 1);
-- 
2.20.1


* [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (2 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18 10:39   ` Tvrtko Ursulin
  2019-03-18 10:54   ` Chris Wilson
  2019-03-18  9:51 ` [PATCH 06/22] drm/i915: Hold a reference to the active HW context Chris Wilson
                   ` (19 subsequent siblings)
  23 siblings, 2 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

On unpinning the intel_context, we remove it from the active list
inside the GEM context. This list is supposed to be guarded by the GEM
context mutex, so remember to take it!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_context.c         | 15 +++++++++++----
 drivers/gpu/drm/i915/intel_lrc.c             |  3 ---
 drivers/gpu/drm/i915/intel_ringbuffer.c      |  3 ---
 drivers/gpu/drm/i915/selftests/mock_engine.c |  2 --
 4 files changed, 11 insertions(+), 12 deletions(-)
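
As an illustration of the invariant this restores (a sketch, not part
of the patch): every reader of ctx->active_engines relies on ctx->mutex,
so the unpin path must take the same lock around its list_del():

	/* sketch: any walker of the list must hold the same mutex */
	mutex_lock(&ctx->mutex);
	list_for_each_entry(ce, &ctx->active_engines, active_link)
		do_something(ce); /* hypothetical walker */
	mutex_unlock(&ctx->mutex);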

diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
index 5a16c9bb2778..0ab894a058f6 100644
--- a/drivers/gpu/drm/i915/intel_context.c
+++ b/drivers/gpu/drm/i915/intel_context.c
@@ -165,13 +165,13 @@ intel_context_pin(struct i915_gem_context *ctx,
 		if (err)
 			goto err;
 
+		i915_gem_context_get(ctx);
+		GEM_BUG_ON(ce->gem_context != ctx);
+
 		mutex_lock(&ctx->mutex);
 		list_add(&ce->active_link, &ctx->active_engines);
 		mutex_unlock(&ctx->mutex);
 
-		i915_gem_context_get(ctx);
-		GEM_BUG_ON(ce->gem_context != ctx);
-
 		smp_mb__before_atomic(); /* flush pin before it is visible */
 	}
 
@@ -194,9 +194,16 @@ void intel_context_unpin(struct intel_context *ce)
 	/* We may be called from inside intel_context_pin() to evict another */
 	mutex_lock_nested(&ce->pin_mutex, SINGLE_DEPTH_NESTING);
 
-	if (likely(atomic_dec_and_test(&ce->pin_count)))
+	if (likely(atomic_dec_and_test(&ce->pin_count))) {
 		ce->ops->unpin(ce);
 
+		mutex_lock(&ce->gem_context->mutex);
+		list_del(&ce->active_link);
+		mutex_unlock(&ce->gem_context->mutex);
+
+		i915_gem_context_put(ce->gem_context);
+	}
+
 	mutex_unlock(&ce->pin_mutex);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d3f1fe06d013..13f5545fc1d2 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1306,9 +1306,6 @@ static void execlists_context_unpin(struct intel_context *ce)
 
 	i915_gem_object_unpin_map(ce->state->obj);
 	__context_unpin(ce->state);
-
-	list_del(&ce->active_link);
-	i915_gem_context_put(ce->gem_context);
 }
 
 static void
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 45a54fadc482..6d60bc258feb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1415,9 +1415,6 @@ static void ring_context_unpin(struct intel_context *ce)
 {
 	__context_unpin_ppgtt(ce->gem_context);
 	__context_unpin(ce);
-
-	list_del(&ce->active_link);
-	i915_gem_context_put(ce->gem_context);
 }
 
 static struct i915_vma *
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
index 881450c694e9..7641b74ada98 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.c
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
@@ -126,8 +126,6 @@ static void hw_delay_complete(struct timer_list *t)
 static void mock_context_unpin(struct intel_context *ce)
 {
 	mock_timeline_unpin(ce->ring->timeline);
-	list_del(&ce->active_link);
-	i915_gem_context_put(ce->gem_context);
 }
 
 static void mock_context_destroy(struct intel_context *ce)
-- 
2.20.1


* [PATCH 06/22] drm/i915: Hold a reference to the active HW context
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (3 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18 12:54   ` Tvrtko Ursulin
  2019-03-18  9:51 ` [PATCH 07/22] drm/i915: Stop needlessly acquiring wakeref for debugfs/drop_caches_set Chris Wilson
                   ` (18 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

For virtual engines, we need to keep the HW context alive while it
remains in use. Regular HW contexts are created and kept alive until
the end of the GEM context. For simplicity, generalise the
requirements and keep an active reference to each HW context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c      |  2 +-
 drivers/gpu/drm/i915/intel_context.c         |  6 ++++++
 drivers/gpu/drm/i915/intel_context.h         | 11 +++++++++++
 drivers/gpu/drm/i915/intel_context_types.h   |  6 +++++-
 drivers/gpu/drm/i915/intel_lrc.c             |  4 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.c      |  4 +++-
 drivers/gpu/drm/i915/selftests/mock_engine.c |  7 ++++++-
 7 files changed, 35 insertions(+), 5 deletions(-)
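
For illustration, the intended reference pattern is roughly (a sketch,
not part of the patch):

	intel_context_get(ce);	/* keep the HW context alive while in use */
	/* ... submit requests against ce ... */
	intel_context_put(ce);	/* final put invokes ce->ops->destroy() */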

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 21208a865380..d776d43707e0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -232,7 +232,7 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	i915_ppgtt_put(ctx->ppgtt);
 
 	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
-		it->ops->destroy(it);
+		intel_context_put(it);
 
 	kfree(ctx->name);
 	put_pid(ctx->pid);
diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
index 0ab894a058f6..8931e0fee873 100644
--- a/drivers/gpu/drm/i915/intel_context.c
+++ b/drivers/gpu/drm/i915/intel_context.c
@@ -172,6 +172,7 @@ intel_context_pin(struct i915_gem_context *ctx,
 		list_add(&ce->active_link, &ctx->active_engines);
 		mutex_unlock(&ctx->mutex);
 
+		intel_context_get(ce);
 		smp_mb__before_atomic(); /* flush pin before it is visible */
 	}
 
@@ -192,6 +193,7 @@ void intel_context_unpin(struct intel_context *ce)
 		return;
 
 	/* We may be called from inside intel_context_pin() to evict another */
+	intel_context_get(ce);
 	mutex_lock_nested(&ce->pin_mutex, SINGLE_DEPTH_NESTING);
 
 	if (likely(atomic_dec_and_test(&ce->pin_count))) {
@@ -202,9 +204,11 @@ void intel_context_unpin(struct intel_context *ce)
 		mutex_unlock(&ce->gem_context->mutex);
 
 		i915_gem_context_put(ce->gem_context);
+		intel_context_put(ce);
 	}
 
 	mutex_unlock(&ce->pin_mutex);
+	intel_context_put(ce);
 }
 
 static void intel_context_retire(struct i915_active_request *active,
@@ -221,6 +225,8 @@ intel_context_init(struct intel_context *ce,
 		   struct i915_gem_context *ctx,
 		   struct intel_engine_cs *engine)
 {
+	kref_init(&ce->ref);
+
 	ce->gem_context = ctx;
 	ce->engine = engine;
 	ce->ops = engine->cops;
diff --git a/drivers/gpu/drm/i915/intel_context.h b/drivers/gpu/drm/i915/intel_context.h
index 9546d932406a..ebc861b1a49e 100644
--- a/drivers/gpu/drm/i915/intel_context.h
+++ b/drivers/gpu/drm/i915/intel_context.h
@@ -73,4 +73,15 @@ static inline void __intel_context_pin(struct intel_context *ce)
 
 void intel_context_unpin(struct intel_context *ce);
 
+static inline struct intel_context *intel_context_get(struct intel_context *ce)
+{
+	kref_get(&ce->ref);
+	return ce;
+}
+
+static inline void intel_context_put(struct intel_context *ce)
+{
+	kref_put(&ce->ref, ce->ops->destroy);
+}
+
 #endif /* __INTEL_CONTEXT_H__ */
diff --git a/drivers/gpu/drm/i915/intel_context_types.h b/drivers/gpu/drm/i915/intel_context_types.h
index 6dc9b4b9067b..624729a35875 100644
--- a/drivers/gpu/drm/i915/intel_context_types.h
+++ b/drivers/gpu/drm/i915/intel_context_types.h
@@ -7,6 +7,7 @@
 #ifndef __INTEL_CONTEXT_TYPES__
 #define __INTEL_CONTEXT_TYPES__
 
+#include <linux/kref.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
 #include <linux/rbtree.h>
@@ -22,7 +23,8 @@ struct intel_ring;
 struct intel_context_ops {
 	int (*pin)(struct intel_context *ce);
 	void (*unpin)(struct intel_context *ce);
-	void (*destroy)(struct intel_context *ce);
+
+	void (*destroy)(struct kref *kref);
 };
 
 /*
@@ -36,6 +38,8 @@ struct intel_sseu {
 };
 
 struct intel_context {
+	struct kref ref;
+
 	struct i915_gem_context *gem_context;
 	struct intel_engine_cs *engine;
 	struct intel_engine_cs *active;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 13f5545fc1d2..fbf67105f040 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1242,8 +1242,10 @@ static void __execlists_context_fini(struct intel_context *ce)
 	i915_gem_object_put(ce->state->obj);
 }
 
-static void execlists_context_destroy(struct intel_context *ce)
+static void execlists_context_destroy(struct kref *kref)
 {
+	struct intel_context *ce = container_of(kref, typeof(*ce), ref);
+
 	GEM_BUG_ON(intel_context_is_pinned(ce));
 
 	if (ce->state)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 6d60bc258feb..35fdebd67e5f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1345,8 +1345,10 @@ static void __ring_context_fini(struct intel_context *ce)
 	i915_gem_object_put(ce->state->obj);
 }
 
-static void ring_context_destroy(struct intel_context *ce)
+static void ring_context_destroy(struct kref *ref)
 {
+	struct intel_context *ce = container_of(ref, typeof(*ce), ref);
+
 	GEM_BUG_ON(intel_context_is_pinned(ce));
 
 	if (ce->state)
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
index 7641b74ada98..639d36eb904a 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.c
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
@@ -128,12 +128,16 @@ static void mock_context_unpin(struct intel_context *ce)
 	mock_timeline_unpin(ce->ring->timeline);
 }
 
-static void mock_context_destroy(struct intel_context *ce)
+static void mock_context_destroy(struct kref *ref)
 {
+	struct intel_context *ce = container_of(ref, typeof(*ce), ref);
+
 	GEM_BUG_ON(intel_context_is_pinned(ce));
 
 	if (ce->ring)
 		mock_ring_free(ce->ring);
+
+	intel_context_free(ce);
 }
 
 static int mock_context_pin(struct intel_context *ce)
@@ -151,6 +155,7 @@ static int mock_context_pin(struct intel_context *ce)
 static const struct intel_context_ops mock_context_ops = {
 	.pin = mock_context_pin,
 	.unpin = mock_context_unpin,
+
 	.destroy = mock_context_destroy,
 };
 
-- 
2.20.1


* [PATCH 07/22] drm/i915: Stop needlessly acquiring wakeref for debugfs/drop_caches_set
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (4 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 06/22] drm/i915: Hold a reference to the active HW context Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18 13:08   ` Tvrtko Ursulin
  2019-03-18  9:51 ` [PATCH 08/22] drm/i915/selftests: Provide stub reset functions Chris Wilson
                   ` (17 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx; +Cc: Yokoyama

We only need to acquire a wakeref for ourselves for a few operations, as
most either already acquire their own wakeref or imply a wakeref. In
particular, it was i915_gem_set_wedged() that required the caller to
hold a wakeref for it, which is incongruous with its "use anywhere"
ability.

Suggested-by: Yokoyama, Caz <caz.yokoyama@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Yokoyama, Caz <caz.yokoyama@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 12 ++++--------
 drivers/gpu/drm/i915/i915_reset.c   |  4 +++-
 2 files changed, 7 insertions(+), 9 deletions(-)
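
For reference, the wakeref is now scoped to just the wedging operation
itself, along the lines of (sketch):

	intel_wakeref_t wakeref;

	with_intel_runtime_pm(i915, wakeref)
		__i915_gem_set_wedged(i915); /* wakeref held only here */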

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6a90558de213..08683dca7775 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -3888,12 +3888,9 @@ static int
 i915_drop_caches_set(void *data, u64 val)
 {
 	struct drm_i915_private *i915 = data;
-	intel_wakeref_t wakeref;
-	int ret = 0;
 
 	DRM_DEBUG("Dropping caches: 0x%08llx [0x%08llx]\n",
 		  val, val & DROP_ALL);
-	wakeref = intel_runtime_pm_get(i915);
 
 	if (val & DROP_RESET_ACTIVE &&
 	    wait_for(intel_engines_are_idle(i915), I915_IDLE_ENGINES_TIMEOUT))
@@ -3902,9 +3899,11 @@ i915_drop_caches_set(void *data, u64 val)
 	/* No need to check and wait for gpu resets, only libdrm auto-restarts
 	 * on ioctls on -EAGAIN. */
 	if (val & (DROP_ACTIVE | DROP_RETIRE | DROP_RESET_SEQNO)) {
+		int ret;
+
 		ret = mutex_lock_interruptible(&i915->drm.struct_mutex);
 		if (ret)
-			goto out;
+			return ret;
 
 		if (val & DROP_ACTIVE)
 			ret = i915_gem_wait_for_idle(i915,
@@ -3943,10 +3942,7 @@ i915_drop_caches_set(void *data, u64 val)
 	if (val & DROP_FREED)
 		i915_gem_drain_freed_objects(i915);
 
-out:
-	intel_runtime_pm_put(i915, wakeref);
-
-	return ret;
+	return 0;
 }
 
 DEFINE_SIMPLE_ATTRIBUTE(i915_drop_caches_fops,
diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
index b8daec7ddc06..e61bfa0fc4e0 100644
--- a/drivers/gpu/drm/i915/i915_reset.c
+++ b/drivers/gpu/drm/i915/i915_reset.c
@@ -863,9 +863,11 @@ static void __i915_gem_set_wedged(struct drm_i915_private *i915)
 void i915_gem_set_wedged(struct drm_i915_private *i915)
 {
 	struct i915_gpu_error *error = &i915->gpu_error;
+	intel_wakeref_t wakeref;
 
 	mutex_lock(&error->wedge_mutex);
-	__i915_gem_set_wedged(i915);
+	with_intel_runtime_pm(i915, wakeref)
+		__i915_gem_set_wedged(i915);
 	mutex_unlock(&error->wedge_mutex);
 }
 
-- 
2.20.1


* [PATCH 08/22] drm/i915/selftests: Provide stub reset functions
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (5 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 07/22] drm/i915: Stop needlessly acquiring wakeref for debugfs/drop_caches_set Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18  9:51 ` [PATCH 09/22] drm/i915: Switch to use HWS indices rather than addresses Chris Wilson
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

If a test fails, we quite often mark the device as wedged. Provide the
stub functions so that we can wedge the mock device, and avoid exploding
on test failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109981
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/selftests/mock_engine.c | 36 ++++++++++++++++++++
 1 file changed, 36 insertions(+)
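
For context, the rough shape of the wedging flow these stubs have to
survive is (an illustrative sketch, simplified from
__i915_gem_set_wedged(); not part of the patch):

	for_each_engine(engine, i915, id)
		engine->reset.prepare(engine);
	/* ... */
	for_each_engine(engine, i915, id) {
		engine->cancel_requests(engine);
		engine->reset.finish(engine);
	}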

diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
index 639d36eb904a..61744819172b 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.c
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
@@ -198,6 +198,37 @@ static void mock_submit_request(struct i915_request *request)
 	spin_unlock_irqrestore(&engine->hw_lock, flags);
 }
 
+static void mock_reset_prepare(struct intel_engine_cs *engine)
+{
+}
+
+static void mock_reset(struct intel_engine_cs *engine, bool stalled)
+{
+	GEM_BUG_ON(stalled);
+}
+
+static void mock_reset_finish(struct intel_engine_cs *engine)
+{
+}
+
+static void mock_cancel_requests(struct intel_engine_cs *engine)
+{
+	struct i915_request *request;
+	unsigned long flags;
+
+	spin_lock_irqsave(&engine->timeline.lock, flags);
+
+	/* Mark all submitted requests as skipped. */
+	list_for_each_entry(request, &engine->timeline.requests, sched.link) {
+		if (!i915_request_signaled(request))
+			dma_fence_set_error(&request->fence, -EIO);
+
+		i915_request_mark_complete(request);
+	}
+
+	spin_unlock_irqrestore(&engine->timeline.lock, flags);
+}
+
 struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 				    const char *name,
 				    int id)
@@ -223,6 +254,11 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	engine->base.emit_fini_breadcrumb = mock_emit_breadcrumb;
 	engine->base.submit_request = mock_submit_request;
 
+	engine->base.reset.prepare = mock_reset_prepare;
+	engine->base.reset.reset = mock_reset;
+	engine->base.reset.finish = mock_reset_finish;
+	engine->base.cancel_requests = mock_cancel_requests;
+
 	if (i915_timeline_init(i915,
 			       &engine->base.timeline,
 			       engine->base.name,
-- 
2.20.1


* [PATCH 09/22] drm/i915: Switch to use HWS indices rather than addresses
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (6 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 08/22] drm/i915/selftests: Provide stub reset functions Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18 13:21   ` Tvrtko Ursulin
  2019-03-18  9:51 ` [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace Chris Wilson
                   ` (15 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

If we use the STORE_DATA_INDEX function, we can use a fixed offset and
avoid having to look up the engine HWS address. A step closer to being
able to emit the final breadcrumb during request_add rather than later
in the submission interrupt handler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_guc_submission.c |  3 ++-
 drivers/gpu/drm/i915/intel_lrc.c            | 17 +++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.c     | 16 ++++++----------
 drivers/gpu/drm/i915/intel_ringbuffer.h     |  4 ++--
 4 files changed, 17 insertions(+), 23 deletions(-)
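
With the STORE_INDEX variants, the address dword is interpreted as an
offset into the engine's status page rather than a full GGTT address,
so the breadcrumb emission reduces to roughly (sketch):

	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW |
		MI_FLUSH_DW_STORE_INDEX;
	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR | MI_FLUSH_DW_USE_GTT;
	*cs++ = 0;
	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);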

diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index 4a5727233419..c4ad73980988 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -583,7 +583,8 @@ static void inject_preempt_context(struct work_struct *work)
 		} else {
 			cs = gen8_emit_ggtt_write(cs,
 						  GUC_PREEMPT_FINISHED,
-						  addr);
+						  addr,
+						  0);
 			*cs++ = MI_NOOP;
 			*cs++ = MI_NOOP;
 		}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index fbf67105f040..7e0c20a2d733 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -173,12 +173,6 @@ static void execlists_init_reg_state(u32 *reg_state,
 				     struct intel_engine_cs *engine,
 				     struct intel_ring *ring);
 
-static inline u32 intel_hws_hangcheck_address(struct intel_engine_cs *engine)
-{
-	return (i915_ggtt_offset(engine->status_page.vma) +
-		I915_GEM_HWS_HANGCHECK_ADDR);
-}
-
 static inline struct i915_priolist *to_priolist(struct rb_node *rb)
 {
 	return rb_entry(rb, struct i915_priolist, node);
@@ -2213,11 +2207,13 @@ static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs)
 {
 	cs = gen8_emit_ggtt_write(cs,
 				  request->fence.seqno,
-				  request->timeline->hwsp_offset);
+				  request->timeline->hwsp_offset,
+				  0);
 
 	cs = gen8_emit_ggtt_write(cs,
 				  intel_engine_next_hangcheck_seqno(request->engine),
-				  intel_hws_hangcheck_address(request->engine));
+				  I915_GEM_HWS_HANGCHECK_ADDR,
+				  MI_FLUSH_DW_STORE_INDEX);
 
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
@@ -2241,8 +2238,8 @@ static u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs)
 
 	cs = gen8_emit_ggtt_write_rcs(cs,
 				      intel_engine_next_hangcheck_seqno(request->engine),
-				      intel_hws_hangcheck_address(request->engine),
-				      0);
+				      I915_GEM_HWS_HANGCHECK_ADDR,
+				      PIPE_CONTROL_STORE_DATA_INDEX);
 
 	*cs++ = MI_USER_INTERRUPT;
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 35fdebd67e5f..0310d5d53bf9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -43,12 +43,6 @@
  */
 #define LEGACY_REQUEST_SIZE 200
 
-static inline u32 hws_hangcheck_address(struct intel_engine_cs *engine)
-{
-	return (i915_ggtt_offset(engine->status_page.vma) +
-		I915_GEM_HWS_HANGCHECK_ADDR);
-}
-
 unsigned int intel_ring_update_space(struct intel_ring *ring)
 {
 	unsigned int space;
@@ -317,8 +311,8 @@ static u32 *gen6_rcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = rq->fence.seqno;
 
 	*cs++ = GFX_OP_PIPE_CONTROL(4);
-	*cs++ = PIPE_CONTROL_QW_WRITE;
-	*cs++ = hws_hangcheck_address(rq->engine) | PIPE_CONTROL_GLOBAL_GTT;
+	*cs++ = PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_STORE_DATA_INDEX;
+	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR | PIPE_CONTROL_GLOBAL_GTT;
 	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
 
 	*cs++ = MI_USER_INTERRUPT;
@@ -423,8 +417,10 @@ static u32 *gen7_rcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
 	*cs++ = rq->fence.seqno;
 
 	*cs++ = GFX_OP_PIPE_CONTROL(4);
-	*cs++ = PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_GLOBAL_GTT_IVB;
-	*cs++ = hws_hangcheck_address(rq->engine);
+	*cs++ = (PIPE_CONTROL_QW_WRITE |
+		 PIPE_CONTROL_STORE_DATA_INDEX |
+		 PIPE_CONTROL_GLOBAL_GTT_IVB);
+	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR;
 	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
 
 	*cs++ = MI_USER_INTERRUPT;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index a57489fcb302..a02c92dac5da 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -419,14 +419,14 @@ gen8_emit_ggtt_write_rcs(u32 *cs, u32 value, u32 gtt_offset, u32 flags)
 }
 
 static inline u32 *
-gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset)
+gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset, u32 flags)
 {
 	/* w/a: bit 5 needs to be zero for MI_FLUSH_DW address. */
 	GEM_BUG_ON(gtt_offset & (1 << 5));
 	/* Offset should be aligned to 8 bytes for both (QW/DW) write types */
 	GEM_BUG_ON(!IS_ALIGNED(gtt_offset, 8));
 
-	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW;
+	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW | flags;
 	*cs++ = gtt_offset | MI_FLUSH_DW_USE_GTT;
 	*cs++ = 0;
 	*cs++ = value;
-- 
2.20.1


* [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (7 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 09/22] drm/i915: Switch to use HWS indices rather than addresses Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18 16:22   ` Tvrtko Ursulin
  2019-03-18  9:51 ` [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr Chris Wilson
                   ` (14 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

In later patches, it became apparent that userspace can see a partially
constructed GEM context and begin using it before it is ready, to much
hilarity. Close this window of opportunity by lifting the registration of
the context with userspace (the insertion of the context into the filp's
idr) to the very end of the CONTEXT_CREATE ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c       | 143 +++++++++++-------
 drivers/gpu/drm/i915/i915_gem_gtt.c           |   7 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h           |   8 +-
 drivers/gpu/drm/i915/selftests/huge_pages.c   |   2 +-
 .../gpu/drm/i915/selftests/i915_gem_context.c |  12 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   2 +-
 drivers/gpu/drm/i915/selftests/mock_context.c |  17 ++-
 7 files changed, 116 insertions(+), 75 deletions(-)
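
The construction flow thus splits into two phases, roughly (sketch;
locking elided):

	ctx = i915_gem_create_context(i915);	/* fully built, unpublished */
	if (IS_ERR(ctx))
		return PTR_ERR(ctx);

	err = gem_context_register(ctx, file_priv); /* idr insert goes last */
	if (err)
		context_close(ctx);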

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index d776d43707e0..5df3d423ec6c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -337,15 +337,13 @@ static u32 default_desc_template(const struct drm_i915_private *i915,
 }
 
 static struct i915_gem_context *
-__create_hw_context(struct drm_i915_private *dev_priv,
-		    struct drm_i915_file_private *file_priv)
+__create_context(struct drm_i915_private *dev_priv)
 {
 	struct i915_gem_context *ctx;
-	int ret;
 	int i;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
-	if (ctx == NULL)
+	if (!ctx)
 		return ERR_PTR(-ENOMEM);
 
 	kref_init(&ctx->ref);
@@ -362,29 +360,6 @@ __create_hw_context(struct drm_i915_private *dev_priv,
 	INIT_LIST_HEAD(&ctx->handles_list);
 	INIT_LIST_HEAD(&ctx->hw_id_link);
 
-	/* Default context will never have a file_priv */
-	ret = DEFAULT_CONTEXT_HANDLE;
-	if (file_priv) {
-		ret = idr_alloc(&file_priv->context_idr, ctx,
-				DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
-		if (ret < 0)
-			goto err_lut;
-	}
-	ctx->user_handle = ret;
-
-	ctx->file_priv = file_priv;
-	if (file_priv) {
-		ctx->pid = get_task_pid(current, PIDTYPE_PID);
-		ctx->name = kasprintf(GFP_KERNEL, "%s[%d]/%x",
-				      current->comm,
-				      pid_nr(ctx->pid),
-				      ctx->user_handle);
-		if (!ctx->name) {
-			ret = -ENOMEM;
-			goto err_pid;
-		}
-	}
-
 	/* NB: Mark all slices as needing a remap so that when the context first
 	 * loads it will restore whatever remap state already exists. If there
 	 * is no remap info, it will be a NOP. */
@@ -401,25 +376,10 @@ __create_hw_context(struct drm_i915_private *dev_priv,
 		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
 
 	return ctx;
-
-err_pid:
-	put_pid(ctx->pid);
-	idr_remove(&file_priv->context_idr, ctx->user_handle);
-err_lut:
-	context_close(ctx);
-	return ERR_PTR(ret);
-}
-
-static void __destroy_hw_context(struct i915_gem_context *ctx,
-				 struct drm_i915_file_private *file_priv)
-{
-	idr_remove(&file_priv->context_idr, ctx->user_handle);
-	context_close(ctx);
 }
 
 static struct i915_gem_context *
-i915_gem_create_context(struct drm_i915_private *dev_priv,
-			struct drm_i915_file_private *file_priv)
+i915_gem_create_context(struct drm_i915_private *dev_priv)
 {
 	struct i915_gem_context *ctx;
 
@@ -428,18 +388,18 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 	/* Reap the most stale context */
 	contexts_free_first(dev_priv);
 
-	ctx = __create_hw_context(dev_priv, file_priv);
+	ctx = __create_context(dev_priv);
 	if (IS_ERR(ctx))
 		return ctx;
 
 	if (HAS_FULL_PPGTT(dev_priv)) {
 		struct i915_hw_ppgtt *ppgtt;
 
-		ppgtt = i915_ppgtt_create(dev_priv, file_priv);
+		ppgtt = i915_ppgtt_create(dev_priv);
 		if (IS_ERR(ppgtt)) {
 			DRM_DEBUG_DRIVER("PPGTT setup failed (%ld)\n",
 					 PTR_ERR(ppgtt));
-			__destroy_hw_context(ctx, file_priv);
+			context_close(ctx);
 			return ERR_CAST(ppgtt);
 		}
 
@@ -475,7 +435,7 @@ i915_gem_context_create_gvt(struct drm_device *dev)
 	if (ret)
 		return ERR_PTR(ret);
 
-	ctx = i915_gem_create_context(to_i915(dev), NULL);
+	ctx = i915_gem_create_context(to_i915(dev));
 	if (IS_ERR(ctx))
 		goto out;
 
@@ -511,7 +471,7 @@ i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio)
 	struct i915_gem_context *ctx;
 	int err;
 
-	ctx = i915_gem_create_context(i915, NULL);
+	ctx = i915_gem_create_context(i915);
 	if (IS_ERR(ctx))
 		return ctx;
 
@@ -625,25 +585,79 @@ static int context_idr_cleanup(int id, void *p, void *data)
 	return 0;
 }
 
+static int gem_context_register(struct i915_gem_context *ctx,
+				struct drm_i915_file_private *fpriv)
+{
+	int ret;
+
+	ctx->pid = get_task_pid(current, PIDTYPE_PID);
+
+	if (ctx->ppgtt)
+		ctx->ppgtt->vm.file = fpriv;
+
+	/* And (nearly) finally expose ourselves to userspace via the idr */
+	ret = idr_alloc(&fpriv->context_idr, ctx,
+			DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
+	if (ret < 0)
+		goto err_pid;
+
+	ctx->file_priv = fpriv;
+	ctx->user_handle = ret;
+
+	ctx->name = kasprintf(GFP_KERNEL, "%s[%d]/%x",
+			      current->comm,
+			      pid_nr(ctx->pid),
+			      ctx->user_handle);
+	if (!ctx->name) {
+		ret = -ENOMEM;
+		goto err_idr;
+	}
+
+	return 0;
+
+err_idr:
+	idr_remove(&fpriv->context_idr, ctx->user_handle);
+	ctx->file_priv = NULL;
+err_pid:
+	put_pid(ctx->pid);
+	ctx->pid = NULL;
+	return ret;
+}
+
 int i915_gem_context_open(struct drm_i915_private *i915,
 			  struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct i915_gem_context *ctx;
+	int err;
 
 	idr_init(&file_priv->context_idr);
 
 	mutex_lock(&i915->drm.struct_mutex);
-	ctx = i915_gem_create_context(i915, file_priv);
-	mutex_unlock(&i915->drm.struct_mutex);
+
+	ctx = i915_gem_create_context(i915);
 	if (IS_ERR(ctx)) {
-		idr_destroy(&file_priv->context_idr);
-		return PTR_ERR(ctx);
+		err = PTR_ERR(ctx);
+		goto err;
 	}
 
+	err = gem_context_register(ctx, file_priv);
+	if (err)
+		goto err_ctx;
+
+	GEM_BUG_ON(ctx->user_handle != DEFAULT_CONTEXT_HANDLE);
 	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
 
+	mutex_unlock(&i915->drm.struct_mutex);
+
 	return 0;
+
+err_ctx:
+	context_close(ctx);
+err:
+	mutex_unlock(&i915->drm.struct_mutex);
+	idr_destroy(&file_priv->context_idr);
+	return err;
 }
 
 void i915_gem_context_close(struct drm_file *file)
@@ -835,17 +849,28 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
-	ctx = i915_gem_create_context(i915, file_priv);
-	mutex_unlock(&dev->struct_mutex);
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
+	ctx = i915_gem_create_context(i915);
+	if (IS_ERR(ctx)) {
+		ret = PTR_ERR(ctx);
+		goto err_unlock;
+	}
 
-	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
+	ret = gem_context_register(ctx, file_priv);
+	if (ret)
+		goto err_ctx;
+
+	mutex_unlock(&dev->struct_mutex);
 
 	args->ctx_id = ctx->user_handle;
 	DRM_DEBUG("HW context %d created\n", args->ctx_id);
 
 	return 0;
+
+err_ctx:
+	context_close(ctx);
+err_unlock:
+	mutex_unlock(&dev->struct_mutex);
+	return ret;
 }
 
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
@@ -870,7 +895,9 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto out;
 
-	__destroy_hw_context(ctx, file_priv);
+	idr_remove(&file_priv->context_idr, ctx->user_handle);
+	context_close(ctx);
+
 	mutex_unlock(&dev->struct_mutex);
 
 out:
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b8055c8d4e71..b9e0e3a00223 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2069,8 +2069,7 @@ __hw_ppgtt_create(struct drm_i915_private *i915)
 }
 
 struct i915_hw_ppgtt *
-i915_ppgtt_create(struct drm_i915_private *i915,
-		  struct drm_i915_file_private *fpriv)
+i915_ppgtt_create(struct drm_i915_private *i915)
 {
 	struct i915_hw_ppgtt *ppgtt;
 
@@ -2078,8 +2077,6 @@ i915_ppgtt_create(struct drm_i915_private *i915,
 	if (IS_ERR(ppgtt))
 		return ppgtt;
 
-	ppgtt->vm.file = fpriv;
-
 	trace_i915_ppgtt_create(&ppgtt->vm);
 
 	return ppgtt;
@@ -2657,7 +2654,7 @@ int i915_gem_init_aliasing_ppgtt(struct drm_i915_private *i915)
 	struct i915_hw_ppgtt *ppgtt;
 	int err;
 
-	ppgtt = i915_ppgtt_create(i915, ERR_PTR(-EPERM));
+	ppgtt = i915_ppgtt_create(i915);
 	if (IS_ERR(ppgtt))
 		return PTR_ERR(ppgtt);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 35f21a2ae36c..b76ab4c2a0e6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -603,15 +603,17 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv);
 void i915_ggtt_cleanup_hw(struct drm_i915_private *dev_priv);
 
 int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv);
-void i915_ppgtt_release(struct kref *kref);
-struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_i915_private *dev_priv,
-					struct drm_i915_file_private *fpriv);
+
+struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_i915_private *dev_priv);
 void i915_ppgtt_close(struct i915_address_space *vm);
+void i915_ppgtt_release(struct kref *kref);
+
 static inline void i915_ppgtt_get(struct i915_hw_ppgtt *ppgtt)
 {
 	if (ppgtt)
 		kref_get(&ppgtt->ref);
 }
+
 static inline void i915_ppgtt_put(struct i915_hw_ppgtt *ppgtt)
 {
 	if (ppgtt)
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index 218cfc361de3..c5c8ba6c059f 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -1710,7 +1710,7 @@ int i915_gem_huge_page_mock_selftests(void)
 	mkwrite_device_info(dev_priv)->ppgtt_size = 48;
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	ppgtt = i915_ppgtt_create(dev_priv, ERR_PTR(-ENODEV));
+	ppgtt = i915_ppgtt_create(dev_priv);
 	if (IS_ERR(ppgtt)) {
 		err = PTR_ERR(ppgtt);
 		goto out_unlock;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index f18c78ebff07..4dc96e28d89f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -76,7 +76,7 @@ static int live_nop_switch(void *arg)
 	}
 
 	for (n = 0; n < nctx; n++) {
-		ctx[n] = i915_gem_create_context(i915, file->driver_priv);
+		ctx[n] = live_context(i915, file);
 		if (IS_ERR(ctx[n])) {
 			err = PTR_ERR(ctx[n]);
 			goto out_unlock;
@@ -514,7 +514,7 @@ static int igt_ctx_exec(void *arg)
 		struct i915_gem_context *ctx;
 		unsigned int id;
 
-		ctx = i915_gem_create_context(i915, file->driver_priv);
+		ctx = live_context(i915, file);
 		if (IS_ERR(ctx)) {
 			err = PTR_ERR(ctx);
 			goto out_unlock;
@@ -960,7 +960,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
 
 	mutex_lock(&i915->drm.struct_mutex);
 
-	ctx = i915_gem_create_context(i915, file->driver_priv);
+	ctx = live_context(i915, file);
 	if (IS_ERR(ctx)) {
 		ret = PTR_ERR(ctx);
 		goto out_unlock;
@@ -1070,7 +1070,7 @@ static int igt_ctx_readonly(void *arg)
 	if (err)
 		goto out_unlock;
 
-	ctx = i915_gem_create_context(i915, file->driver_priv);
+	ctx = live_context(i915, file);
 	if (IS_ERR(ctx)) {
 		err = PTR_ERR(ctx);
 		goto out_unlock;
@@ -1390,13 +1390,13 @@ static int igt_vm_isolation(void *arg)
 	if (err)
 		goto out_unlock;
 
-	ctx_a = i915_gem_create_context(i915, file->driver_priv);
+	ctx_a = live_context(i915, file);
 	if (IS_ERR(ctx_a)) {
 		err = PTR_ERR(ctx_a);
 		goto out_unlock;
 	}
 
-	ctx_b = i915_gem_create_context(i915, file->driver_priv);
+	ctx_b = live_context(i915, file);
 	if (IS_ERR(ctx_b)) {
 		err = PTR_ERR(ctx_b);
 		goto out_unlock;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 826fd51c331e..01084f6b4fb7 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -1010,7 +1010,7 @@ static int exercise_ppgtt(struct drm_i915_private *dev_priv,
 		return PTR_ERR(file);
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	ppgtt = i915_ppgtt_create(dev_priv, file->driver_priv);
+	ppgtt = i915_ppgtt_create(dev_priv);
 	if (IS_ERR(ppgtt)) {
 		err = PTR_ERR(ppgtt);
 		goto out_unlock;
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index 8efa6892c6cd..1cc8be732435 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -88,9 +88,24 @@ void mock_init_contexts(struct drm_i915_private *i915)
 struct i915_gem_context *
 live_context(struct drm_i915_private *i915, struct drm_file *file)
 {
+	struct i915_gem_context *ctx;
+	int err;
+
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
-	return i915_gem_create_context(i915, file->driver_priv);
+	ctx = i915_gem_create_context(i915);
+	if (IS_ERR(ctx))
+		return ctx;
+
+	err = gem_context_register(ctx, file->driver_priv);
+	if (err)
+		goto err_ctx;
+
+	return ctx;
+
+err_ctx:
+	i915_gem_context_put(ctx);
+	return ERR_PTR(err);
 }
 
 struct i915_gem_context *
-- 
2.20.1


* [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (8 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18 16:28   ` Tvrtko Ursulin
  2019-03-18  9:51 ` [PATCH 12/22] drm/i915: Introduce the i915_user_extension_method Chris Wilson
                   ` (13 subsequent siblings)
  23 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

Define a mutex for the exclusive use of interacting with the per-file
context-idr, which was previously guarded by struct_mutex. This allows us
to reduce the coverage of struct_mutex, with a view to removing the last
bits coordinating GEM context later. (In the short term, we avoid taking
struct_mutex while using the extended constructor functions, preventing
some nasty recursion.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c | 43 +++++++++++--------------
 2 files changed, 21 insertions(+), 24 deletions(-)
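
All access to the per-file idr then follows the one pattern (sketch):

	mutex_lock(&file_priv->context_lock);
	ctx = idr_remove(&file_priv->context_idr, id);
	mutex_unlock(&file_priv->context_lock);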

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 86080a6e0f45..90389333dd47 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -216,7 +216,9 @@ struct drm_i915_file_private {
  */
 #define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20)
 	} mm;
+
 	struct idr context_idr;
+	struct mutex context_lock; /* guards context_idr */
 
 	unsigned int bsd_engine;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5df3d423ec6c..94c466d4b29e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -579,9 +579,7 @@ void i915_gem_contexts_fini(struct drm_i915_private *i915)
 
 static int context_idr_cleanup(int id, void *p, void *data)
 {
-	struct i915_gem_context *ctx = p;
-
-	context_close(ctx);
+	context_close(p);
 	return 0;
 }
 
@@ -596,8 +594,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
 		ctx->ppgtt->vm.file = fpriv;
 
 	/* And (nearly) finally expose ourselves to userspace via the idr */
+	mutex_lock(&fpriv->context_lock);
 	ret = idr_alloc(&fpriv->context_idr, ctx,
 			DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
+	mutex_unlock(&fpriv->context_lock);
 	if (ret < 0)
 		goto err_pid;
 
@@ -616,7 +616,9 @@ static int gem_context_register(struct i915_gem_context *ctx,
 	return 0;
 
 err_idr:
+	mutex_lock(&fpriv->context_lock);
 	idr_remove(&fpriv->context_idr, ctx->user_handle);
+	mutex_unlock(&fpriv->context_lock);
 	ctx->file_priv = NULL;
 err_pid:
 	put_pid(ctx->pid);
@@ -632,10 +634,11 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	int err;
 
 	idr_init(&file_priv->context_idr);
+	mutex_init(&file_priv->context_lock);
 
 	mutex_lock(&i915->drm.struct_mutex);
-
 	ctx = i915_gem_create_context(i915);
+	mutex_unlock(&i915->drm.struct_mutex);
 	if (IS_ERR(ctx)) {
 		err = PTR_ERR(ctx);
 		goto err;
@@ -648,14 +651,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	GEM_BUG_ON(ctx->user_handle != DEFAULT_CONTEXT_HANDLE);
 	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
 
-	mutex_unlock(&i915->drm.struct_mutex);
-
 	return 0;
 
 err_ctx:
+	mutex_lock(&i915->drm.struct_mutex);
 	context_close(ctx);
-err:
 	mutex_unlock(&i915->drm.struct_mutex);
+err:
+	mutex_destroy(&file_priv->context_lock);
 	idr_destroy(&file_priv->context_idr);
 	return err;
 }
@@ -668,6 +671,7 @@ void i915_gem_context_close(struct drm_file *file)
 
 	idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
 	idr_destroy(&file_priv->context_idr);
+	mutex_destroy(&file_priv->context_lock);
 }
 
 static struct i915_request *
@@ -850,25 +854,22 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return ret;
 
 	ctx = i915_gem_create_context(i915);
-	if (IS_ERR(ctx)) {
-		ret = PTR_ERR(ctx);
-		goto err_unlock;
-	}
+	mutex_unlock(&dev->struct_mutex);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
 
 	ret = gem_context_register(ctx, file_priv);
 	if (ret)
 		goto err_ctx;
 
-	mutex_unlock(&dev->struct_mutex);
-
 	args->ctx_id = ctx->user_handle;
 	DRM_DEBUG("HW context %d created\n", args->ctx_id);
 
 	return 0;
 
 err_ctx:
+	mutex_lock(&dev->struct_mutex);
 	context_close(ctx);
-err_unlock:
 	mutex_unlock(&dev->struct_mutex);
 	return ret;
 }
@@ -879,7 +880,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_gem_context_destroy *args = data;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct i915_gem_context *ctx;
-	int ret;
 
 	if (args->pad != 0)
 		return -EINVAL;
@@ -887,21 +887,16 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
 		return -ENOENT;
 
-	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
+	mutex_lock(&file_priv->context_lock);
+	ctx = idr_remove(&file_priv->context_idr, args->ctx_id);
+	mutex_unlock(&file_priv->context_lock);
 	if (!ctx)
 		return -ENOENT;
 
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		goto out;
-
-	idr_remove(&file_priv->context_idr, ctx->user_handle);
+	mutex_lock(&dev->struct_mutex);
 	context_close(ctx);
-
 	mutex_unlock(&dev->struct_mutex);
 
-out:
-	i915_gem_context_put(ctx);
 	return 0;
 }
 
-- 
2.20.1


* [PATCH 12/22] drm/i915: Introduce the i915_user_extension_method
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (9 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18  9:51 ` [PATCH 13/22] drm/i915: Create/destroy VM (ppGTT) for use with contexts Chris Wilson
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

An idea for extending uABI inspired by Vulkan's extension chains.
Instead of expanding the data struct for each ioctl every time we need
to add a new feature, define an extension chain. As we add
optional interfaces to control the ioctl, we define a new extension
struct that can be linked into the ioctl data only when required by the
user. The key advantage is being able to ignore large control structs
for optional interfaces/extensions, while still processing them in a
consistent manner.

In comparison to other extensible ioctls, the key difference is the
use of a linked chain of extension structs vs an array of tagged
pointers. For example,

struct drm_amdgpu_cs_chunk {
        __u32           chunk_id;
        __u32           length_dw;
        __u64           chunk_data;
};

struct drm_amdgpu_cs_in {
        __u32           ctx_id;
        __u32           bo_list_handle;
        __u32           num_chunks;
        __u32           _pad;
        __u64           chunks;
};

allows userspace to pass in an array of pointers to extension structs, but
must therefore keep constructing that array alongside the command stream.
In dynamic situations like that, a linked list is preferred and does not
suffer from extra cache line misses, as the extension structs themselves
must still be loaded separately from the chunks array.

v2: Apply the tail call optimisation directly to nip the worry of stack
overflow in the bud.
v3: Defend against recursion.
v4: Fixup local types to match new uabi

Opens:
- do we include the result as an out-field in each chain?
struct i915_user_extension {
	__u64 next_extension;
	__u64 name;
	__s32 result;
	__u32 mbz; /* reserved for future use */
};
* Undecided, so provision some room for future expansion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/Makefile               |  1 +
 drivers/gpu/drm/i915/i915_user_extensions.c | 61 +++++++++++++++++++++
 drivers/gpu/drm/i915/i915_user_extensions.h | 20 +++++++
 drivers/gpu/drm/i915/i915_utils.h           | 31 +++++++++++
 include/uapi/drm/i915_drm.h                 | 22 ++++++++
 5 files changed, 135 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_user_extensions.c
 create mode 100644 drivers/gpu/drm/i915/i915_user_extensions.h
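
From userspace, chaining two extensions onto an ioctl looks roughly
like this (a sketch; the extension names and the ioctl field are
hypothetical):

	struct i915_user_extension tail = {
		.next_extension = 0, /* zero terminates the chain */
		.name = HYPOTHETICAL_EXTENSION_B,
	};
	struct i915_user_extension head = {
		.next_extension = (uintptr_t)&tail,
		.name = HYPOTHETICAL_EXTENSION_A,
	};

	args.extensions = (uintptr_t)&head; /* hypothetical ioctl field */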

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 197b081769b5..1f3e8b145fc0 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -46,6 +46,7 @@ i915-y := i915_drv.o \
 	  i915_sw_fence.o \
 	  i915_syncmap.o \
 	  i915_sysfs.o \
+	  i915_user_extensions.o \
 	  intel_csr.o \
 	  intel_device_info.o \
 	  intel_pm.o \
diff --git a/drivers/gpu/drm/i915/i915_user_extensions.c b/drivers/gpu/drm/i915/i915_user_extensions.c
new file mode 100644
index 000000000000..c822d0aafd2d
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_user_extensions.c
@@ -0,0 +1,61 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#include <linux/nospec.h>
+#include <linux/sched/signal.h>
+#include <linux/uaccess.h>
+
+#include <uapi/drm/i915_drm.h>
+
+#include "i915_user_extensions.h"
+#include "i915_utils.h"
+
+int i915_user_extensions(struct i915_user_extension __user *ext,
+			 const i915_user_extension_fn *tbl,
+			 unsigned int count,
+			 void *data)
+{
+	unsigned int stackdepth = 512;
+
+	while (ext) {
+		int i, err;
+		u32 name;
+		u64 next;
+
+		if (!stackdepth--) /* recursion vs useful flexibility */
+			return -E2BIG;
+
+		err = check_user_mbz(&ext->flags);
+		if (err)
+			return err;
+
+		for (i = 0; i < ARRAY_SIZE(ext->rsvd); i++) {
+			err = check_user_mbz(&ext->rsvd[i]);
+			if (err)
+				return err;
+		}
+
+		if (get_user(name, &ext->name))
+			return -EFAULT;
+
+		err = -EINVAL;
+		if (name < count) {
+			name = array_index_nospec(name, count);
+			if (tbl[name])
+				err = tbl[name](ext, data);
+		}
+		if (err)
+			return err;
+
+		if (get_user(next, &ext->next_extension) ||
+		    overflows_type(next, ext))
+			return -EFAULT;
+
+		ext = u64_to_user_ptr(next);
+	}
+
+	return 0;
+}
diff --git a/drivers/gpu/drm/i915/i915_user_extensions.h b/drivers/gpu/drm/i915/i915_user_extensions.h
new file mode 100644
index 000000000000..a14bf6bba9a1
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_user_extensions.h
@@ -0,0 +1,20 @@
+/*
+ * SPDX-License-Identifier: MIT
+ *
+ * Copyright © 2018 Intel Corporation
+ */
+
+#ifndef I915_USER_EXTENSIONS_H
+#define I915_USER_EXTENSIONS_H
+
+struct i915_user_extension;
+
+typedef int (*i915_user_extension_fn)(struct i915_user_extension __user *ext,
+				      void *data);
+
+int i915_user_extensions(struct i915_user_extension __user *ext,
+			 const i915_user_extension_fn *tbl,
+			 unsigned int count,
+			 void *data);
+
+#endif /* I915_USER_EXTENSIONS_H */
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 540e20eb032c..2dbe8933b50a 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -105,6 +105,37 @@
 	__T;								\
 })
 
+/*
+ * container_of_user: Extract the superclass from a pointer to a member.
+ *
+ * Exactly like container_of() with the exception that it plays nicely
+ * with sparse for __user @ptr.
+ */
+#define container_of_user(ptr, type, member) ({				\
+	void __user *__mptr = (void __user *)(ptr);			\
+	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
+			 !__same_type(*(ptr), void),			\
+			 "pointer type mismatch in container_of()");	\
+	((type __user *)(__mptr - offsetof(type, member))); })
+
+/*
+ * check_user_mbz: Check that a user value exists and is zero
+ *
+ * Frequently in our uABI we reserve space for future extensions, and
+ * to ensure that userspace is prepared we enforce that space must
+ * be zero. (Then any future extension can safely assume a default value
+ * of 0.)
+ *
+ * check_user_mbz() combines checking that the user pointer is accessible
+ * and that the contained value is zero.
+ *
+ * Returns: -EFAULT if not accessible, -EINVAL if !zero, or 0 on success.
+ */
+#define check_user_mbz(U) ({						\
+	typeof(*(U)) mbz__;						\
+	get_user(mbz__, (U)) ? -EFAULT : mbz__ ? -EINVAL : 0;		\
+})
+
 static inline u64 ptr_to_u64(const void *ptr)
 {
 	return (uintptr_t)ptr;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index aa2d4c73a97d..1c69ed16a923 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -62,6 +62,28 @@ extern "C" {
 #define I915_ERROR_UEVENT		"ERROR"
 #define I915_RESET_UEVENT		"RESET"
 
+/*
+ * i915_user_extension: Base class for defining a chain of extensions
+ *
+ * Many interfaces need to grow over time. In most cases we can simply
+ * extend the struct and have userspace pass in more data. Another option,
+ * as demonstrated by Vulkan's approach to providing extensions for forward
+ * and backward compatibility, is to use a list of optional structs to
+ * provide those extra details.
+ *
+ * The key advantage to using an extension chain is that it allows us to
+ * redefine the interface more easily than an ever growing struct of
+ * increasing complexity, and for large parts of that interface to be
+ * entirely optional. The downside is more pointer chasing; chasing across
+ * the __user boundary with pointers encapsulated inside u64.
+ */
+struct i915_user_extension {
+	__u64 next_extension;
+	__u32 name;
+	__u32 flags; /* All undefined bits must be zero. */
+	__u32 rsvd[4]; /* Reserved for future use; must be zero. */
+};
+
 /*
  * MOCS indexes used for GPU surfaces, defining the cacheability of the
  * surface data and the coherency for this data wrt. CPU vs. GPU accesses.
-- 
2.20.1


* [PATCH 13/22] drm/i915: Create/destroy VM (ppGTT) for use with contexts
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (10 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 12/22] drm/i915: Introduce the i915_user_extension_method Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18  9:51 ` [PATCH 14/22] drm/i915: Extend CONTEXT_CREATE to set parameters upon construction Chris Wilson
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

In preparation for making the ppGTT binding for a context explicit (to
facilitate reusing the same ppGTT between different contexts), allow the
user to create and destroy named ppGTTs.

v2: Replace global barrier for swapping over the ppgtt and tlbs with a
local context barrier (Tvrtko)
v3: serialise with struct_mutex; it's lazy but required dammit
v4: Rewrite igt_ctx_shared_exec to be more different (aimed to be more
similar, turned out different!)

v5: Fix up test unwind for aliasing-ppgtt (snb)
v6: Tighten language for uapi struct drm_i915_gem_vm_control.
v7: Patch the context image for runtime ppgtt switching!

Testcase: igt/gem_vm_create
Testcase: igt/gem_ctx_param/vm
Testcase: igt/gem_ctx_clone/vm
Testcase: igt/gem_ctx_shared
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c               |   2 +
 drivers/gpu/drm/i915/i915_drv.h               |   3 +
 drivers/gpu/drm/i915/i915_gem_context.c       | 331 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.h       |   5 +
 drivers/gpu/drm/i915/i915_gem_gtt.c           |  19 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h           |  11 +-
 drivers/gpu/drm/i915/selftests/huge_pages.c   |   1 -
 .../gpu/drm/i915/selftests/i915_gem_context.c | 243 ++++++++++---
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   1 -
 drivers/gpu/drm/i915/selftests/mock_context.c |   8 +-
 include/uapi/drm/i915_drm.h                   |  43 +++
 11 files changed, 586 insertions(+), 81 deletions(-)
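
The new ioctls are used roughly like so (a sketch, assuming the
struct drm_i915_gem_vm_control added by this patch):

	struct drm_i915_gem_vm_control ctl = { };

	ioctl(fd, DRM_IOCTL_I915_GEM_VM_CREATE, &ctl);
	/* ctl.vm_id now names the ppGTT; share it between contexts
	 * via context setparam (see igt/gem_ctx_param/vm) */

	ioctl(fd, DRM_IOCTL_I915_GEM_VM_DESTROY, &ctl);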

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index a3b00ecc58c9..fa991144e0f2 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3121,6 +3121,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_QUERY, i915_query_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_CREATE, i915_gem_vm_create_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_VM_DESTROY, i915_gem_vm_destroy_ioctl, DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 90389333dd47..263a64ddd6d2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -220,6 +220,9 @@ struct drm_i915_file_private {
 	struct idr context_idr;
 	struct mutex context_lock; /* guards context_idr */
 
+	struct mutex vm_lock;
+	struct idr vm_idr;
+
 	unsigned int bsd_engine;
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 94c466d4b29e..c392f7af5546 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -90,6 +90,7 @@
 #include "i915_drv.h"
 #include "i915_globals.h"
 #include "i915_trace.h"
+#include "i915_user_extensions.h"
 #include "intel_lrc_reg.h"
 #include "intel_workarounds.h"
 
@@ -120,12 +121,15 @@ static void lut_close(struct i915_gem_context *ctx)
 		list_del(&lut->obj_link);
 		i915_lut_handle_free(lut);
 	}
+	INIT_LIST_HEAD(&ctx->handles_list);
 
 	rcu_read_lock();
 	radix_tree_for_each_slot(slot, &ctx->handles_vma, &iter, 0) {
 		struct i915_vma *vma = rcu_dereference_raw(*slot);
 
 		radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);
+
+		vma->open_count--;
 		__i915_gem_object_release_unless_active(vma->obj);
 	}
 	rcu_read_unlock();
@@ -305,8 +309,6 @@ static void context_close(struct i915_gem_context *ctx)
 	 * the ppgtt).
 	 */
 	lut_close(ctx);
-	if (ctx->ppgtt)
-		i915_ppgtt_close(&ctx->ppgtt->vm);
 
 	ctx->file_priv = ERR_PTR(-EBADF);
 	i915_gem_context_put(ctx);
@@ -378,6 +380,28 @@ __create_context(struct drm_i915_private *dev_priv)
 	return ctx;
 }
 
+static struct i915_hw_ppgtt *
+__set_ppgtt(struct i915_gem_context *ctx, struct i915_hw_ppgtt *ppgtt)
+{
+	struct i915_hw_ppgtt *old = ctx->ppgtt;
+
+	ctx->ppgtt = i915_ppgtt_get(ppgtt);
+	ctx->desc_template = default_desc_template(ctx->i915, ppgtt);
+
+	return old;
+}
+
+static void __assign_ppgtt(struct i915_gem_context *ctx,
+			   struct i915_hw_ppgtt *ppgtt)
+{
+	if (ppgtt == ctx->ppgtt)
+		return;
+
+	ppgtt = __set_ppgtt(ctx, ppgtt);
+	if (ppgtt)
+		i915_ppgtt_put(ppgtt);
+}
+
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *dev_priv)
 {
@@ -403,8 +427,8 @@ i915_gem_create_context(struct drm_i915_private *dev_priv)
 			return ERR_CAST(ppgtt);
 		}
 
-		ctx->ppgtt = ppgtt;
-		ctx->desc_template = default_desc_template(dev_priv, ppgtt);
+		__assign_ppgtt(ctx, ppgtt);
+		i915_ppgtt_put(ppgtt);
 	}
 
 	trace_i915_context_create(ctx);
@@ -583,6 +607,12 @@ static int context_idr_cleanup(int id, void *p, void *data)
 	return 0;
 }
 
+static int vm_idr_cleanup(int id, void *p, void *data)
+{
+	i915_ppgtt_put(p);
+	return 0;
+}
+
 static int gem_context_register(struct i915_gem_context *ctx,
 				struct drm_i915_file_private *fpriv)
 {
@@ -633,8 +663,11 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	struct i915_gem_context *ctx;
 	int err;
 
-	idr_init(&file_priv->context_idr);
 	mutex_init(&file_priv->context_lock);
+	mutex_init(&file_priv->vm_lock);
+
+	idr_init(&file_priv->context_idr);
+	idr_init_base(&file_priv->vm_idr, 1);
 
 	mutex_lock(&i915->drm.struct_mutex);
 	ctx = i915_gem_create_context(i915);
@@ -658,8 +691,10 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	context_close(ctx);
 	mutex_unlock(&i915->drm.struct_mutex);
 err:
-	mutex_destroy(&file_priv->context_lock);
+	idr_destroy(&file_priv->vm_idr);
 	idr_destroy(&file_priv->context_idr);
+	mutex_destroy(&file_priv->vm_lock);
+	mutex_destroy(&file_priv->context_lock);
 	return PTR_ERR(ctx);
 }
 
@@ -672,6 +707,99 @@ void i915_gem_context_close(struct drm_file *file)
 	idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
 	idr_destroy(&file_priv->context_idr);
 	mutex_destroy(&file_priv->context_lock);
+
+	idr_for_each(&file_priv->vm_idr, vm_idr_cleanup, NULL);
+	idr_destroy(&file_priv->vm_idr);
+	mutex_destroy(&file_priv->vm_lock);
+}
+
+int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *file)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct drm_i915_gem_vm_control *args = data;
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_hw_ppgtt *ppgtt;
+	int err;
+
+	if (!HAS_FULL_PPGTT(i915))
+		return -ENODEV;
+
+	if (args->flags)
+		return -EINVAL;
+
+	ppgtt = i915_ppgtt_create(i915);
+	if (IS_ERR(ppgtt))
+		return PTR_ERR(ppgtt);
+
+	ppgtt->vm.file = file_priv;
+
+	if (args->extensions) {
+		err = i915_user_extensions(u64_to_user_ptr(args->extensions),
+					   NULL, 0,
+					   ppgtt);
+		if (err)
+			goto err_put;
+	}
+
+	err = mutex_lock_interruptible(&file_priv->vm_lock);
+	if (err)
+		goto err_put;
+
+	err = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
+	if (err < 0)
+		goto err_unlock;
+
+	GEM_BUG_ON(err == 0); /* reserved for default/unassigned ppgtt */
+	ppgtt->user_handle = err;
+
+	mutex_unlock(&file_priv->vm_lock);
+
+	args->vm_id = err;
+	return 0;
+
+err_unlock:
+	mutex_unlock(&file_priv->vm_lock);
+err_put:
+	i915_ppgtt_put(ppgtt);
+	return err;
+}
+
+int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data,
+			      struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_vm_control *args = data;
+	struct i915_hw_ppgtt *ppgtt;
+	int err;
+	u32 id;
+
+	if (args->flags)
+		return -EINVAL;
+
+	if (args->extensions)
+		return -EINVAL;
+
+	id = args->vm_id;
+	if (!id)
+		return -ENOENT;
+
+	err = mutex_lock_interruptible(&file_priv->vm_lock);
+	if (err)
+		return err;
+
+	ppgtt = idr_remove(&file_priv->vm_idr, id);
+	if (ppgtt) {
+		GEM_BUG_ON(!ppgtt->user_handle);
+		ppgtt->user_handle = 0;
+	}
+
+	mutex_unlock(&file_priv->vm_lock);
+	if (!ppgtt)
+		return -ENOENT;
+
+	i915_ppgtt_put(ppgtt);
+	return 0;
 }
 
 static struct i915_request *
@@ -715,12 +843,13 @@ static void cb_retire(struct i915_active *base)
 I915_SELFTEST_DECLARE(static unsigned long context_barrier_inject_fault);
 static int context_barrier_task(struct i915_gem_context *ctx,
 				unsigned long engines,
+				int (*emit)(struct i915_request *rq, void *data),
 				void (*task)(void *data),
 				void *data)
 {
 	struct drm_i915_private *i915 = ctx->i915;
 	struct context_barrier_task *cb;
-	struct intel_context *ce;
+	struct intel_context *ce, *next;
 	intel_wakeref_t wakeref;
 	int err = 0;
 
@@ -735,11 +864,11 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 	i915_active_acquire(&cb->base);
 
 	wakeref = intel_runtime_pm_get(i915);
-	list_for_each_entry(ce, &ctx->active_engines, active_link) {
+	rbtree_postorder_for_each_entry_safe(ce, next, &ctx->hw_contexts, node) {
 		struct intel_engine_cs *engine = ce->engine;
 		struct i915_request *rq;
 
-		if (!(ce->engine->mask & engines))
+		if (!(engine->mask & engines))
 			continue;
 
 		if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
@@ -754,7 +883,12 @@ static int context_barrier_task(struct i915_gem_context *ctx,
 			break;
 		}
 
-		err = i915_active_ref(&cb->base, rq->fence.context, rq);
+		err = 0;
+		if (emit)
+			err = emit(rq, data);
+		if (err == 0)
+			err = i915_active_ref(&cb->base, rq->fence.context, rq);
+
 		i915_request_add(rq);
 		if (err)
 			break;
@@ -817,6 +951,170 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *i915,
 	return 0;
 }
 
+static int get_ppgtt(struct i915_gem_context *ctx,
+		     struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_file_private *file_priv = ctx->file_priv;
+	struct i915_hw_ppgtt *ppgtt;
+	int ret;
+
+	if (!ctx->ppgtt)
+		return -ENODEV;
+
+	/* XXX rcu acquire? */
+	ret = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex);
+	if (ret)
+		return ret;
+
+	ppgtt = i915_ppgtt_get(ctx->ppgtt);
+	mutex_unlock(&ctx->i915->drm.struct_mutex);
+
+	ret = mutex_lock_interruptible(&file_priv->vm_lock);
+	if (ret)
+		goto err_put;
+
+	if (!ppgtt->user_handle) {
+		ret = idr_alloc(&file_priv->vm_idr, ppgtt, 0, 0, GFP_KERNEL);
+		GEM_BUG_ON(!ret);
+		if (ret < 0)
+			goto err_unlock;
+
+		ppgtt->user_handle = ret;
+		i915_ppgtt_get(ppgtt);
+	}
+
+	args->size = 0;
+	args->value = ppgtt->user_handle;
+
+	ret = 0;
+err_unlock:
+	mutex_unlock(&file_priv->vm_lock);
+err_put:
+	i915_ppgtt_put(ppgtt);
+	return ret;
+}
+
+static void set_ppgtt_barrier(void *data)
+{
+	struct i915_hw_ppgtt *old = data;
+
+	if (INTEL_GEN(old->vm.i915) < 8)
+		gen6_ppgtt_unpin_all(old);
+
+	i915_ppgtt_put(old);
+}
+
+static int emit_ppgtt_update(struct i915_request *rq, void *data)
+{
+	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
+	struct intel_engine_cs *engine = rq->engine;
+	u32 *cs;
+	int i;
+
+	if (i915_vm_is_4lvl(&ppgtt->vm)) {
+		const dma_addr_t pd_daddr = px_dma(&ppgtt->pml4);
+
+		cs = intel_ring_begin(rq, 6);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		*cs++ = MI_LOAD_REGISTER_IMM(2);
+
+		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 0));
+		*cs++ = upper_32_bits(pd_daddr);
+		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 0));
+		*cs++ = lower_32_bits(pd_daddr);
+
+		*cs++ = MI_NOOP;
+		intel_ring_advance(rq, cs);
+	} else if (HAS_LOGICAL_RING_CONTEXTS(engine->i915)) {
+		cs = intel_ring_begin(rq, 4 * GEN8_3LVL_PDPES + 2);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		*cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES);
+		for (i = GEN8_3LVL_PDPES; i--; ) {
+			const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
+
+			*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
+			*cs++ = upper_32_bits(pd_daddr);
+			*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
+			*cs++ = lower_32_bits(pd_daddr);
+		}
+		*cs++ = MI_NOOP;
+		intel_ring_advance(rq, cs);
+	} else {
+		/* ppGTT is not part of the legacy context image */
+		gen6_ppgtt_pin(ppgtt);
+	}
+
+	return 0;
+}
+
+static int set_ppgtt(struct i915_gem_context *ctx,
+		     struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_file_private *file_priv = ctx->file_priv;
+	struct i915_hw_ppgtt *ppgtt, *old;
+	int err;
+
+	if (args->size)
+		return -EINVAL;
+
+	if (!ctx->ppgtt)
+		return -ENODEV;
+
+	if (upper_32_bits(args->value))
+		return -ENOENT;
+
+	err = mutex_lock_interruptible(&file_priv->vm_lock);
+	if (err)
+		return err;
+
+	ppgtt = idr_find(&file_priv->vm_idr, args->value);
+	if (ppgtt) {
+		GEM_BUG_ON(ppgtt->user_handle != args->value);
+		i915_ppgtt_get(ppgtt);
+	}
+	mutex_unlock(&file_priv->vm_lock);
+	if (!ppgtt)
+		return -ENOENT;
+
+	err = mutex_lock_interruptible(&ctx->i915->drm.struct_mutex);
+	if (err)
+		goto out;
+
+	if (ppgtt == ctx->ppgtt)
+		goto unlock;
+
+	/* Tear down the existing obj:vma cache; it will have to be rebuilt. */
+	lut_close(ctx);
+
+	old = __set_ppgtt(ctx, ppgtt);
+
+	/*
+	 * We need to flush any requests using the current ppgtt before
+	 * we release it as the requests do not hold a reference themselves,
+	 * only indirectly through the context.
+	 */
+	err = context_barrier_task(ctx, ALL_ENGINES,
+				   emit_ppgtt_update,
+				   set_ppgtt_barrier,
+				   old);
+	if (err) {
+		ctx->ppgtt = old;
+		ctx->desc_template = default_desc_template(ctx->i915, old);
+		i915_ppgtt_put(ppgtt);
+	}
+
+unlock:
+	mutex_unlock(&ctx->i915->drm.struct_mutex);
+
+out:
+	i915_ppgtt_put(ppgtt);
+	return err;
+}
+
 static bool client_is_banned(struct drm_i915_file_private *file_priv)
 {
 	return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
@@ -995,6 +1293,9 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_CONTEXT_PARAM_SSEU:
 		ret = get_sseu(ctx, args);
 		break;
+	case I915_CONTEXT_PARAM_VM:
+		ret = get_ppgtt(ctx, args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -1296,9 +1597,6 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	switch (args->param) {
-	case I915_CONTEXT_PARAM_BAN_PERIOD:
-		ret = -EINVAL;
-		break;
 	case I915_CONTEXT_PARAM_NO_ZEROMAP:
 		if (args->size)
 			ret = -EINVAL;
@@ -1354,9 +1652,16 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 					I915_USER_PRIORITY(priority);
 		}
 		break;
+
 	case I915_CONTEXT_PARAM_SSEU:
 		ret = set_sseu(ctx, args);
 		break;
+
+	case I915_CONTEXT_PARAM_VM:
+		ret = set_ppgtt(ctx, args);
+		break;
+
+	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 5a32c4b4816f..1e670372892c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -153,6 +153,11 @@ void i915_gem_context_release(struct kref *ctx_ref);
 struct i915_gem_context *
 i915_gem_context_create_gvt(struct drm_device *dev);
 
+int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *file);
+int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data,
+			      struct drm_file *file);
+
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file);
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b9e0e3a00223..736c845eb77f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1937,6 +1937,8 @@ int gen6_ppgtt_pin(struct i915_hw_ppgtt *base)
 	struct gen6_hw_ppgtt *ppgtt = to_gen6_ppgtt(base);
 	int err;
 
+	GEM_BUG_ON(ppgtt->base.vm.closed);
+
 	/*
 	 * Workaround the limited maximum vma->pin_count and the aliasing_ppgtt
 	 * which will be pinned into every active context.
@@ -1975,6 +1977,17 @@ void gen6_ppgtt_unpin(struct i915_hw_ppgtt *base)
 	i915_vma_unpin(ppgtt->vma);
 }
 
+void gen6_ppgtt_unpin_all(struct i915_hw_ppgtt *base)
+{
+	struct gen6_hw_ppgtt *ppgtt = to_gen6_ppgtt(base);
+
+	if (!ppgtt->pin_count)
+		return;
+
+	ppgtt->pin_count = 0;
+	i915_vma_unpin(ppgtt->vma);
+}
+
 static struct i915_hw_ppgtt *gen6_ppgtt_create(struct drm_i915_private *i915)
 {
 	struct i915_ggtt * const ggtt = &i915->ggtt;
@@ -2082,12 +2095,6 @@ i915_ppgtt_create(struct drm_i915_private *i915)
 	return ppgtt;
 }
 
-void i915_ppgtt_close(struct i915_address_space *vm)
-{
-	GEM_BUG_ON(vm->closed);
-	vm->closed = true;
-}
-
 static void ppgtt_destroy_vma(struct i915_address_space *vm)
 {
 	struct list_head *phases[] = {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index b76ab4c2a0e6..14983fb63c3d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -391,11 +391,14 @@ struct i915_hw_ppgtt {
 	struct kref ref;
 
 	unsigned long pd_dirty_engines;
+
 	union {
 		struct i915_pml4 pml4;		/* GEN8+ & 48b PPGTT */
 		struct i915_page_directory_pointer pdp;	/* GEN8+ */
 		struct i915_page_directory pd;		/* GEN6-7 */
 	};
+
+	u32 user_handle;
 };
 
 struct gen6_hw_ppgtt {
@@ -605,13 +608,12 @@ void i915_ggtt_cleanup_hw(struct drm_i915_private *dev_priv);
 int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv);
 
 struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_i915_private *dev_priv);
-void i915_ppgtt_close(struct i915_address_space *vm);
 void i915_ppgtt_release(struct kref *kref);
 
-static inline void i915_ppgtt_get(struct i915_hw_ppgtt *ppgtt)
+static inline struct i915_hw_ppgtt *i915_ppgtt_get(struct i915_hw_ppgtt *ppgtt)
 {
-	if (ppgtt)
-		kref_get(&ppgtt->ref);
+	kref_get(&ppgtt->ref);
+	return ppgtt;
 }
 
 static inline void i915_ppgtt_put(struct i915_hw_ppgtt *ppgtt)
@@ -622,6 +624,7 @@ static inline void i915_ppgtt_put(struct i915_hw_ppgtt *ppgtt)
 
 int gen6_ppgtt_pin(struct i915_hw_ppgtt *base);
 void gen6_ppgtt_unpin(struct i915_hw_ppgtt *base);
+void gen6_ppgtt_unpin_all(struct i915_hw_ppgtt *base);
 
 void i915_check_and_clear_faults(struct drm_i915_private *dev_priv);
 void i915_gem_suspend_gtt_mappings(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index c5c8ba6c059f..90721b54e7ae 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -1732,7 +1732,6 @@ int i915_gem_huge_page_mock_selftests(void)
 	err = i915_subtests(tests, ppgtt);
 
 out_close:
-	i915_ppgtt_close(&ppgtt->vm);
 	i915_ppgtt_put(ppgtt);
 
 out_unlock:
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 4dc96e28d89f..28c2334abcaf 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -373,7 +373,8 @@ static int cpu_fill(struct drm_i915_gem_object *obj, u32 value)
 	return 0;
 }
 
-static int cpu_check(struct drm_i915_gem_object *obj, unsigned int max)
+static noinline int cpu_check(struct drm_i915_gem_object *obj,
+			      unsigned int idx, unsigned int max)
 {
 	unsigned int n, m, needs_flush;
 	int err;
@@ -391,8 +392,10 @@ static int cpu_check(struct drm_i915_gem_object *obj, unsigned int max)
 
 		for (m = 0; m < max; m++) {
 			if (map[m] != m) {
-				pr_err("Invalid value at page %d, offset %d: found %x expected %x\n",
-				       n, m, map[m], m);
+				pr_err("%pS: Invalid value at object %d page %d/%ld, offset %d/%d: found %x expected %x\n",
+				       __builtin_return_address(0), idx,
+				       n, real_page_count(obj), m, max,
+				       map[m], m);
 				err = -EINVAL;
 				goto out_unmap;
 			}
@@ -400,8 +403,9 @@ static int cpu_check(struct drm_i915_gem_object *obj, unsigned int max)
 
 		for (; m < DW_PER_PAGE; m++) {
 			if (map[m] != STACK_MAGIC) {
-				pr_err("Invalid value at page %d, offset %d: found %x expected %x\n",
-				       n, m, map[m], STACK_MAGIC);
+				pr_err("%pS: Invalid value at object %d page %d, offset %d: found %x expected %x (uninitialised)\n",
+				       __builtin_return_address(0), idx, n, m,
+				       map[m], STACK_MAGIC);
 				err = -EINVAL;
 				goto out_unmap;
 			}
@@ -479,12 +483,8 @@ static unsigned long max_dwords(struct drm_i915_gem_object *obj)
 static int igt_ctx_exec(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
-	struct drm_i915_gem_object *obj = NULL;
-	unsigned long ncontexts, ndwords, dw;
-	struct igt_live_test t;
-	struct drm_file *file;
-	IGT_TIMEOUT(end_time);
-	LIST_HEAD(objects);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 	int err = -ENODEV;
 
 	/*
@@ -496,44 +496,175 @@ static int igt_ctx_exec(void *arg)
 	if (!DRIVER_CAPS(i915)->has_logical_contexts)
 		return 0;
 
+	for_each_engine(engine, i915, id) {
+		struct drm_i915_gem_object *obj = NULL;
+		unsigned long ncontexts, ndwords, dw;
+		struct igt_live_test t;
+		struct drm_file *file;
+		IGT_TIMEOUT(end_time);
+		LIST_HEAD(objects);
+
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
+		if (!engine->context_size)
+			continue; /* No logical context support in HW */
+
+		file = mock_file(i915);
+		if (IS_ERR(file))
+			return PTR_ERR(file);
+
+		mutex_lock(&i915->drm.struct_mutex);
+
+		err = igt_live_test_begin(&t, i915, __func__, engine->name);
+		if (err)
+			goto out_unlock;
+
+		ncontexts = 0;
+		ndwords = 0;
+		dw = 0;
+		while (!time_after(jiffies, end_time)) {
+			struct i915_gem_context *ctx;
+			intel_wakeref_t wakeref;
+
+			ctx = live_context(i915, file);
+			if (IS_ERR(ctx)) {
+				err = PTR_ERR(ctx);
+				goto out_unlock;
+			}
+
+			if (!obj) {
+				obj = create_test_object(ctx, file, &objects);
+				if (IS_ERR(obj)) {
+					err = PTR_ERR(obj);
+					goto out_unlock;
+				}
+			}
+
+			with_intel_runtime_pm(i915, wakeref)
+				err = gpu_fill(obj, ctx, engine, dw);
+			if (err) {
+				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
+				       ndwords, dw, max_dwords(obj),
+				       engine->name, ctx->hw_id,
+				       yesno(!!ctx->ppgtt), err);
+				goto out_unlock;
+			}
+
+			if (++dw == max_dwords(obj)) {
+				obj = NULL;
+				dw = 0;
+			}
+
+			ndwords++;
+			ncontexts++;
+		}
+
+		pr_info("Submitted %lu contexts to %s, filling %lu dwords\n",
+			ncontexts, engine->name, ndwords);
+
+		ncontexts = dw = 0;
+		list_for_each_entry(obj, &objects, st_link) {
+			unsigned int rem =
+				min_t(unsigned int, ndwords - dw, max_dwords(obj));
+
+			err = cpu_check(obj, ncontexts++, rem);
+			if (err)
+				break;
+
+			dw += rem;
+		}
+
+out_unlock:
+		if (igt_live_test_end(&t))
+			err = -EIO;
+		mutex_unlock(&i915->drm.struct_mutex);
+
+		mock_file_free(i915, file);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int igt_shared_ctx_exec(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct i915_gem_context *parent;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	struct igt_live_test t;
+	struct drm_file *file;
+	int err = 0;
+
+	/*
+	 * Create a few different contexts with the same mm and write
+	 * through each ctx using the GPU making sure those writes end
+	 * up in the expected pages of our obj.
+	 */
+	if (!DRIVER_CAPS(i915)->has_logical_contexts)
+		return 0;
+
 	file = mock_file(i915);
 	if (IS_ERR(file))
 		return PTR_ERR(file);
 
 	mutex_lock(&i915->drm.struct_mutex);
 
+	parent = live_context(i915, file);
+	if (IS_ERR(parent)) {
+		err = PTR_ERR(parent);
+		goto out_unlock;
+	}
+
+	if (!parent->ppgtt) { /* not full-ppgtt; nothing to share */
+		err = 0;
+		goto out_unlock;
+	}
+
 	err = igt_live_test_begin(&t, i915, __func__, "");
 	if (err)
 		goto out_unlock;
 
-	ncontexts = 0;
-	ndwords = 0;
-	dw = 0;
-	while (!time_after(jiffies, end_time)) {
-		struct intel_engine_cs *engine;
-		struct i915_gem_context *ctx;
-		unsigned int id;
+	for_each_engine(engine, i915, id) {
+		unsigned long ncontexts, ndwords, dw;
+		struct drm_i915_gem_object *obj = NULL;
+		struct i915_gem_context *ctx = NULL;
+		IGT_TIMEOUT(end_time);
+		LIST_HEAD(objects);
 
-		ctx = live_context(i915, file);
-		if (IS_ERR(ctx)) {
-			err = PTR_ERR(ctx);
-			goto out_unlock;
-		}
+		if (!intel_engine_can_store_dword(engine))
+			continue;
 
-		for_each_engine(engine, i915, id) {
+		dw = 0;
+		ndwords = 0;
+		ncontexts = 0;
+		while (!time_after(jiffies, end_time)) {
 			intel_wakeref_t wakeref;
 
-			if (!engine->context_size)
-				continue; /* No logical context support in HW */
+			if (ctx) {
+				struct drm_i915_file_private *file_priv =
+					file->driver_priv;
 
-			if (!intel_engine_can_store_dword(engine))
-				continue;
+				idr_remove(&file_priv->context_idr,
+					   ctx->user_handle);
+				context_close(ctx);
+			}
+
+			ctx = live_context(i915, file);
+			if (IS_ERR(ctx)) {
+				err = PTR_ERR(ctx);
+				goto out_test;
+			}
+
+			__assign_ppgtt(ctx, parent->ppgtt);
 
 			if (!obj) {
-				obj = create_test_object(ctx, file, &objects);
+				obj = create_test_object(parent, file, &objects);
 				if (IS_ERR(obj)) {
 					err = PTR_ERR(obj);
-					goto out_unlock;
+					goto out_test;
 				}
 			}
 
@@ -545,35 +676,36 @@ static int igt_ctx_exec(void *arg)
 				       ndwords, dw, max_dwords(obj),
 				       engine->name, ctx->hw_id,
 				       yesno(!!ctx->ppgtt), err);
-				goto out_unlock;
+				goto out_test;
 			}
 
 			if (++dw == max_dwords(obj)) {
 				obj = NULL;
 				dw = 0;
 			}
+
 			ndwords++;
+			ncontexts++;
 		}
-		ncontexts++;
-	}
-	pr_info("Submitted %lu contexts (across %u engines), filling %lu dwords\n",
-		ncontexts, RUNTIME_INFO(i915)->num_engines, ndwords);
+		pr_info("Submitted %lu contexts to %s, filling %lu dwords\n",
+			ncontexts, engine->name, ndwords);
 
-	dw = 0;
-	list_for_each_entry(obj, &objects, st_link) {
-		unsigned int rem =
-			min_t(unsigned int, ndwords - dw, max_dwords(obj));
+		ncontexts = dw = 0;
+		list_for_each_entry(obj, &objects, st_link) {
+			unsigned int rem =
+				min_t(unsigned int, ndwords - dw, max_dwords(obj));
 
-		err = cpu_check(obj, rem);
-		if (err)
-			break;
+			err = cpu_check(obj, ncontexts++, rem);
+			if (err)
+				goto out_test;
 
-		dw += rem;
+			dw += rem;
+		}
 	}
-
-out_unlock:
+out_test:
 	if (igt_live_test_end(&t))
 		err = -EIO;
+out_unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	mock_file_free(i915, file);
@@ -1046,7 +1178,7 @@ static int igt_ctx_readonly(void *arg)
 	struct drm_i915_gem_object *obj = NULL;
 	struct i915_gem_context *ctx;
 	struct i915_hw_ppgtt *ppgtt;
-	unsigned long ndwords, dw;
+	unsigned long idx, ndwords, dw;
 	struct igt_live_test t;
 	struct drm_file *file;
 	I915_RND_STATE(prng);
@@ -1127,6 +1259,7 @@ static int igt_ctx_readonly(void *arg)
 		ndwords, RUNTIME_INFO(i915)->num_engines);
 
 	dw = 0;
+	idx = 0;
 	list_for_each_entry(obj, &objects, st_link) {
 		unsigned int rem =
 			min_t(unsigned int, ndwords - dw, max_dwords(obj));
@@ -1136,7 +1269,7 @@ static int igt_ctx_readonly(void *arg)
 		if (i915_gem_object_is_readonly(obj))
 			num_writes = 0;
 
-		err = cpu_check(obj, num_writes);
+		err = cpu_check(obj, idx++, num_writes);
 		if (err)
 			break;
 
@@ -1619,7 +1752,8 @@ static int mock_context_barrier(void *arg)
 	}
 
 	counter = 0;
-	err = context_barrier_task(ctx, 0, mock_barrier_task, &counter);
+	err = context_barrier_task(ctx, 0,
+				   NULL, mock_barrier_task, &counter);
 	if (err) {
 		pr_err("Failed at line %d, err=%d\n", __LINE__, err);
 		goto out;
@@ -1631,8 +1765,8 @@ static int mock_context_barrier(void *arg)
 	}
 
 	counter = 0;
-	err = context_barrier_task(ctx,
-				   ALL_ENGINES, mock_barrier_task, &counter);
+	err = context_barrier_task(ctx, ALL_ENGINES,
+				   NULL, mock_barrier_task, &counter);
 	if (err) {
 		pr_err("Failed at line %d, err=%d\n", __LINE__, err);
 		goto out;
@@ -1655,8 +1789,8 @@ static int mock_context_barrier(void *arg)
 
 	counter = 0;
 	context_barrier_inject_fault = BIT(RCS0);
-	err = context_barrier_task(ctx,
-				   ALL_ENGINES, mock_barrier_task, &counter);
+	err = context_barrier_task(ctx, ALL_ENGINES,
+				   NULL, mock_barrier_task, &counter);
 	context_barrier_inject_fault = 0;
 	if (err == -ENXIO)
 		err = 0;
@@ -1670,8 +1804,8 @@ static int mock_context_barrier(void *arg)
 		goto out;
 
 	counter = 0;
-	err = context_barrier_task(ctx,
-				   ALL_ENGINES, mock_barrier_task, &counter);
+	err = context_barrier_task(ctx, ALL_ENGINES,
+				   NULL, mock_barrier_task, &counter);
 	if (err) {
 		pr_err("Failed at line %d, err=%d\n", __LINE__, err);
 		goto out;
@@ -1719,6 +1853,7 @@ int i915_gem_context_live_selftests(struct drm_i915_private *dev_priv)
 		SUBTEST(igt_ctx_exec),
 		SUBTEST(igt_ctx_readonly),
 		SUBTEST(igt_ctx_sseu),
+		SUBTEST(igt_shared_ctx_exec),
 		SUBTEST(igt_vm_isolation),
 	};
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 01084f6b4fb7..9cca66e4420a 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -1020,7 +1020,6 @@ static int exercise_ppgtt(struct drm_i915_private *dev_priv,
 
 	err = func(dev_priv, &ppgtt->vm, 0, ppgtt->vm.total, end_time);
 
-	i915_ppgtt_close(&ppgtt->vm);
 	i915_ppgtt_put(ppgtt);
 out_unlock:
 	mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index 1cc8be732435..cfc9012c8e49 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -54,13 +54,17 @@ mock_context(struct drm_i915_private *i915,
 		goto err_handles;
 
 	if (name) {
+		struct i915_hw_ppgtt *ppgtt;
+
 		ctx->name = kstrdup(name, GFP_KERNEL);
 		if (!ctx->name)
 			goto err_put;
 
-		ctx->ppgtt = mock_ppgtt(i915, name);
-		if (!ctx->ppgtt)
+		ppgtt = mock_ppgtt(i915, name);
+		if (!ppgtt)
 			goto err_put;
+
+		__set_ppgtt(ctx, ppgtt);
 	}
 
 	return ctx;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 1c69ed16a923..9af7a8e6a46e 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -343,6 +343,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_ADD_CONFIG	0x37
 #define DRM_I915_PERF_REMOVE_CONFIG	0x38
 #define DRM_I915_QUERY			0x39
+#define DRM_I915_GEM_VM_CREATE		0x3a
+#define DRM_I915_GEM_VM_DESTROY		0x3b
 /* Must be kept compact -- no holes */
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
@@ -402,6 +404,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 #define DRM_IOCTL_I915_QUERY			DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1453,6 +1457,33 @@ struct drm_i915_gem_context_destroy {
 	__u32 pad;
 };
 
+/*
+ * DRM_I915_GEM_VM_CREATE -
+ *
+ * Create a new virtual memory address space (ppGTT) for use within a context
+ * on the same file. Extensions can be provided to configure exactly how the
+ * address space is set up upon creation.
+ *
+ * The id of the new VM (bound to the fd) for use with I915_CONTEXT_PARAM_VM
+ * is returned in the outparam @vm_id.
+ *
+ * No flags are currently defined; all bits are reserved and must be zero.
+ *
+ * An extension chain may be provided, starting with @extensions, and
+ * terminated by @next_extension being 0. Currently, no extensions are defined.
+ *
+ * DRM_I915_GEM_VM_DESTROY -
+ *
+ * Destroys a previously created VM id, specified in @vm_id.
+ *
+ * No extensions or flags are currently allowed, and so must be zero.
+ */
+struct drm_i915_gem_vm_control {
+	__u64 extensions;
+	__u32 flags;
+	__u32 vm_id;
+};
+
 struct drm_i915_reg_read {
 	/*
 	 * Register offset.
@@ -1542,7 +1573,19 @@ struct drm_i915_gem_context_param {
  * On creation, all new contexts are marked as recoverable.
  */
 #define I915_CONTEXT_PARAM_RECOVERABLE	0x8
+
+	/*
+	 * The id of the associated virtual memory address space (ppGTT) of
+	 * this context. Can be retrieved and passed to another context
+	 * (on the same fd) for both to use the same ppGTT and so share
+	 * address layouts, and avoid reloading the page tables on context
+	 * switches between themselves.
+	 *
+	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
+	 */
+#define I915_CONTEXT_PARAM_VM		0x9
 /* Must be kept compact -- no holes and well documented */
+
 	__u64 value;
 };
 
-- 
2.20.1


* [PATCH 14/22] drm/i915: Extend CONTEXT_CREATE to set parameters upon construction
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (11 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 13/22] drm/i915: Create/destroy VM (ppGTT) for use with contexts Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18  9:51 ` [PATCH 15/22] drm/i915: Allow contexts to share a single timeline across all engines Chris Wilson
                   ` (10 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

It can be useful to have a single ioctl to create a context with all
the initial parameters instead of a series of create + setparam + setparam
ioctls. This extension to context creation allows any of the parameters
to be passed in as a linked list to be applied to the newly constructed
context.

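As an illustrative sketch (not part of the patch itself), a chained
setparam could be supplied at creation as below. The struct, flag and
ioctl names come from this patch and its uapi changes; the helper, the
chosen priority value and the omitted error handling are assumptions:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Create a context and raise its priority in a single ioctl. */
static uint32_t create_boosted_context(int fd)
{
	struct drm_i915_gem_context_create_ext_setparam p_prio = {
		.base = {
			.next_extension = 0, /* end of the chain */
			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
		},
		.param = {
			/* ctx_id must be left zero inside the extension */
			.param = I915_CONTEXT_PARAM_PRIORITY,
			.value = 512,
		},
	};
	struct drm_i915_gem_context_create_ext create = {
		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
		.extensions = (uint64_t)(uintptr_t)&p_prio,
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create);
	return create.ctx_id;
}
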
v2: Make a local copy of user setparam (Tvrtko)
v3: Use flags to detect availability of extension interface

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c         |   2 +-
 drivers/gpu/drm/i915/i915_gem_context.c | 454 +++++++++++++-----------
 include/uapi/drm/i915_drm.h             | 180 +++++-----
 3 files changed, 353 insertions(+), 283 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index fa991144e0f2..9a0fa3b21e9d 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -3110,7 +3110,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_SET_SPRITE_COLORKEY, intel_sprite_set_colorkey_ioctl, DRM_MASTER),
 	DRM_IOCTL_DEF_DRV(I915_GET_SPRITE_COLORKEY, drm_noop, DRM_MASTER),
 	DRM_IOCTL_DEF_DRV(I915_GEM_WAIT, i915_gem_wait_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
-	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE_EXT, i915_gem_context_create_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_gem_context_reset_stats_ioctl, DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c392f7af5546..0d72e0cadde8 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1115,196 +1115,6 @@ static int set_ppgtt(struct i915_gem_context *ctx,
 	return err;
 }
 
-static bool client_is_banned(struct drm_i915_file_private *file_priv)
-{
-	return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
-}
-
-int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
-				  struct drm_file *file)
-{
-	struct drm_i915_private *i915 = to_i915(dev);
-	struct drm_i915_gem_context_create *args = data;
-	struct drm_i915_file_private *file_priv = file->driver_priv;
-	struct i915_gem_context *ctx;
-	int ret;
-
-	if (!DRIVER_CAPS(i915)->has_logical_contexts)
-		return -ENODEV;
-
-	if (args->pad != 0)
-		return -EINVAL;
-
-	ret = i915_terminally_wedged(i915);
-	if (ret)
-		return ret;
-
-	if (client_is_banned(file_priv)) {
-		DRM_DEBUG("client %s[%d] banned from creating ctx\n",
-			  current->comm,
-			  pid_nr(get_task_pid(current, PIDTYPE_PID)));
-
-		return -EIO;
-	}
-
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
-
-	ctx = i915_gem_create_context(i915);
-	mutex_unlock(&dev->struct_mutex);
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
-
-	ret = gem_context_register(ctx, file_priv);
-	if (ret)
-		goto err_ctx;
-
-	args->ctx_id = ctx->user_handle;
-	DRM_DEBUG("HW context %d created\n", args->ctx_id);
-
-	return 0;
-
-err_ctx:
-	mutex_lock(&dev->struct_mutex);
-	context_close(ctx);
-	mutex_unlock(&dev->struct_mutex);
-	return ret;
-}
-
-int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
-				   struct drm_file *file)
-{
-	struct drm_i915_gem_context_destroy *args = data;
-	struct drm_i915_file_private *file_priv = file->driver_priv;
-	struct i915_gem_context *ctx;
-
-	if (args->pad != 0)
-		return -EINVAL;
-
-	if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
-		return -ENOENT;
-
-	mutex_lock(&file_priv->context_lock);
-	ctx = idr_remove(&file_priv->context_idr, args->ctx_id);
-	mutex_lock(&file_priv->context_lock);
-	if (!ctx)
-		return -ENOENT;
-
-	mutex_lock(&dev->struct_mutex);
-	context_close(ctx);
-	mutex_unlock(&dev->struct_mutex);
-
-	return 0;
-}
-
-static int get_sseu(struct i915_gem_context *ctx,
-		    struct drm_i915_gem_context_param *args)
-{
-	struct drm_i915_gem_context_param_sseu user_sseu;
-	struct intel_engine_cs *engine;
-	struct intel_context *ce;
-
-	if (args->size == 0)
-		goto out;
-	else if (args->size < sizeof(user_sseu))
-		return -EINVAL;
-
-	if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value),
-			   sizeof(user_sseu)))
-		return -EFAULT;
-
-	if (user_sseu.flags || user_sseu.rsvd)
-		return -EINVAL;
-
-	engine = intel_engine_lookup_user(ctx->i915,
-					  user_sseu.engine_class,
-					  user_sseu.engine_instance);
-	if (!engine)
-		return -EINVAL;
-
-	ce = intel_context_pin_lock(ctx, engine); /* serialises with set_sseu */
-	if (IS_ERR(ce))
-		return PTR_ERR(ce);
-
-	user_sseu.slice_mask = ce->sseu.slice_mask;
-	user_sseu.subslice_mask = ce->sseu.subslice_mask;
-	user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice;
-	user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice;
-
-	intel_context_pin_unlock(ce);
-
-	if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu,
-			 sizeof(user_sseu)))
-		return -EFAULT;
-
-out:
-	args->size = sizeof(user_sseu);
-
-	return 0;
-}
-
-int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
-				    struct drm_file *file)
-{
-	struct drm_i915_file_private *file_priv = file->driver_priv;
-	struct drm_i915_gem_context_param *args = data;
-	struct i915_gem_context *ctx;
-	int ret = 0;
-
-	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (!ctx)
-		return -ENOENT;
-
-	switch (args->param) {
-	case I915_CONTEXT_PARAM_BAN_PERIOD:
-		ret = -EINVAL;
-		break;
-	case I915_CONTEXT_PARAM_NO_ZEROMAP:
-		args->size = 0;
-		args->value = test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
-		break;
-	case I915_CONTEXT_PARAM_GTT_SIZE:
-		args->size = 0;
-
-		if (ctx->ppgtt)
-			args->value = ctx->ppgtt->vm.total;
-		else if (to_i915(dev)->mm.aliasing_ppgtt)
-			args->value = to_i915(dev)->mm.aliasing_ppgtt->vm.total;
-		else
-			args->value = to_i915(dev)->ggtt.vm.total;
-		break;
-	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
-		args->size = 0;
-		args->value = i915_gem_context_no_error_capture(ctx);
-		break;
-	case I915_CONTEXT_PARAM_BANNABLE:
-		args->size = 0;
-		args->value = i915_gem_context_is_bannable(ctx);
-		break;
-	case I915_CONTEXT_PARAM_RECOVERABLE:
-		args->size = 0;
-		args->value = i915_gem_context_is_recoverable(ctx);
-		break;
-	case I915_CONTEXT_PARAM_PRIORITY:
-		args->size = 0;
-		args->value = ctx->sched.priority >> I915_USER_PRIORITY_SHIFT;
-		break;
-	case I915_CONTEXT_PARAM_SSEU:
-		ret = get_sseu(ctx, args);
-		break;
-	case I915_CONTEXT_PARAM_VM:
-		ret = get_ppgtt(ctx, args);
-		break;
-	default:
-		ret = -EINVAL;
-		break;
-	}
-
-	i915_gem_context_put(ctx);
-	return ret;
-}
-
 static int gen8_emit_rpcs_config(struct i915_request *rq,
 				 struct intel_context *ce,
 				 struct intel_sseu sseu)
@@ -1584,18 +1394,11 @@ static int set_sseu(struct i915_gem_context *ctx,
 	return 0;
 }
 
-int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
-				    struct drm_file *file)
+static int ctx_setparam(struct i915_gem_context *ctx,
+			struct drm_i915_gem_context_param *args)
 {
-	struct drm_i915_file_private *file_priv = file->driver_priv;
-	struct drm_i915_gem_context_param *args = data;
-	struct i915_gem_context *ctx;
 	int ret = 0;
 
-	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (!ctx)
-		return -ENOENT;
-
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_NO_ZEROMAP:
 		if (args->size)
@@ -1605,6 +1408,7 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		else
 			clear_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
 		break;
+
 	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
 		if (args->size)
 			ret = -EINVAL;
@@ -1613,6 +1417,7 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		else
 			i915_gem_context_clear_no_error_capture(ctx);
 		break;
+
 	case I915_CONTEXT_PARAM_BANNABLE:
 		if (args->size)
 			ret = -EINVAL;
@@ -1639,7 +1444,7 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 
 			if (args->size)
 				ret = -EINVAL;
-			else if (!(to_i915(dev)->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
+			else if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
 				ret = -ENODEV;
 			else if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
 				 priority < I915_CONTEXT_MIN_USER_PRIORITY)
@@ -1667,6 +1472,255 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 		break;
 	}
 
+	return ret;
+}
+
+struct create_ext {
+	struct i915_gem_context *ctx;
+	struct drm_i915_file_private *fpriv;
+};
+
+static int create_setparam(struct i915_user_extension __user *ext, void *data)
+{
+	struct drm_i915_gem_context_create_ext_setparam local;
+	const struct create_ext *arg = data;
+
+	if (copy_from_user(&local, ext, sizeof(local)))
+		return -EFAULT;
+
+	if (local.param.ctx_id)
+		return -EINVAL;
+
+	return ctx_setparam(arg->ctx, &local.param);
+}
+
+static const i915_user_extension_fn create_extensions[] = {
+	[I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
+};
+
+static bool client_is_banned(struct drm_i915_file_private *file_priv)
+{
+	return atomic_read(&file_priv->ban_score) >= I915_CLIENT_SCORE_BANNED;
+}
+
+int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
+				  struct drm_file *file)
+{
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct drm_i915_gem_context_create_ext *args = data;
+	struct create_ext ext_data;
+	int ret;
+
+	if (!DRIVER_CAPS(i915)->has_logical_contexts)
+		return -ENODEV;
+
+	if (args->flags & I915_CONTEXT_CREATE_FLAGS_UNKNOWN)
+		return -EINVAL;
+
+	ret = i915_terminally_wedged(i915);
+	if (ret)
+		return ret;
+
+	ext_data.fpriv = file->driver_priv;
+	if (client_is_banned(ext_data.fpriv)) {
+		DRM_DEBUG("client %s[%d] banned from creating ctx\n",
+			  current->comm,
+			  pid_nr(get_task_pid(current, PIDTYPE_PID)));
+		return -EIO;
+	}
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		return ret;
+
+	ext_data.ctx = i915_gem_create_context(i915);
+	mutex_unlock(&dev->struct_mutex);
+	if (IS_ERR(ext_data.ctx))
+		return PTR_ERR(ext_data.ctx);
+
+	if (args->flags & I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS) {
+		ret = i915_user_extensions(u64_to_user_ptr(args->extensions),
+					   create_extensions,
+					   ARRAY_SIZE(create_extensions),
+					   &ext_data);
+		if (ret)
+			goto err_ctx;
+	}
+
+	ret = gem_context_register(ext_data.ctx, ext_data.fpriv);
+	if (ret)
+		goto err_ctx;
+
+	args->ctx_id = ext_data.ctx->user_handle;
+	DRM_DEBUG("HW context %d created\n", args->ctx_id);
+
+	return 0;
+
+err_ctx:
+	mutex_lock(&dev->struct_mutex);
+	context_close(ext_data.ctx);
+	mutex_unlock(&dev->struct_mutex);
+	return ret;
+}
+
+int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
+				   struct drm_file *file)
+{
+	struct drm_i915_gem_context_destroy *args = data;
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_gem_context *ctx;
+	int ret;
+
+	if (args->pad != 0)
+		return -EINVAL;
+
+	if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
+		return -ENOENT;
+
+	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
+	if (!ctx)
+		return -ENOENT;
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		goto out;
+
+	idr_remove(&file_priv->context_idr, ctx->user_handle);
+	context_close(ctx);
+
+	mutex_unlock(&dev->struct_mutex);
+
+out:
+	i915_gem_context_put(ctx);
+	return ret;
+}
+
+static int get_sseu(struct i915_gem_context *ctx,
+		    struct drm_i915_gem_context_param *args)
+{
+	struct drm_i915_gem_context_param_sseu user_sseu;
+	struct intel_engine_cs *engine;
+	struct intel_context *ce;
+
+	if (args->size == 0)
+		goto out;
+	else if (args->size < sizeof(user_sseu))
+		return -EINVAL;
+
+	if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value),
+			   sizeof(user_sseu)))
+		return -EFAULT;
+
+	if (user_sseu.flags || user_sseu.rsvd)
+		return -EINVAL;
+
+	engine = intel_engine_lookup_user(ctx->i915,
+					  user_sseu.engine_class,
+					  user_sseu.engine_instance);
+	if (!engine)
+		return -EINVAL;
+
+	ce = intel_context_pin_lock(ctx, engine); /* serialises with set_sseu */
+	if (IS_ERR(ce))
+		return PTR_ERR(ce);
+
+	user_sseu.slice_mask = ce->sseu.slice_mask;
+	user_sseu.subslice_mask = ce->sseu.subslice_mask;
+	user_sseu.min_eus_per_subslice = ce->sseu.min_eus_per_subslice;
+	user_sseu.max_eus_per_subslice = ce->sseu.max_eus_per_subslice;
+
+	intel_context_pin_unlock(ce);
+
+	if (copy_to_user(u64_to_user_ptr(args->value), &user_sseu,
+			 sizeof(user_sseu)))
+		return -EFAULT;
+
+out:
+	args->size = sizeof(user_sseu);
+
+	return 0;
+}
+
+int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
+				    struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_context_param *args = data;
+	struct i915_gem_context *ctx;
+	int ret = 0;
+
+	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
+	if (!ctx)
+		return -ENOENT;
+
+	switch (args->param) {
+	case I915_CONTEXT_PARAM_NO_ZEROMAP:
+		args->size = 0;
+		args->value = test_bit(UCONTEXT_NO_ZEROMAP, &ctx->user_flags);
+		break;
+
+	case I915_CONTEXT_PARAM_GTT_SIZE:
+		args->size = 0;
+		if (ctx->ppgtt)
+			args->value = ctx->ppgtt->vm.total;
+		else if (to_i915(dev)->mm.aliasing_ppgtt)
+			args->value = to_i915(dev)->mm.aliasing_ppgtt->vm.total;
+		else
+			args->value = to_i915(dev)->ggtt.vm.total;
+		break;
+
+	case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
+		args->size = 0;
+		args->value = i915_gem_context_no_error_capture(ctx);
+		break;
+
+	case I915_CONTEXT_PARAM_BANNABLE:
+		args->size = 0;
+		args->value = i915_gem_context_is_bannable(ctx);
+		break;
+
+	case I915_CONTEXT_PARAM_RECOVERABLE:
+		args->size = 0;
+		args->value = i915_gem_context_is_recoverable(ctx);
+		break;
+
+	case I915_CONTEXT_PARAM_PRIORITY:
+		args->size = 0;
+		args->value = ctx->sched.priority >> I915_USER_PRIORITY_SHIFT;
+		break;
+
+	case I915_CONTEXT_PARAM_SSEU:
+		ret = get_sseu(ctx, args);
+		break;
+
+	case I915_CONTEXT_PARAM_VM:
+		ret = get_ppgtt(ctx, args);
+		break;
+
+	case I915_CONTEXT_PARAM_BAN_PERIOD:
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
+	i915_gem_context_put(ctx);
+	return ret;
+}
+
+int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
+				    struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_context_param *args = data;
+	struct i915_gem_context *ctx;
+	int ret;
+
+	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
+	if (!ctx)
+		return -ENOENT;
+
+	ret = ctx_setparam(ctx, args);
+
 	i915_gem_context_put(ctx);
 	return ret;
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 9af7a8e6a46e..d45b79746fc4 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -394,6 +394,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GET_SPRITE_COLORKEY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GET_SPRITE_COLORKEY, struct drm_intel_sprite_colorkey)
 #define DRM_IOCTL_I915_GEM_WAIT		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_WAIT, struct drm_i915_gem_wait)
 #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
+#define DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create_ext)
 #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
 #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
 #define DRM_IOCTL_I915_GET_RESET_STATS		DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats)
@@ -1447,92 +1448,17 @@ struct drm_i915_gem_wait {
 };
 
 struct drm_i915_gem_context_create {
-	/*  output: id of new context*/
-	__u32 ctx_id;
-	__u32 pad;
-};
-
-struct drm_i915_gem_context_destroy {
-	__u32 ctx_id;
-	__u32 pad;
-};
-
-/*
- * DRM_I915_GEM_VM_CREATE -
- *
- * Create a new virtual memory address space (ppGTT) for use within a context
- * on the same file. Extensions can be provided to configure exactly how the
- * address space is set up upon creation.
- *
- * The id of the new VM (bound to the fd) for use with I915_CONTEXT_PARAM_VM
- * is returned in the outparam @vm_id.
- *
- * No flags are currently defined; all bits are reserved and must be zero.
- *
- * An extension chain may be provided, starting with @extensions, and
- * terminated by @next_extension being 0. Currently, no extensions are defined.
- *
- * DRM_I915_GEM_VM_DESTROY -
- *
- * Destroys a previously created VM id, specified in @vm_id.
- *
- * No extensions or flags are currently allowed, and so must be zero.
- */
-struct drm_i915_gem_vm_control {
-	__u64 extensions;
-	__u32 flags;
-	__u32 vm_id;
-};
-
-struct drm_i915_reg_read {
-	/*
-	 * Register offset.
-	 * For 64bit wide registers where the upper 32bits don't immediately
-	 * follow the lower 32bits, the offset of the lower 32bits must
-	 * be specified
-	 */
-	__u64 offset;
-#define I915_REG_READ_8B_WA (1ul << 0)
-
-	__u64 val; /* Return value */
-};
-/* Known registers:
- *
- * Render engine timestamp - 0x2358 + 64bit - gen7+
- * - Note this register returns an invalid value if using the default
- *   single instruction 8byte read, in order to workaround that pass
- *   flag I915_REG_READ_8B_WA in offset field.
- *
- */
-
-struct drm_i915_reset_stats {
-	__u32 ctx_id;
-	__u32 flags;
-
-	/* All resets since boot/module reload, for all contexts */
-	__u32 reset_count;
-
-	/* Number of batches lost when active in GPU, for this context */
-	__u32 batch_active;
-
-	/* Number of batches lost pending for execution, for this context */
-	__u32 batch_pending;
-
+	__u32 ctx_id; /* output: id of new context*/
 	__u32 pad;
 };
 
-struct drm_i915_gem_userptr {
-	__u64 user_ptr;
-	__u64 user_size;
+struct drm_i915_gem_context_create_ext {
+	__u32 ctx_id; /* output: id of new context*/
 	__u32 flags;
-#define I915_USERPTR_READ_ONLY 0x1
-#define I915_USERPTR_UNSYNCHRONIZED 0x80000000
-	/**
-	 * Returned handle for the object.
-	 *
-	 * Object handles are nonzero.
-	 */
-	__u32 handle;
+#define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS	(1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
+	(-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+	__u64 extensions;
 };
 
 struct drm_i915_gem_context_param {
@@ -1648,6 +1574,96 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+struct drm_i915_gem_context_create_ext_setparam {
+#define I915_CONTEXT_CREATE_EXT_SETPARAM 0
+	struct i915_user_extension base;
+	struct drm_i915_gem_context_param param;
+};
+
+struct drm_i915_gem_context_destroy {
+	__u32 ctx_id;
+	__u32 pad;
+};
+
+/*
+ * DRM_I915_GEM_VM_CREATE -
+ *
+ * Create a new virtual memory address space (ppGTT) for use within a context
+ * on the same file. Extensions can be provided to configure exactly how the
+ * address space is set up upon creation.
+ *
+ * The id of the new VM (bound to the fd) for use with I915_CONTEXT_PARAM_VM
+ * is returned in the outparam @vm_id.
+ *
+ * No flags are currently defined; all bits are reserved and must be zero.
+ *
+ * An extension chain may be provided, starting with @extensions, and
+ * terminated by @next_extension being 0. Currently, no extensions are defined.
+ *
+ * DRM_I915_GEM_VM_DESTROY -
+ *
+ * Destroys a previously created VM id, specified in @vm_id.
+ *
+ * No extensions or flags are currently allowed, and so must be zero.
+ */
+struct drm_i915_gem_vm_control {
+	__u64 extensions;
+	__u32 flags;
+	__u32 vm_id;
+};
+
+struct drm_i915_reg_read {
+	/*
+	 * Register offset.
+	 * For 64bit wide registers where the upper 32bits don't immediately
+	 * follow the lower 32bits, the offset of the lower 32bits must
+	 * be specified
+	 */
+	__u64 offset;
+#define I915_REG_READ_8B_WA (1ul << 0)
+
+	__u64 val; /* Return value */
+};
+
+/* Known registers:
+ *
+ * Render engine timestamp - 0x2358 + 64bit - gen7+
+ * - Note this register returns an invalid value if using the default
+ *   single instruction 8byte read, in order to workaround that pass
+ *   flag I915_REG_READ_8B_WA in offset field.
+ *
+ */
+
+struct drm_i915_reset_stats {
+	__u32 ctx_id;
+	__u32 flags;
+
+	/* All resets since boot/module reload, for all contexts */
+	__u32 reset_count;
+
+	/* Number of batches lost when active in GPU, for this context */
+	__u32 batch_active;
+
+	/* Number of batches lost pending for execution, for this context */
+	__u32 batch_pending;
+
+	__u32 pad;
+};
+
+struct drm_i915_gem_userptr {
+	__u64 user_ptr;
+	__u64 user_size;
+	__u32 flags;
+#define I915_USERPTR_READ_ONLY 0x1
+#define I915_USERPTR_UNSYNCHRONIZED 0x80000000
+	/**
+	 * Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+	 */
+	__u32 handle;
+};
+
 enum drm_i915_oa_format {
 	I915_OA_FORMAT_A13 = 1,	    /* HSW only */
 	I915_OA_FORMAT_A29,	    /* HSW only */
-- 
2.20.1


* [PATCH 15/22] drm/i915: Allow contexts to share a single timeline across all engines
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (12 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 14/22] drm/i915: Extend CONTEXT_CREATE to set parameters upon construction Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18  9:51 ` [PATCH 16/22] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

Previously, our view has been always to run the engines independently
within a context. (Multiple engines happened before we had contexts and
timelines, so they always operated independently and that behaviour
persisted into contexts.) However, at the user level the context often
represents a single timeline (e.g. GL contexts) and userspace must
ensure that the individual engines are serialised to present that
ordering to the client (or forgot about this detail entirely and hope no
one notices - a fair ploy if the client can only directly control one
engine themselves ;)

In the next patch, we will want to construct a set of engines that
operate as one, that have a single timeline interwoven between them, to
present a single virtual engine to the user. (They submit to the virtual
engine, then we decide which engine to execute on based on load.)

To that end, we want to be able to create contexts which have a single
timeline (fence context) shared between all engines, rather than multiple
timelines.

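As an illustrative sketch (not part of the patch itself), the new flag
would be requested at context creation. The flag comes from this patch,
and the create_ext struct and ioctl from the previous patch in the
series; the helper and the omitted error handling are assumptions:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Create a context whose engines share one timeline: batches submitted
 * to any of its engines then execute in overall submission order.
 */
static uint32_t create_single_timeline_context(int fd)
{
	struct drm_i915_gem_context_create_ext create = {
		.flags = I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE,
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create);
	return create.ctx_id;
}
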
v2: Move the specialised timeline ordering to its own function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c       | 31 +++++--
 drivers/gpu/drm/i915/i915_gem_context_types.h |  2 +
 drivers/gpu/drm/i915/i915_request.c           | 80 +++++++++++++------
 drivers/gpu/drm/i915/i915_request.h           |  5 +-
 drivers/gpu/drm/i915/i915_sw_fence.c          | 39 +++++++--
 drivers/gpu/drm/i915/i915_sw_fence.h          | 13 ++-
 drivers/gpu/drm/i915/intel_lrc.c              |  5 +-
 drivers/gpu/drm/i915/selftests/mock_context.c |  2 +-
 include/uapi/drm/i915_drm.h                   |  3 +-
 9 files changed, 138 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 0d72e0cadde8..57b4e760fa4b 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -238,6 +238,9 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
 		intel_context_put(it);
 
+	if (ctx->timeline)
+		i915_timeline_put(ctx->timeline);
+
 	kfree(ctx->name);
 	put_pid(ctx->pid);
 
@@ -403,12 +406,16 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
 }
 
 static struct i915_gem_context *
-i915_gem_create_context(struct drm_i915_private *dev_priv)
+i915_gem_create_context(struct drm_i915_private *dev_priv, unsigned int flags)
 {
 	struct i915_gem_context *ctx;
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
+	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
+	    !HAS_EXECLISTS(dev_priv))
+		return ERR_PTR(-EINVAL);
+
 	/* Reap the most stale context */
 	contexts_free_first(dev_priv);
 
@@ -431,6 +438,18 @@ i915_gem_create_context(struct drm_i915_private *dev_priv)
 		i915_ppgtt_put(ppgtt);
 	}
 
+	if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
+		struct i915_timeline *timeline;
+
+		timeline = i915_timeline_create(dev_priv, ctx->name, NULL);
+		if (IS_ERR(timeline)) {
+			context_close(ctx);
+			return ERR_CAST(timeline);
+		}
+
+		ctx->timeline = timeline;
+	}
+
 	trace_i915_context_create(ctx);
 
 	return ctx;
@@ -459,7 +478,7 @@ i915_gem_context_create_gvt(struct drm_device *dev)
 	if (ret)
 		return ERR_PTR(ret);
 
-	ctx = i915_gem_create_context(to_i915(dev));
+	ctx = i915_gem_create_context(to_i915(dev), 0);
 	if (IS_ERR(ctx))
 		goto out;
 
@@ -495,7 +514,7 @@ i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio)
 	struct i915_gem_context *ctx;
 	int err;
 
-	ctx = i915_gem_create_context(i915);
+	ctx = i915_gem_create_context(i915, 0);
 	if (IS_ERR(ctx))
 		return ctx;
 
@@ -670,7 +689,7 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	idr_init_base(&file_priv->vm_idr, 1);
 
 	mutex_lock(&i915->drm.struct_mutex);
-	ctx = i915_gem_create_context(i915);
+	ctx = i915_gem_create_context(i915, 0);
 	mutex_unlock(&i915->drm.struct_mutex);
 	if (IS_ERR(ctx)) {
 		err = PTR_ERR(ctx);
@@ -812,7 +831,7 @@ last_request_on_engine(struct i915_timeline *timeline,
 
 	rq = i915_active_request_raw(&timeline->last_request,
 				     &engine->i915->drm.struct_mutex);
-	if (rq && rq->engine == engine) {
+	if (rq && rq->engine->mask & engine->mask) {
 		GEM_TRACE("last request for %s on engine %s: %llx:%llu\n",
 			  timeline->name, engine->name,
 			  rq->fence.context, rq->fence.seqno);
@@ -1533,7 +1552,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
-	ext_data.ctx = i915_gem_create_context(i915);
+	ext_data.ctx = i915_gem_create_context(i915, args->flags);
 	mutex_unlock(&dev->struct_mutex);
 	if (IS_ERR(ext_data.ctx))
 		return PTR_ERR(ext_data.ctx);
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index 2bf19730eaa9..f8f6e6c960a7 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -41,6 +41,8 @@ struct i915_gem_context {
 	/** file_priv: owning file descriptor */
 	struct drm_i915_file_private *file_priv;
 
+	struct i915_timeline *timeline;
+
 	/**
 	 * @ppgtt: unique address space (GTT)
 	 *
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0a3d94517d0a..2382339172b4 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -993,6 +993,60 @@ void i915_request_skip(struct i915_request *rq, int error)
 	memset(vaddr + head, 0, rq->postfix - head);
 }
 
+static struct i915_request *
+__i915_request_add_to_timeline(struct i915_request *rq)
+{
+	struct i915_timeline *timeline = rq->timeline;
+	struct i915_request *prev;
+
+	/*
+	 * Dependency tracking and request ordering along the timeline
+	 * is special cased so that we can eliminate redundant ordering
+	 * operations while building the request (we know that the timeline
+	 * itself is ordered, and here we guarantee it).
+	 *
+	 * As we know we will need to emit tracking along the timeline,
+	 * we embed the hooks into our request struct -- at the cost of
+	 * having to have specialised no-allocation interfaces (which will
+	 * be beneficial elsewhere).
+	 *
+	 * A second benefit to open-coding i915_request_await_request is
+	 * that we can apply a slight variant of the rules specialised
+	 * for timelines that jump between engines (such as virtual engines).
+	 * If we consider the case of a virtual engine, we must emit a
+	 * dma-fence to prevent scheduling of the second request until the
+	 * first is complete (to maximise our greedy late load balancing) and
+	 * this precludes optimising to use semaphores for serialisation of a
+	 * single timeline across engines.
+	 */
+	prev = i915_active_request_raw(&timeline->last_request,
+				       &rq->i915->drm.struct_mutex);
+	if (prev && !i915_request_completed(prev)) {
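+		/*
+		 * If prev and rq execute on the same (single) physical
+		 * engine, the union of their engine masks is a single bit
+		 * and submission order along the timeline suffices, so the
+		 * cheap sw-fence is used. If the union spans engines (e.g.
+		 * a virtual engine), we must wait for prev to complete
+		 * before rq may be submitted anywhere, hence the dma-fence.
+		 */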
+		if (is_power_of_2(prev->engine->mask | rq->engine->mask))
+			i915_sw_fence_await_sw_fence(&rq->submit,
+						     &prev->submit,
+						     &rq->submitq);
+		else
+			__i915_sw_fence_await_dma_fence(&rq->submit,
+							&prev->fence,
+							&rq->dmaq);
+		if (rq->engine->schedule)
+			__i915_sched_node_add_dependency(&rq->sched,
+							 &prev->sched,
+							 &rq->dep,
+							 0);
+	}
+
+	spin_lock_irq(&timeline->lock);
+	list_add_tail(&rq->link, &timeline->requests);
+	spin_unlock_irq(&timeline->lock);
+
+	GEM_BUG_ON(timeline->seqno != rq->fence.seqno);
+	__i915_active_request_set(&timeline->last_request, rq);
+
+	return prev;
+}
+
 /*
  * NB: This function is not allowed to fail. Doing so would mean that the
  * request is not being tracked for completion but the work itself is
@@ -1037,31 +1091,7 @@ void i915_request_add(struct i915_request *request)
 	GEM_BUG_ON(IS_ERR(cs));
 	request->postfix = intel_ring_offset(request, cs);
 
-	/*
-	 * Seal the request and mark it as pending execution. Note that
-	 * we may inspect this state, without holding any locks, during
-	 * hangcheck. Hence we apply the barrier to ensure that we do not
-	 * see a more recent value in the hws than we are tracking.
-	 */
-
-	prev = i915_active_request_raw(&timeline->last_request,
-				       &request->i915->drm.struct_mutex);
-	if (prev && !i915_request_completed(prev)) {
-		i915_sw_fence_await_sw_fence(&request->submit, &prev->submit,
-					     &request->submitq);
-		if (engine->schedule)
-			__i915_sched_node_add_dependency(&request->sched,
-							 &prev->sched,
-							 &request->dep,
-							 0);
-	}
-
-	spin_lock_irq(&timeline->lock);
-	list_add_tail(&request->link, &timeline->requests);
-	spin_unlock_irq(&timeline->lock);
-
-	GEM_BUG_ON(timeline->seqno != request->fence.seqno);
-	__i915_active_request_set(&timeline->last_request, request);
+	prev = __i915_request_add_to_timeline(request);
 
 	list_add_tail(&request->ring_link, &ring->request_list);
 	if (list_is_first(&request->ring_link, &ring->request_list)) {
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 8c8fa5010644..cd6c130964cd 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -128,7 +128,10 @@ struct i915_request {
 	 * It is used by the driver to then queue the request for execution.
 	 */
 	struct i915_sw_fence submit;
-	wait_queue_entry_t submitq;
+	union {
+		wait_queue_entry_t submitq;
+		struct i915_sw_dma_fence_cb dmaq;
+	};
 	struct list_head execute_cb;
 
 	/*
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 8d1400d378d7..5387aafd3424 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -359,11 +359,6 @@ int i915_sw_fence_await_sw_fence_gfp(struct i915_sw_fence *fence,
 	return __i915_sw_fence_await_sw_fence(fence, signaler, NULL, gfp);
 }
 
-struct i915_sw_dma_fence_cb {
-	struct dma_fence_cb base;
-	struct i915_sw_fence *fence;
-};
-
 struct i915_sw_dma_fence_cb_timer {
 	struct i915_sw_dma_fence_cb base;
 	struct dma_fence *dma;
@@ -480,6 +475,40 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 	return ret;
 }
 
+static void __dma_i915_sw_fence_wake(struct dma_fence *dma,
+				     struct dma_fence_cb *data)
+{
+	struct i915_sw_dma_fence_cb *cb = container_of(data, typeof(*cb), base);
+
+	i915_sw_fence_complete(cb->fence);
+}
+
+int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
+				    struct dma_fence *dma,
+				    struct i915_sw_dma_fence_cb *cb)
+{
+	int ret;
+
+	debug_fence_assert(fence);
+
+	if (dma_fence_is_signaled(dma))
+		return 0;
+
+	cb->fence = fence;
+	i915_sw_fence_await(fence);
+
+	ret = dma_fence_add_callback(dma, &cb->base, __dma_i915_sw_fence_wake);
+	if (ret == 0) {
+		ret = 1;
+	} else {
+		i915_sw_fence_complete(fence);
+		if (ret == -ENOENT) /* fence already signaled */
+			ret = 0;
+	}
+
+	return ret;
+}
+
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 				    struct reservation_object *resv,
 				    const struct dma_fence_ops *exclude,
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h b/drivers/gpu/drm/i915/i915_sw_fence.h
index 6dec9e1d1102..9cb5c3b307a6 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.h
+++ b/drivers/gpu/drm/i915/i915_sw_fence.h
@@ -9,14 +9,13 @@
 #ifndef _I915_SW_FENCE_H_
 #define _I915_SW_FENCE_H_
 
+#include <linux/dma-fence.h>
 #include <linux/gfp.h>
 #include <linux/kref.h>
 #include <linux/notifier.h> /* for NOTIFY_DONE */
 #include <linux/wait.h>
 
 struct completion;
-struct dma_fence;
-struct dma_fence_ops;
 struct reservation_object;
 
 struct i915_sw_fence {
@@ -68,10 +67,20 @@ int i915_sw_fence_await_sw_fence(struct i915_sw_fence *fence,
 int i915_sw_fence_await_sw_fence_gfp(struct i915_sw_fence *fence,
 				     struct i915_sw_fence *after,
 				     gfp_t gfp);
+
+struct i915_sw_dma_fence_cb {
+	struct dma_fence_cb base;
+	struct i915_sw_fence *fence;
+};
+
+int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
+				    struct dma_fence *dma,
+				    struct i915_sw_dma_fence_cb *cb);
 int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 				  struct dma_fence *dma,
 				  unsigned long timeout,
 				  gfp_t gfp);
+
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
 				    struct reservation_object *resv,
 				    const struct dma_fence_ops *exclude,
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7e0c20a2d733..ae89174fd08f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2802,7 +2802,10 @@ populate_lr_context(struct intel_context *ce,
 
 static struct i915_timeline *get_timeline(struct i915_gem_context *ctx)
 {
-	return i915_timeline_create(ctx->i915, ctx->name, NULL);
+	if (ctx->timeline)
+		return i915_timeline_get(ctx->timeline);
+	else
+		return i915_timeline_create(ctx->i915, ctx->name, NULL);
 }
 
 static int execlists_context_deferred_alloc(struct intel_context *ce,
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index cfc9012c8e49..163aa9b66f25 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -97,7 +97,7 @@ live_context(struct drm_i915_private *i915, struct drm_file *file)
 
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
-	ctx = i915_gem_create_context(i915);
+	ctx = i915_gem_create_context(i915, 0);
 	if (IS_ERR(ctx))
 		return ctx;
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d45b79746fc4..9999f7d6a5a9 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1456,8 +1456,9 @@ struct drm_i915_gem_context_create_ext {
 	__u32 ctx_id; /* output: id of new context*/
 	__u32 flags;
 #define I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS	(1u << 0)
+#define I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE	(1u << 1)
 #define I915_CONTEXT_CREATE_FLAGS_UNKNOWN \
-	(-(I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS << 1))
+	(-(I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE << 1))
 	__u64 extensions;
 };
 
-- 
2.20.1


* [PATCH 16/22] drm/i915: Allow userspace to clone contexts on creation
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (13 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 15/22] drm/i915: Allow contexts to share a single timeline across all engines Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18  9:51 ` [PATCH 17/22] drm/i915: Allow a context to define its set of engines Chris Wilson
                   ` (8 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

A use case arose out of handling context recovery in mesa, whereby they
wish to recreate a context with fresh logical state while preserving all
other details of the original. Currently, they create a new context and
iterate over which bits they want to copy across, but it would be much
more convenient if they were able to just pass in a target context to clone
during creation. This essentially extends the setparam during creation
to pull the details from a target context instead of the user supplied
parameters.
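
A rough sketch of the intended usage (hedged: old_ctx_id stands in for the
client's hung context, and only a subset of the clone bits defined below is
shown; error handling elided):

	struct drm_i915_gem_context_create_ext_clone clone = {
		.base = { .name = I915_CONTEXT_CREATE_EXT_CLONE },
		.clone_id = old_ctx_id,
		.flags = I915_CONTEXT_CLONE_VM |
			 I915_CONTEXT_CLONE_SCHEDATTR |
			 I915_CONTEXT_CLONE_SSEU,
	};
	struct drm_i915_gem_context_create_ext arg = {
		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
		.extensions = (uintptr_t)&clone,
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &arg);

The new context inherits the selected details, while its logical HW state
starts afresh, which is the point of the recovery exercise.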

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 154 ++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h             |  14 +++
 2 files changed, 168 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 57b4e760fa4b..1972f112d72b 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1513,8 +1513,162 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	return ctx_setparam(arg->ctx, &local.param);
 }
 
+static int clone_flags(struct i915_gem_context *dst,
+		       struct i915_gem_context *src)
+{
+	dst->user_flags = src->user_flags;
+	return 0;
+}
+
+static int clone_schedattr(struct i915_gem_context *dst,
+			   struct i915_gem_context *src)
+{
+	dst->sched = src->sched;
+	return 0;
+}
+
+static int clone_sseu(struct i915_gem_context *dst,
+		      struct i915_gem_context *src)
+{
+	const struct intel_sseu default_sseu =
+		intel_device_default_sseu(dst->i915);
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, dst->i915, id) {
+		struct intel_context *ce;
+		struct intel_sseu sseu;
+
+		ce = intel_context_lookup(src, engine);
+		if (!ce)
+			continue;
+
+		sseu = ce->sseu;
+		if (!memcmp(&sseu, &default_sseu, sizeof(sseu)))
+			continue;
+
+		ce = intel_context_pin_lock(dst, engine);
+		if (IS_ERR(ce))
+			return PTR_ERR(ce);
+
+		ce->sseu = sseu;
+		intel_context_pin_unlock(ce);
+	}
+
+	return 0;
+}
+
+static int clone_timeline(struct i915_gem_context *dst,
+			  struct i915_gem_context *src)
+{
+	if (src->timeline) {
+		GEM_BUG_ON(src->timeline == dst->timeline);
+
+		if (dst->timeline)
+			i915_timeline_put(dst->timeline);
+		dst->timeline = i915_timeline_get(src->timeline);
+	}
+
+	return 0;
+}
+
+static int clone_vm(struct i915_gem_context *dst,
+		    struct i915_gem_context *src)
+{
+	struct i915_hw_ppgtt *ppgtt;
+
+	rcu_read_lock();
+	do {
+		ppgtt = READ_ONCE(src->ppgtt);
+		if (!ppgtt)
+			break;
+
+		if (!kref_get_unless_zero(&ppgtt->ref))
+			continue;
+
+		/*
+		 * This ppgtt may have been reallocated between
+		 * the read and the kref, and reassigned to a third
+		 * context. In order to avoid inadvertent sharing
+		 * of this ppgtt with that third context (and not
+		 * src), we have to confirm that we have the same
+		 * ppgtt after passing through the strong memory
+		 * barrier implied by a successful
+		 * kref_get_unless_zero().
+		 *
+		 * Once we have acquired the current ppgtt of src,
+		 * we no longer care if it is released from src, as
+		 * it cannot be reallocated elsewhere.
+		 */
+
+		if (ppgtt == READ_ONCE(src->ppgtt))
+			break;
+
+		i915_ppgtt_put(ppgtt);
+	} while (1);
+	rcu_read_unlock();
+
+	if (ppgtt) {
+		__assign_ppgtt(dst, ppgtt);
+		i915_ppgtt_put(ppgtt);
+	}
+
+	return 0;
+}
+
+static int create_clone(struct i915_user_extension __user *ext, void *data)
+{
+	static int (* const fn[])(struct i915_gem_context *dst,
+				  struct i915_gem_context *src) = {
+#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
+		MAP(FLAGS, clone_flags),
+		MAP(SCHEDATTR, clone_schedattr),
+		MAP(SSEU, clone_sseu),
+		MAP(TIMELINE, clone_timeline),
+		MAP(VM, clone_vm),
+#undef MAP
+	};
+	struct drm_i915_gem_context_create_ext_clone local;
+	const struct create_ext *arg = data;
+	struct i915_gem_context *dst = arg->ctx;
+	struct i915_gem_context *src;
+	int err, bit;
+
+	if (copy_from_user(&local, ext, sizeof(local)))
+		return -EFAULT;
+
+	BUILD_BUG_ON(GENMASK(BITS_PER_TYPE(local.flags) - 1, ARRAY_SIZE(fn)) !=
+		     I915_CONTEXT_CLONE_UNKNOWN);
+
+	if (local.flags & I915_CONTEXT_CLONE_UNKNOWN)
+		return -EINVAL;
+
+	if (local.rsvd)
+		return -EINVAL;
+
+	rcu_read_lock();
+	src = __i915_gem_context_lookup_rcu(arg->fpriv, local.clone_id);
+	rcu_read_unlock();
+	if (!src)
+		return -ENOENT;
+
+	GEM_BUG_ON(src == dst);
+
+	for (bit = 0; bit < ARRAY_SIZE(fn); bit++) {
+		if (!(local.flags & BIT(bit)))
+			continue;
+
+		err = fn[bit](dst, src);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
 static const i915_user_extension_fn create_extensions[] = {
 	[I915_CONTEXT_CREATE_EXT_SETPARAM] = create_setparam,
+	[I915_CONTEXT_CREATE_EXT_CLONE] = create_clone,
 };
 
 static bool client_is_banned(struct drm_i915_file_private *file_priv)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 9999f7d6a5a9..a5bdb86858f6 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1581,6 +1581,20 @@ struct drm_i915_gem_context_create_ext_setparam {
 	struct drm_i915_gem_context_param param;
 };
 
+struct drm_i915_gem_context_create_ext_clone {
+#define I915_CONTEXT_CREATE_EXT_CLONE 1
+	struct i915_user_extension base;
+	__u32 clone_id;
+	__u32 flags;
+#define I915_CONTEXT_CLONE_FLAGS	(1u << 0)
+#define I915_CONTEXT_CLONE_SCHEDATTR	(1u << 1)
+#define I915_CONTEXT_CLONE_SSEU		(1u << 2)
+#define I915_CONTEXT_CLONE_TIMELINE	(1u << 3)
+#define I915_CONTEXT_CLONE_VM		(1u << 4)
+#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
+	__u64 rsvd;
+};
+
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
-- 
2.20.1


* [PATCH 17/22] drm/i915: Allow a context to define its set of engines
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (14 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 16/22] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
@ 2019-03-18  9:51 ` Chris Wilson
  2019-03-18  9:52 ` [PATCH 18/22] drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[] Chris Wilson
                   ` (7 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:51 UTC (permalink / raw)
  To: intel-gfx

Over the last few years, we have debated how to extend the user API to
support an increase in the number of engines that may be sparse and
even heterogeneous within a class (not all video decoders are created
equal). We settled on using (class, instance) tuples to identify a
specific engine, with an API for the user to construct a map of engines
to capabilities. Into this picture, we then add a challenge of virtual
engines; one user engine that maps behind the scenes to any number of
physical engines. To keep it general, we want the user to have full
control over that mapping. To that end, we allow the user to constrain a
context to define the set of engines that it can access, order fully
controlled by the user via (class, instance). With such precise control
in context setup, we can continue to use the existing execbuf uABI of
specifying a single index; only now it doesn't automagically map onto
the engines, it uses the user defined engine map from the context.

The I915_EXEC_DEFAULT slot is left empty, and is invalid for use by
execbuf. Its use will be revealed in the next patch.
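
A sketch of defining an engine map and then selecting from it via execbuf
(hedged: assumes the standard setparam plumbing; ctx_id is a previously
created context and error handling is elided):

	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 3) = {
		.class_instance = {
			/* slot 0 (I915_EXEC_DEFAULT) left empty, as above */
			{ I915_ENGINE_CLASS_INVALID,
			  I915_ENGINE_CLASS_INVALID_NONE },
			{ I915_ENGINE_CLASS_RENDER, 0 }, /* execbuf flags = 1 */
			{ I915_ENGINE_CLASS_VIDEO,  1 }, /* execbuf flags = 2 */
		},
	};
	struct drm_i915_gem_context_param arg = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_ENGINES,
		.size = sizeof(engines),
		.value = (uintptr_t)&engines,
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);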

v2: Fixup freeing of local on success of get_engines()
v3: Allow empty engines[]
v4: s/nengine/num_engines/

Testcase: igt/gem_ctx_engines
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c       | 226 +++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context_types.h |  21 ++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c    |  19 +-
 drivers/gpu/drm/i915/i915_utils.h             |  36 +++
 include/uapi/drm/i915_drm.h                   |  42 +++-
 5 files changed, 331 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1972f112d72b..31231b139d1e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -86,7 +86,9 @@
  */
 
 #include <linux/log2.h>
+
 #include <drm/i915_drm.h>
+
 #include "i915_drv.h"
 #include "i915_globals.h"
 #include "i915_trace.h"
@@ -101,6 +103,21 @@ static struct i915_global_gem_context {
 	struct kmem_cache *slab_luts;
 } global;
 
+static struct intel_engine_cs *
+lookup_user_engine(struct i915_gem_context *ctx,
+		   unsigned long flags, u16 class, u16 instance)
+#define LOOKUP_USER_INDEX BIT(0)
+{
+	if (flags & LOOKUP_USER_INDEX) {
+		if (instance >= ctx->num_engines)
+			return NULL;
+
+		return ctx->engines[instance];
+	}
+
+	return intel_engine_lookup_user(ctx->i915, class, instance);
+}
+
 struct i915_lut_handle *i915_lut_handle_alloc(void)
 {
 	return kmem_cache_alloc(global.slab_luts, GFP_KERNEL);
@@ -235,6 +252,8 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	release_hw_id(ctx);
 	i915_ppgtt_put(ctx->ppgtt);
 
+	kfree(ctx->engines);
+
 	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
 		intel_context_put(it);
 
@@ -1390,9 +1409,9 @@ static int set_sseu(struct i915_gem_context *ctx,
 	if (user_sseu.flags || user_sseu.rsvd)
 		return -EINVAL;
 
-	engine = intel_engine_lookup_user(i915,
-					  user_sseu.engine_class,
-					  user_sseu.engine_instance);
+	engine = lookup_user_engine(ctx, 0,
+				    user_sseu.engine_class,
+				    user_sseu.engine_instance);
 	if (!engine)
 		return -EINVAL;
 
@@ -1410,9 +1429,166 @@ static int set_sseu(struct i915_gem_context *ctx,
 
 	args->size = sizeof(user_sseu);
 
+	return 0;
+};
+
+struct set_engines {
+	struct i915_gem_context *ctx;
+	struct intel_engine_cs **engines;
+	unsigned int num_engines;
+};
+
+static const i915_user_extension_fn set_engines__extensions[] = {
+};
+
+static int
+set_engines(struct i915_gem_context *ctx,
+	    const struct drm_i915_gem_context_param *args)
+{
+	struct i915_context_param_engines __user *user;
+	struct set_engines set = { .ctx = ctx };
+	u64 size, extensions;
+	unsigned int n;
+	int err;
+
+	user = u64_to_user_ptr(args->value);
+	size = args->size;
+	if (!size)
+		goto out;
+
+	BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->class_instance)));
+	if (size < sizeof(*user) ||
+	    !IS_ALIGNED(size, sizeof(*user->class_instance)))
+		return -EINVAL;
+
+	/* Internal limitation of u64 bitmaps + a few bits of u64 in the uABI */
+	set.num_engines =
+		(size - sizeof(*user)) / sizeof(*user->class_instance);
+	if (set.num_engines > I915_EXEC_RING_MASK + 1)
+		return -EINVAL;
+
+	set.engines = kmalloc_array(set.num_engines,
+				    sizeof(*set.engines),
+				    GFP_KERNEL);
+	if (!set.engines)
+		return -ENOMEM;
+
+	for (n = 0; n < set.num_engines; n++) {
+		u16 class, inst;
+
+		if (get_user(class, &user->class_instance[n].engine_class) ||
+		    get_user(inst, &user->class_instance[n].engine_instance)) {
+			kfree(set.engines);
+			return -EFAULT;
+		}
+
+		if (class == (u16)I915_ENGINE_CLASS_INVALID &&
+		    inst == (u16)I915_ENGINE_CLASS_INVALID_NONE) {
+			set.engines[n] = NULL;
+			continue;
+		}
+
+		set.engines[n] = lookup_user_engine(ctx, 0, class, inst);
+		if (!set.engines[n]) {
+			kfree(set.engines);
+			return -ENOENT;
+		}
+	}
+
+	err = -EFAULT;
+	if (!get_user(extensions, &user->extensions))
+		err = i915_user_extensions(u64_to_user_ptr(extensions),
+					   set_engines__extensions,
+					   ARRAY_SIZE(set_engines__extensions),
+					   &set);
+	if (err) {
+		kfree(set.engines);
+		return err;
+	}
+
+out:
+	mutex_lock(&ctx->i915->drm.struct_mutex);
+	kfree(ctx->engines);
+	ctx->engines = set.engines;
+	ctx->num_engines = set.num_engines;
+	mutex_unlock(&ctx->i915->drm.struct_mutex);
+
 	return 0;
 }
 
+static int
+get_engines(struct i915_gem_context *ctx,
+	    struct drm_i915_gem_context_param *args)
+{
+	struct i915_context_param_engines *local;
+	size_t n, count, size;
+	int err = 0;
+
+restart:
+	if (!READ_ONCE(ctx->engines)) {
+		args->size = 0;
+		return 0;
+	}
+
+	count = READ_ONCE(ctx->num_engines);
+
+	/* Be paranoid in case we have an impedance mismatch */
+	if (!check_struct_size(local, class_instance, count, &size))
+		return -ENOMEM;
+	if (unlikely(overflows_type(size, args->size)))
+		return -ENOMEM;
+
+	if (!args->size) {
+		args->size = size;
+		return 0;
+	}
+
+	if (args->size < size)
+		return -EINVAL;
+
+	local = kmalloc(size, GFP_KERNEL);
+	if (!local)
+		return -ENOMEM;
+
+	if (mutex_lock_interruptible(&ctx->i915->drm.struct_mutex)) {
+		err = -EINTR;
+		goto out;
+	}
+
+	if (!ctx->engines || ctx->num_engines != count) {
+		mutex_unlock(&ctx->i915->drm.struct_mutex);
+		kfree(local);
+		goto restart;
+	}
+
+	local->extensions = 0;
+	for (n = 0; n < count; n++) {
+		if (ctx->engines[n]) {
+			local->class_instance[n].engine_class =
+				ctx->engines[n]->uabi_class;
+			local->class_instance[n].engine_instance =
+				ctx->engines[n]->instance;
+		} else {
+			local->class_instance[n].engine_class =
+				I915_ENGINE_CLASS_INVALID;
+			local->class_instance[n].engine_instance =
+				I915_ENGINE_CLASS_INVALID_NONE;
+		}
+	}
+
+	mutex_unlock(&ctx->i915->drm.struct_mutex);
+
+	if (copy_to_user(u64_to_user_ptr(args->value), local, size)) {
+		err = -EFAULT;
+		goto out;
+	}
+	args->size = size;
+
+out:
+	kfree(local);
+	return err;
+}
+
 static int ctx_setparam(struct i915_gem_context *ctx,
 			struct drm_i915_gem_context_param *args)
 {
@@ -1485,6 +1661,10 @@ static int ctx_setparam(struct i915_gem_context *ctx,
 		ret = set_ppgtt(ctx, args);
 		break;
 
+	case I915_CONTEXT_PARAM_ENGINES:
+		ret = set_engines(ctx, args);
+		break;
+
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
@@ -1513,6 +1693,35 @@ static int create_setparam(struct i915_user_extension __user *ext, void *data)
 	return ctx_setparam(arg->ctx, &local.param);
 }
 
+static int clone_engines(struct i915_gem_context *dst,
+			 struct i915_gem_context *src)
+{
+	struct intel_engine_cs **engines;
+	unsigned int num_engines;
+
+	mutex_lock(&src->i915->drm.struct_mutex); /* serialise src->engine[] */
+
+	/* handle ZERO_SIZE_PTR on behalf of kmemdup */
+	num_engines = src->num_engines;
+	engines = src->engines;
+	if (!ZERO_OR_NULL_PTR(engines)) {
+		engines = kmemdup(engines,
+				  sizeof(*engines) * num_engines,
+				  GFP_KERNEL);
+		if (!engines) {
+			mutex_unlock(&src->i915->drm.struct_mutex);
+			return -ENOMEM;
+		}
+	}
+
+	mutex_unlock(&src->i915->drm.struct_mutex);
+
+	kfree(dst->engines);
+	dst->engines = engines;
+	dst->num_engines = num_engines;
+	return 0;
+}
+
 static int clone_flags(struct i915_gem_context *dst,
 		       struct i915_gem_context *src)
 {
@@ -1621,6 +1830,7 @@ static int create_clone(struct i915_user_extension __user *ext, void *data)
 	static int (* const fn[])(struct i915_gem_context *dst,
 				  struct i915_gem_context *src) = {
 #define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y
+		MAP(ENGINES, clone_engines),
 		MAP(FLAGS, clone_flags),
 		MAP(SCHEDATTR, clone_schedattr),
 		MAP(SSEU, clone_sseu),
@@ -1787,9 +1997,9 @@ static int get_sseu(struct i915_gem_context *ctx,
 	if (user_sseu.flags || user_sseu.rsvd)
 		return -EINVAL;
 
-	engine = intel_engine_lookup_user(ctx->i915,
-					  user_sseu.engine_class,
-					  user_sseu.engine_instance);
+	engine = lookup_user_engine(ctx, 0,
+				    user_sseu.engine_class,
+				    user_sseu.engine_instance);
 	if (!engine)
 		return -EINVAL;
 
@@ -1870,6 +2080,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		ret = get_ppgtt(ctx, args);
 		break;
 
+	case I915_CONTEXT_PARAM_ENGINES:
+		ret = get_engines(ctx, args);
+		break;
+
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
 	default:
 		ret = -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_gem_context_types.h b/drivers/gpu/drm/i915/i915_gem_context_types.h
index f8f6e6c960a7..c9c77b937a8d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/i915_gem_context_types.h
@@ -41,6 +41,20 @@ struct i915_gem_context {
 	/** file_priv: owning file descriptor */
 	struct drm_i915_file_private *file_priv;
 
+	/**
+	 * @engines: User defined engines for this context
+	 *
+	 * NULL means to use legacy definitions (including random meaning of
+	 * I915_EXEC_BSD with I915_EXEC_BSD_SELECTOR overrides).
+	 *
+	 * If defined, execbuf uses the I915_EXEC_MASK as an index into the
+	 * array, and various other uAPI offer the ability to look up an
+	 * index from this array to select an engine to operate on.
+	 *
+	 * Defined by the user via I915_CONTEXT_PARAM_ENGINES.
+	 */
+	struct intel_engine_cs **engines;
+
 	struct i915_timeline *timeline;
 
 	/**
@@ -110,6 +124,13 @@ struct i915_gem_context {
 #define CONTEXT_CLOSED			1
 #define CONTEXT_FORCE_SINGLE_SUBMISSION	2
 
+	/**
+	 * @num_engines: Number of user defined engines for this context
+	 *
+	 * See @engines for the elements.
+	 */
+	unsigned int num_engines;
+
 	/**
 	 * @hw_id: - unique identifier for the context
 	 *
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3d672c9edb94..66b3921cc8bd 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2089,13 +2089,20 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
 };
 
 static struct intel_engine_cs *
-eb_select_engine(struct drm_i915_private *dev_priv,
+eb_select_engine(struct i915_execbuffer *eb,
 		 struct drm_file *file,
 		 struct drm_i915_gem_execbuffer2 *args)
 {
 	unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
 	struct intel_engine_cs *engine;
 
+	if (eb->ctx->engines) {
+		if (user_ring_id >= eb->ctx->num_engines)
+			return NULL;
+
+		return eb->ctx->engines[user_ring_id];
+	}
+
 	if (user_ring_id > I915_USER_RINGS) {
 		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
 		return NULL;
@@ -2108,11 +2115,11 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 		return NULL;
 	}
 
-	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(dev_priv, VCS1)) {
+	if (user_ring_id == I915_EXEC_BSD && HAS_ENGINE(eb->i915, VCS1)) {
 		unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK;
 
 		if (bsd_idx == I915_EXEC_BSD_DEFAULT) {
-			bsd_idx = gen8_dispatch_bsd_engine(dev_priv, file);
+			bsd_idx = gen8_dispatch_bsd_engine(eb->i915, file);
 		} else if (bsd_idx >= I915_EXEC_BSD_RING1 &&
 			   bsd_idx <= I915_EXEC_BSD_RING2) {
 			bsd_idx >>= I915_EXEC_BSD_SHIFT;
@@ -2123,9 +2130,9 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 			return NULL;
 		}
 
-		engine = dev_priv->engine[_VCS(bsd_idx)];
+		engine = eb->i915->engine[_VCS(bsd_idx)];
 	} else {
-		engine = dev_priv->engine[user_ring_map[user_ring_id]];
+		engine = eb->i915->engine[user_ring_map[user_ring_id]];
 	}
 
 	if (!engine) {
@@ -2335,7 +2342,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (unlikely(err))
 		goto err_destroy;
 
-	eb.engine = eb_select_engine(eb.i915, file, args);
+	eb.engine = eb_select_engine(&eb, file, args);
 	if (!eb.engine) {
 		err = -EINVAL;
 		goto err_engine;
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 2dbe8933b50a..1436fe2fb5f8 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -25,6 +25,9 @@
 #ifndef __I915_UTILS_H
 #define __I915_UTILS_H
 
+#include <linux/kernel.h>
+#include <linux/overflow.h>
+
 #undef WARN_ON
 /* Many gcc seem to no see through this and fall over :( */
 #if 0
@@ -73,6 +76,39 @@
 #define overflows_type(x, T) \
 	(sizeof(x) > sizeof(T) && (x) >> BITS_PER_TYPE(T))
 
+static inline bool
+__check_struct_size(size_t base, size_t arr, size_t count, size_t *size)
+{
+	size_t sz;
+
+	if (check_mul_overflow(count, arr, &sz))
+		return false;
+
+	if (check_add_overflow(sz, base, &sz))
+		return false;
+
+	*size = sz;
+	return true;
+}
+
+/**
+ * check_struct_size() - Calculate size of structure with trailing array.
+ * @p: Pointer to the structure.
+ * @member: Name of the array member.
+ * @n: Number of elements in the array.
+ * @sz: Total size of structure and array
+ *
+ * Calculates size of memory needed for structure @p followed by an
+ * array of @n @member elements, like struct_size() but reports
+ * whether it overflowed, and the resultant size in @sz
+ *
+ * Return: false if the calculation overflowed.
+ */
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))
+
 #define ptr_mask_bits(ptr, n) ({					\
 	unsigned long __v = (unsigned long)(ptr);			\
 	(typeof(ptr))(__v & -BIT(n));					\
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index a5bdb86858f6..4e67c2395b46 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -126,6 +126,8 @@ enum drm_i915_gem_engine_class {
 	I915_ENGINE_CLASS_INVALID	= -1
 };
 
+#define I915_ENGINE_CLASS_INVALID_NONE -1
+
 /**
  * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
  *
@@ -1511,6 +1513,26 @@ struct drm_i915_gem_context_param {
 	 * See DRM_I915_GEM_VM_CREATE and DRM_I915_GEM_VM_DESTROY.
 	 */
 #define I915_CONTEXT_PARAM_VM		0x9
+
+/*
+ * I915_CONTEXT_PARAM_ENGINES:
+ *
+ * Bind this context to operate on this subset of available engines. Henceforth,
+ * the I915_EXEC_RING selector for DRM_IOCTL_I915_GEM_EXECBUFFER2 operates as
+ * an index into this array of engines; I915_EXEC_DEFAULT selecting engine[0]
+ * and upwards. Slots 0...N are filled in using the specified (class, instance).
+ * Use
+ *	engine_class: I915_ENGINE_CLASS_INVALID,
+ *	engine_instance: I915_ENGINE_CLASS_INVALID_NONE
+ * to specify a gap in the array that can be filled in later, e.g. by a
+ * virtual engine used for load balancing.
+ *
+ * Setting the number of engines bound to the context to 0, by passing a
+ * zero-sized argument, will revert to the default settings.
+ *
+ * See struct i915_context_param_engines.
+ */
+#define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
 
 	__u64 value;
@@ -1575,6 +1597,23 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+struct i915_context_param_engines {
+	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+
+	struct {
+		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
+		__u16 engine_instance;
+	} class_instance[0];
+} __attribute__((packed));
+
+#define I915_DEFINE_CONTEXT_PARAM_ENGINES(name__, N__) struct { \
+	__u64 extensions; \
+	struct { \
+		__u16 engine_class; \
+		__u16 engine_instance; \
+	} class_instance[N__]; \
+} __attribute__((packed)) name__
+
 struct drm_i915_gem_context_create_ext_setparam {
 #define I915_CONTEXT_CREATE_EXT_SETPARAM 0
 	struct i915_user_extension base;
@@ -1591,7 +1630,8 @@ struct drm_i915_gem_context_create_ext_clone {
 #define I915_CONTEXT_CLONE_SSEU		(1u << 2)
 #define I915_CONTEXT_CLONE_TIMELINE	(1u << 3)
 #define I915_CONTEXT_CLONE_VM		(1u << 4)
-#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_VM << 1)
+#define I915_CONTEXT_CLONE_ENGINES	(1u << 5)
+#define I915_CONTEXT_CLONE_UNKNOWN -(I915_CONTEXT_CLONE_ENGINES << 1)
 	__u64 rsvd;
 };
 
-- 
2.20.1


* [PATCH 18/22] drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[]
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (15 preceding siblings ...)
  2019-03-18  9:51 ` [PATCH 17/22] drm/i915: Allow a context to define its set of engines Chris Wilson
@ 2019-03-18  9:52 ` Chris Wilson
  2019-03-18  9:52 ` [PATCH 19/22] drm/i915: Load balancing across a virtual engine Chris Wilson
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:52 UTC (permalink / raw)
  To: intel-gfx

Allow the user to specify a local engine index (as opposed to
class:index) that they can use to refer to a preset engine inside the
ctx->engine[] array defined by an earlier I915_CONTEXT_PARAM_ENGINES.
This will be useful for setting SSEU parameters on virtual engines that
are local to the context and do not have a valid global class:instance
lookup.
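
A sketch of addressing an engine by its index in the map rather than by
class:instance (hedged: assumes a context that has already set
I915_CONTEXT_PARAM_ENGINES; the slice/subslice mask fields are omitted
for brevity):

	struct drm_i915_gem_context_param_sseu sseu = {
		.flags = I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX,
		.engine_instance = 0, /* index into ctx->engine[]; class ignored */
	};
	struct drm_i915_gem_context_param arg = {
		.ctx_id = ctx_id,
		.param = I915_CONTEXT_PARAM_SSEU,
		.size = sizeof(sseu),
		.value = (uintptr_t)&sseu,
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);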

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 24 ++++++++++++++++++++----
 include/uapi/drm/i915_drm.h             |  3 ++-
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 31231b139d1e..313a18e37071 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1394,6 +1394,7 @@ static int set_sseu(struct i915_gem_context *ctx,
 	struct drm_i915_gem_context_param_sseu user_sseu;
 	struct intel_engine_cs *engine;
 	struct intel_sseu sseu;
+	unsigned long lookup;
 	int ret;
 
 	if (args->size < sizeof(user_sseu))
@@ -1406,10 +1407,17 @@ static int set_sseu(struct i915_gem_context *ctx,
 			   sizeof(user_sseu)))
 		return -EFAULT;
 
-	if (user_sseu.flags || user_sseu.rsvd)
+	if (user_sseu.rsvd)
 		return -EINVAL;
 
-	engine = lookup_user_engine(ctx, 0,
+	if (user_sseu.flags & ~(I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX))
+		return -EINVAL;
+
+	lookup = 0;
+	if (user_sseu.flags & I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX)
+		lookup |= LOOKUP_USER_INDEX;
+
+	engine = lookup_user_engine(ctx, lookup,
 				    user_sseu.engine_class,
 				    user_sseu.engine_instance);
 	if (!engine)
@@ -1984,6 +1992,7 @@ static int get_sseu(struct i915_gem_context *ctx,
 	struct drm_i915_gem_context_param_sseu user_sseu;
 	struct intel_engine_cs *engine;
 	struct intel_context *ce;
+	unsigned long lookup;
 
 	if (args->size == 0)
 		goto out;
@@ -1994,10 +2003,17 @@ static int get_sseu(struct i915_gem_context *ctx,
 			   sizeof(user_sseu)))
 		return -EFAULT;
 
-	if (user_sseu.flags || user_sseu.rsvd)
+	if (user_sseu.rsvd)
 		return -EINVAL;
 
-	engine = lookup_user_engine(ctx, 0,
+	if (user_sseu.flags & ~(I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX))
+		return -EINVAL;
+
+	lookup = 0;
+	if (user_sseu.flags & I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX)
+		lookup |= LOOKUP_USER_INDEX;
+
+	engine = lookup_user_engine(ctx, lookup,
 				    user_sseu.engine_class,
 				    user_sseu.engine_instance);
 	if (!engine)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 4e67c2395b46..8ef6d60929c6 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1567,9 +1567,10 @@ struct drm_i915_gem_context_param_sseu {
 	__u16 engine_instance;
 
 	/*
-	 * Unused for now. Must be cleared to zero.
+	 * Unknown flags must be cleared to zero.
 	 */
 	__u32 flags;
+#define I915_CONTEXT_SSEU_FLAG_ENGINE_INDEX (1u << 0)
 
 	/*
 	 * Mask of slices to enable for the context. Valid values are a subset
-- 
2.20.1


* [PATCH 19/22] drm/i915: Load balancing across a virtual engine
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (16 preceding siblings ...)
  2019-03-18  9:52 ` [PATCH 18/22] drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[] Chris Wilson
@ 2019-03-18  9:52 ` Chris Wilson
  2019-03-18  9:52 ` [PATCH 20/22] drm/i915: Extend execution fence to support a callback Chris Wilson
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:52 UTC (permalink / raw)
  To: intel-gfx

Having allowed the user to define a set of engines that they will want
to only use, we go one step further and allow them to bind those engines
into a single virtual instance. Submitting a batch to the virtual engine
will then forward it to any one of the set in a manner as best to
distribute load.  The virtual engine has a single timeline across all
engines (it operates as a single queue), so it is not able to concurrently
run batches across multiple engines by itself; it is left to the user
to submit multiple concurrent batches to multiple queues. Multiple users
will be load balanced across the system.

The mechanism used for load balancing in this patch is a late greedy
balancer. When a request is ready for execution, it is added to each
engine's queue, and when an engine is ready for its next request it
claims it from the virtual engine. The first engine to do so wins, i.e.
the request is executed at the earliest opportunity (idle moment) in the
system.
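
A sketch of constructing such a virtual engine (hedged: the uapi struct for
the load-balance extension is not visible in this excerpt, so the field
names follow the kernel-side checks in set_engines__load_balance() and the
must-be-zero fields are left implicit):

	struct i915_context_engines_load_balance balancer = {
		.base = { .name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE },
		.engine_index = 0,   /* must name an INVALID_NONE slot */
		.engines_mask = 0x6, /* balance across slots 1 and 2 */
	};
	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 3) = {
		.extensions = (uintptr_t)&balancer,
		.class_instance = {
			{ I915_ENGINE_CLASS_INVALID,
			  I915_ENGINE_CLASS_INVALID_NONE }, /* slot 0: veng */
			{ I915_ENGINE_CLASS_VIDEO, 0 },
			{ I915_ENGINE_CLASS_VIDEO, 1 },
		},
	};

Submitting with execbuf flags = 0 then runs on whichever vcs idles first.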

As not all HW is created equal, the user is still able to skip the
virtual engine and execute the batch on a specific engine, all within the
same queue. It will then be executed in order on the correct engine,
with execution on other virtual engines being moved away due to the load
detection.

A couple of areas for potential improvement left!

- The virtual engine always takes priority over equal-priority tasks.
Mostly broken up by applying FQ_CODEL rules for prioritising new clients,
and hopefully the virtual and real engines are not then congested (i.e.
all work is via virtual engines, or all work is to the real engine).

- We require the breadcrumb irq around every virtual engine request. For
normal engines, we eliminate the need for the slow round trip via
interrupt by using the submit fence and queueing in order. For virtual
engines, we have to allow any job to transfer to a new ring, and cannot
coalesce the submissions, so require the completion fence instead,
forcing the persistent use of interrupts.

- We only drip feed single requests through each virtual engine and onto
the physical engines, even if there was enough work to fill all ELSP,
leaving small stalls with an idle CS event at the end of every request.
Could we be greedy and fill both slots? Being lazy is virtuous for load
distribution on less-than-full workloads though.

Other areas of improvement are more general, such as reducing lock
contention, reducing dispatch overhead, looking at direct submission
rather than bouncing around tasklets etc.

sseu: Lift the restriction to allow sseu to be reconfigured on virtual
engines composed of RENDER_CLASS (rcs).

v2: macroize check_user_mbz()
v3: Cancel virtual engines on wedging
v4: Commence commenting

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.h            |   5 +
 drivers/gpu/drm/i915/i915_gem_context.c    | 126 ++++-
 drivers/gpu/drm/i915/i915_scheduler.c      |  18 +-
 drivers/gpu/drm/i915/i915_timeline_types.h |   1 +
 drivers/gpu/drm/i915/intel_engine_types.h  |   8 +
 drivers/gpu/drm/i915/intel_lrc.c           | 570 ++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_lrc.h           |  11 +
 drivers/gpu/drm/i915/selftests/intel_lrc.c | 165 ++++++
 include/uapi/drm/i915_drm.h                |  30 ++
 9 files changed, 915 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index 5c073fe73664..3ca855505715 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -96,4 +96,9 @@ static inline bool __tasklet_enable(struct tasklet_struct *t)
 	return atomic_dec_and_test(&t->count);
 }
 
+static inline bool __tasklet_is_scheduled(struct tasklet_struct *t)
+{
+	return test_bit(TASKLET_STATE_SCHED, &t->state);
+}
+
 #endif /* __I915_GEM_H__ */
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 313a18e37071..b387b71e2cb5 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -86,6 +86,7 @@
  */
 
 #include <linux/log2.h>
+#include <linux/nospec.h>
 
 #include <drm/i915_drm.h>
 
@@ -94,6 +95,7 @@
 #include "i915_trace.h"
 #include "i915_user_extensions.h"
 #include "intel_lrc_reg.h"
+#include "intel_lrc.h"
 #include "intel_workarounds.h"
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
@@ -241,6 +243,20 @@ static void release_hw_id(struct i915_gem_context *ctx)
 	mutex_unlock(&i915->contexts.mutex);
 }
 
+static void free_engines(struct intel_engine_cs **engines, int count)
+{
+	int i;
+
+	if (ZERO_OR_NULL_PTR(engines))
+		return;
+
+	/* We own the veng we created; regular engines are ignored */
+	for (i = 0; i < count; i++)
+		intel_virtual_engine_destroy(engines[i]);
+
+	kfree(engines);
+}
+
 static void i915_gem_context_free(struct i915_gem_context *ctx)
 {
 	struct intel_context *it, *n;
@@ -251,8 +267,7 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 
 	release_hw_id(ctx);
 	i915_ppgtt_put(ctx->ppgtt);
-
-	kfree(ctx->engines);
+	free_engines(ctx->engines, ctx->num_engines);
 
 	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
 		intel_context_put(it);
@@ -1252,7 +1267,6 @@ __i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
 	int ret = 0;
 
 	GEM_BUG_ON(INTEL_GEN(ctx->i915) < 8);
-	GEM_BUG_ON(engine->id != RCS0);
 
 	ce = intel_context_pin_lock(ctx, engine);
 	if (IS_ERR(ce))
@@ -1446,7 +1460,80 @@ struct set_engines {
 	unsigned int num_engines;
 };
 
+static int
+set_engines__load_balance(struct i915_user_extension __user *base, void *data)
+{
+	struct i915_context_engines_load_balance __user *ext =
+		container_of_user(base, typeof(*ext), base);
+	const struct set_engines *set = data;
+	struct intel_engine_cs *ve;
+	unsigned int n;
+	u64 mask;
+	u16 idx;
+	int err;
+
+	if (!HAS_EXECLISTS(set->ctx->i915))
+		return -ENODEV;
+
+	if (USES_GUC_SUBMISSION(set->ctx->i915))
+		return -ENODEV; /* not implemented yet */
+
+	if (get_user(idx, &ext->engine_index))
+		return -EFAULT;
+
+	if (idx >= set->num_engines)
+		return -EINVAL;
+
+	idx = array_index_nospec(idx, set->num_engines);
+	if (set->engines[idx])
+		return -EEXIST;
+
+	err = check_user_mbz(&ext->mbz16);
+	if (err)
+		return err;
+
+	err = check_user_mbz(&ext->flags);
+	if (err)
+		return err;
+
+	for (n = 0; n < ARRAY_SIZE(ext->mbz64); n++) {
+		err = check_user_mbz(&ext->mbz64[n]);
+		if (err)
+			return err;
+	}
+
+	if (get_user(mask, &ext->engines_mask))
+		return -EFAULT;
+
+	mask &= GENMASK_ULL(set->num_engines - 1, 0) & ~BIT_ULL(idx);
+	if (!mask)
+		return -EINVAL;
+
+	if (is_power_of_2(mask)) {
+		ve = set->engines[__ffs64(mask)];
+	} else {
+		struct intel_engine_cs *stack[64];
+		int bit;
+
+		n = 0;
+		for_each_set_bit(bit, (unsigned long *)&mask, set->num_engines)
+			stack[n++] = set->engines[bit];
+
+		ve = intel_execlists_create_virtual(set->ctx, stack, n);
+	}
+	if (IS_ERR(ve))
+		return PTR_ERR(ve);
+
+	if (cmpxchg(&set->engines[idx], NULL, ve)) {
+		intel_virtual_engine_destroy(ve);
+		return -EEXIST;
+	}
+
+	return 0;
+}
+
 static const i915_user_extension_fn set_engines__extensions[] = {
+	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_engines__load_balance,
 };
 
 static int
@@ -1510,13 +1597,13 @@ set_engines(struct i915_gem_context *ctx,
 					   ARRAY_SIZE(set_engines__extensions),
 					   &set);
 	if (err) {
-		kfree(set.engines);
+		free_engines(set.engines, set.num_engines);
 		return err;
 	}
 
 out:
 	mutex_lock(&ctx->i915->drm.struct_mutex);
-	kfree(ctx->engines);
+	free_engines(ctx->engines, ctx->num_engines);
 	ctx->engines = set.engines;
 	ctx->num_engines = set.num_engines;
 	mutex_unlock(&ctx->i915->drm.struct_mutex);
@@ -1705,7 +1792,7 @@ static int clone_engines(struct i915_gem_context *dst,
 			 struct i915_gem_context *src)
 {
 	struct intel_engine_cs **engines;
-	unsigned int num_engines;
+	unsigned int num_engines, i;
 
 	mutex_lock(&src->i915->drm.struct_mutex); /* serialise src->engine[] */
 
@@ -1720,11 +1807,36 @@ static int clone_engines(struct i915_gem_context *dst,
 			mutex_unlock(&src->i915->drm.struct_mutex);
 			return -ENOMEM;
 		}
+
+		/*
+		 * Virtual engines are singletons; they can only exist
+		 * inside a single context, because they embed their
+		 * HW context... As each virtual context implies a single
+		 * timeline (each engine can only dequeue a single request
+		 * at any time), it would be surprising for two contexts
+		 * to use the same engine. So let's create a copy of
+		 * the virtual engine instead.
+		 */
+		for (i = 0; i < num_engines; i++) {
+			struct intel_engine_cs *engine = engines[i];
+
+			if (!engine || !intel_engine_is_virtual(engine))
+				continue;
+
+			engine = intel_execlists_clone_virtual(dst, engine);
+			if (IS_ERR(engine)) {
+				free_engines(engines, i);
+				mutex_unlock(&src->i915->drm.struct_mutex);
+				return PTR_ERR(engine);
+			}
+
+			engines[i] = engine;
+		}
 	}
 
 	mutex_unlock(&src->i915->drm.struct_mutex);
 
-	kfree(dst->engines);
+	free_engines(dst->engines, dst->num_engines);
 	dst->engines = engines;
 	dst->num_engines = num_engines;
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index e0f609d01564..8cff4f6d6158 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -247,17 +247,26 @@ sched_lock_engine(const struct i915_sched_node *node,
 		  struct intel_engine_cs *locked,
 		  struct sched_cache *cache)
 {
-	struct intel_engine_cs *engine = node_to_request(node)->engine;
+	const struct i915_request *rq = node_to_request(node);
+	struct intel_engine_cs *engine;
 
 	GEM_BUG_ON(!locked);
 
-	if (engine != locked) {
+	/*
+	 * Virtual engines complicate acquiring the engine timeline lock,
+	 * as their rq->engine pointer is not stable until under that
+	 * engine lock. The simple ploy we use is to take the lock then
+	 * check that the rq still belongs to the newly locked engine.
+	 */
+	while (locked != (engine = READ_ONCE(rq->engine))) {
 		spin_unlock(&locked->timeline.lock);
 		memset(cache, 0, sizeof(*cache));
 		spin_lock(&engine->timeline.lock);
+		locked = engine;
 	}
 
-	return engine;
+	GEM_BUG_ON(locked != engine);
+	return locked;
 }
 
 static bool inflight(const struct i915_request *rq,
@@ -370,8 +379,11 @@ static void __i915_schedule(struct i915_request *rq,
 		if (prio <= node->attr.priority || node_signaled(node))
 			continue;
 
+		GEM_BUG_ON(node_to_request(node)->engine != engine);
+
 		node->attr.priority = prio;
 		if (!list_empty(&node->link)) {
+			GEM_BUG_ON(intel_engine_is_virtual(engine));
 			if (!cache.priolist)
 				cache.priolist =
 					i915_sched_lookup_priolist(engine,
diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
index d42053544d7c..17b5314fb847 100644
--- a/drivers/gpu/drm/i915/i915_timeline_types.h
+++ b/drivers/gpu/drm/i915/i915_timeline_types.h
@@ -26,6 +26,7 @@ struct i915_timeline {
 	spinlock_t lock;
 #define TIMELINE_CLIENT 0 /* default subclass */
 #define TIMELINE_ENGINE 1
+#define TIMELINE_VIRTUAL 2
 	struct mutex mutex; /* protects the flow of requests */
 
 	unsigned int pin_count;
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
index 549fdfca17aa..66ab1deeb7f5 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/intel_engine_types.h
@@ -225,6 +225,7 @@ struct intel_engine_execlists {
 	 * @queue: queue of requests, in priority lists
 	 */
 	struct rb_root_cached queue;
+	struct rb_root_cached virtual;
 
 	/**
 	 * @csb_write: control register for Context Switch buffer
@@ -430,6 +431,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_SUPPORTS_STATS   BIT(1)
 #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
 #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
+#define I915_ENGINE_IS_VIRTUAL       BIT(4)
 	unsigned int flags;
 
 	/*
@@ -513,6 +515,12 @@ intel_engine_has_semaphores(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_HAS_SEMAPHORES;
 }
 
+static inline bool
+intel_engine_is_virtual(const struct intel_engine_cs *engine)
+{
+	return engine->flags & I915_ENGINE_IS_VIRTUAL;
+}
+
 #define instdone_slice_mask(dev_priv__) \
 	(IS_GEN(dev_priv__, 7) ? \
 	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index ae89174fd08f..926ef07f5c38 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -166,6 +166,41 @@
 
 #define ACTIVE_PRIORITY (I915_PRIORITY_NEWCLIENT | I915_PRIORITY_NOSEMAPHORE)
 
+struct virtual_engine {
+	struct intel_engine_cs base;
+	struct intel_context context;
+
+	/*
+	 * We allow only a single request through the virtual engine at a time
+	 * (each request in the timeline waits for the completion fence of
+	 * the previous before being submitted). By restricting ourselves to
+	 * only submitting a single request, each request is placed on to a
+	 * physical to maximise load spreading (by virtue of the late greedy
+	 * scheduling -- each real engine takes the next available request
+	 * upon idling).
+	 */
+	struct i915_request *request;
+
+	/*
+	 * We keep a rbtree of available virtual engines inside each physical
+	 * engine, sorted by priority. Here we preallocate the nodes we need
+	 * for the virtual engine, indexed by physical_engine->id.
+	 */
+	struct ve_node {
+		struct rb_node rb;
+		int prio;
+	} nodes[I915_NUM_ENGINES];
+
+	/* And finally, which physical engines this virtual engine maps onto. */
+	unsigned int count;
+	struct intel_engine_cs *siblings[0];
+};
+
+static struct virtual_engine *to_virtual_engine(struct intel_engine_cs *engine)
+{
+	return container_of(engine, struct virtual_engine, base);
+}
+
 static int execlists_context_deferred_alloc(struct intel_context *ce,
 					    struct intel_engine_cs *engine);
 static void execlists_init_reg_state(u32 *reg_state,
@@ -229,7 +264,8 @@ static int queue_prio(const struct intel_engine_execlists *execlists)
 }
 
 static inline bool need_preempt(const struct intel_engine_cs *engine,
-				const struct i915_request *rq)
+				const struct i915_request *rq,
+				struct rb_node *rb)
 {
 	int last_prio;
 
@@ -264,6 +300,22 @@ static inline bool need_preempt(const struct intel_engine_cs *engine,
 	    rq_prio(list_next_entry(rq, link)) > last_prio)
 		return true;
 
+	if (rb) { /* XXX virtual precedence */
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		bool preempt = false;
+
+		if (engine == ve->siblings[0]) { /* only preempt one sibling */
+			spin_lock(&ve->base.timeline.lock);
+			if (ve->request)
+				preempt = rq_prio(ve->request) > last_prio;
+			spin_unlock(&ve->base.timeline.lock);
+		}
+
+		if (preempt)
+			return preempt;
+	}
+
 	/*
 	 * If the inflight context did not trigger the preemption, then maybe
 	 * it was the set of queued requests? Pick the highest priority in
@@ -382,6 +434,8 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 	list_for_each_entry_safe_reverse(rq, rn,
 					 &engine->timeline.requests,
 					 link) {
+		struct intel_engine_cs *owner;
+
 		if (i915_request_completed(rq))
 			break;
 
@@ -390,14 +444,30 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
 
 		GEM_BUG_ON(rq->hw_context->active);
 
-		GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
-		if (rq_prio(rq) != prio) {
-			prio = rq_prio(rq);
-			pl = i915_sched_lookup_priolist(engine, prio);
-		}
-		GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+		/*
+		 * Push the request back into the queue for later resubmission.
+		 * If this request is not native to this physical engine (i.e.
+		 * it came from a virtual source), push it back onto the virtual
+		 * engine so that it can be moved across onto another physical
+		 * engine as load dictates.
+		 */
+		owner = rq->hw_context->engine;
+		if (likely(owner == engine)) {
+			GEM_BUG_ON(rq_prio(rq) == I915_PRIORITY_INVALID);
+			if (rq_prio(rq) != prio) {
+				prio = rq_prio(rq);
+				pl = i915_sched_lookup_priolist(engine, prio);
+			}
+			GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
 
-		list_add(&rq->sched.link, pl);
+			list_add(&rq->sched.link, pl);
+		} else {
+			if (__i915_request_has_started(rq))
+				rq->sched.attr.priority |= ACTIVE_PRIORITY;
+
+			rq->engine = owner;
+			owner->submit_request(rq);
+		}
 
 		active = rq;
 	}
@@ -659,6 +729,50 @@ static void complete_preempt_context(struct intel_engine_execlists *execlists)
 						  execlists));
 }
 
+static void virtual_update_register_offsets(u32 *regs,
+					    struct intel_engine_cs *engine)
+{
+	u32 base = engine->mmio_base;
+
+	regs[CTX_CONTEXT_CONTROL] =
+		i915_mmio_reg_offset(RING_CONTEXT_CONTROL(engine));
+	regs[CTX_RING_HEAD] = i915_mmio_reg_offset(RING_HEAD(base));
+	regs[CTX_RING_TAIL] = i915_mmio_reg_offset(RING_TAIL(base));
+	regs[CTX_RING_BUFFER_START] = i915_mmio_reg_offset(RING_START(base));
+	regs[CTX_RING_BUFFER_CONTROL] = i915_mmio_reg_offset(RING_CTL(base));
+
+	regs[CTX_BB_HEAD_U] = i915_mmio_reg_offset(RING_BBADDR_UDW(base));
+	regs[CTX_BB_HEAD_L] = i915_mmio_reg_offset(RING_BBADDR(base));
+	regs[CTX_BB_STATE] = i915_mmio_reg_offset(RING_BBSTATE(base));
+	regs[CTX_SECOND_BB_HEAD_U] =
+		i915_mmio_reg_offset(RING_SBBADDR_UDW(base));
+	regs[CTX_SECOND_BB_HEAD_L] = i915_mmio_reg_offset(RING_SBBADDR(base));
+	regs[CTX_SECOND_BB_STATE] = i915_mmio_reg_offset(RING_SBBSTATE(base));
+
+	regs[CTX_CTX_TIMESTAMP] =
+		i915_mmio_reg_offset(RING_CTX_TIMESTAMP(base));
+	regs[CTX_PDP3_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 3));
+	regs[CTX_PDP3_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 3));
+	regs[CTX_PDP2_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 2));
+	regs[CTX_PDP2_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 2));
+	regs[CTX_PDP1_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 1));
+	regs[CTX_PDP1_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 1));
+	regs[CTX_PDP0_UDW] = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, 0));
+	regs[CTX_PDP0_LDW] = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, 0));
+
+	if (engine->class == RENDER_CLASS) {
+		regs[CTX_RCS_INDIRECT_CTX] =
+			i915_mmio_reg_offset(RING_INDIRECT_CTX(base));
+		regs[CTX_RCS_INDIRECT_CTX_OFFSET] =
+			i915_mmio_reg_offset(RING_INDIRECT_CTX_OFFSET(base));
+		regs[CTX_BB_PER_CTX_PTR] =
+			i915_mmio_reg_offset(RING_BB_PER_CTX_PTR(base));
+
+		regs[CTX_R_PWR_CLK_STATE] =
+			i915_mmio_reg_offset(GEN8_R_PWR_CLK_STATE);
+	}
+}
+
 static void execlists_dequeue(struct intel_engine_cs *engine)
 {
 	struct intel_engine_execlists * const execlists = &engine->execlists;
@@ -691,6 +805,37 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 	 * and context switches) submission.
 	 */
 
+	for (rb = rb_first_cached(&execlists->virtual); rb; ) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+		struct intel_engine_cs *active;
+
+		if (!rq) { /* lazily clean up after another engine handled rq */
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&execlists->virtual);
+			continue;
+		}
+
+		/*
+		 * We track when the HW has completed saving the context image
+		 * (i.e. when we have seen the final CS event switching out of
+		 * the context) and must not overwrite the context image before
+		 * then. This restricts us to only using the active engine
+		 * while the previous virtualized request is inflight (so
+		 * we reuse the register offsets). This is a very small
+		 * hysteresis on the greedy selection algorithm.
+		 */
+		active = READ_ONCE(ve->context.active);
+		if (active && active != engine) {
+			rb = rb_next(rb);
+			continue;
+		}
+
+		break;
+	}
+
 	if (last) {
 		/*
 		 * Don't resubmit or switch until all outstanding
@@ -712,7 +857,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		if (!execlists_is_active(execlists, EXECLISTS_ACTIVE_HWACK))
 			return;
 
-		if (need_preempt(engine, last)) {
+		if (need_preempt(engine, last, rb)) {
 			inject_preempt_context(engine);
 			return;
 		}
@@ -752,6 +897,72 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		last->tail = last->wa_tail;
 	}
 
+	while (rb) { /* XXX virtual is always taking precedence */
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq;
+
+		spin_lock(&ve->base.timeline.lock);
+
+		rq = ve->request;
+		if (unlikely(!rq)) { /* lost the race to a sibling */
+			spin_unlock(&ve->base.timeline.lock);
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+			rb = rb_first_cached(&execlists->virtual);
+			continue;
+		}
+
+		if (rq_prio(rq) >= queue_prio(execlists)) {
+			if (last && !can_merge_rq(last, rq)) {
+				spin_unlock(&ve->base.timeline.lock);
+				return; /* leave this rq for another engine */
+			}
+
+			GEM_BUG_ON(rq->engine != &ve->base);
+			ve->request = NULL;
+			ve->base.execlists.queue_priority_hint = INT_MIN;
+			rb_erase_cached(rb, &execlists->virtual);
+			RB_CLEAR_NODE(rb);
+
+			GEM_BUG_ON(rq->hw_context != &ve->context);
+			rq->engine = engine;
+
+			if (engine != ve->siblings[0]) {
+				u32 *regs = ve->context.lrc_reg_state;
+				unsigned int n;
+
+				GEM_BUG_ON(READ_ONCE(ve->context.active));
+				virtual_update_register_offsets(regs, engine);
+
+				/*
+				 * Move the bound engine to the top of the list
+				 * for future execution. We then kick this
+				 * tasklet first before checking others, so that
+				 * we preferentially reuse this set of bound
+				 * registers.
+				 */
+				for (n = 1; n < ve->count; n++) {
+					if (ve->siblings[n] == engine) {
+						swap(ve->siblings[n],
+						     ve->siblings[0]);
+						break;
+					}
+				}
+
+				GEM_BUG_ON(ve->siblings[0] != engine);
+			}
+
+			__i915_request_submit(rq);
+			trace_i915_request_in(rq, port_index(port, execlists));
+			submit = true;
+			last = rq;
+		}
+
+		spin_unlock(&ve->base.timeline.lock);
+		break;
+	}
+
 	while ((rb = rb_first_cached(&execlists->queue))) {
 		struct i915_priolist *p = to_priolist(rb);
 		struct i915_request *rq, *rn;
@@ -971,6 +1182,24 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 		i915_priolist_free(p);
 	}
 
+	/* Cancel all attached virtual engines */
+	while ((rb = rb_first_cached(&execlists->virtual))) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+
+		rb_erase_cached(rb, &execlists->virtual);
+		RB_CLEAR_NODE(rb);
+
+		spin_lock(&ve->base.timeline.lock);
+		if (ve->request) {
+			__i915_request_submit(ve->request);
+			dma_fence_set_error(&ve->request->fence, -EIO);
+			i915_request_mark_complete(ve->request);
+			ve->request = NULL;
+		}
+		spin_unlock(&ve->base.timeline.lock);
+	}
+
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	execlists->queue_priority_hint = INT_MIN;
@@ -2897,6 +3126,306 @@ void intel_lr_context_resume(struct drm_i915_private *i915)
 	}
 }
 
+static void virtual_context_destroy(struct kref *kref)
+{
+	struct virtual_engine *ve =
+		container_of(kref, typeof(*ve), context.ref);
+	unsigned int n;
+
+	GEM_BUG_ON(ve->request);
+	GEM_BUG_ON(ve->context.active);
+
+	for (n = 0; n < ve->count; n++) {
+		struct intel_engine_cs *sibling = ve->siblings[n];
+		struct rb_node *node = &ve->nodes[sibling->id].rb;
+
+		if (RB_EMPTY_NODE(node))
+			continue;
+
+		spin_lock_irq(&sibling->timeline.lock);
+
+		if (!RB_EMPTY_NODE(node))
+			rb_erase_cached(node, &sibling->execlists.virtual);
+
+		spin_unlock_irq(&sibling->timeline.lock);
+	}
+	GEM_BUG_ON(__tasklet_is_scheduled(&ve->base.execlists.tasklet));
+
+	if (ve->context.state)
+		__execlists_context_fini(&ve->context);
+
+	i915_timeline_fini(&ve->base.timeline);
+	kfree(ve);
+}
+
+static void virtual_engine_initial_hint(struct virtual_engine *ve)
+{
+	int swp;
+
+	/*
+	 * Pick a random sibling on starting to help spread the load around.
+	 *
+	 * New contexts are typically created with exactly the same order
+	 * of siblings, and often started in batches. Due to the way we iterate
+	 * the array of siblings when submitting requests, sibling[0] is
+	 * prioritised for dequeuing. If we make sure that sibling[0] is fairly
+	 * randomised across the system, we also help spread the load by the
+	 * first engine we inspect being different each time.
+	 *
+	 * NB This does not force us to execute on this engine; it will just
+	 * typically be the first we inspect for submission.
+	 */
+	swp = prandom_u32_max(ve->count);
+	if (!swp)
+		return;
+
+	swap(ve->siblings[swp], ve->siblings[0]);
+	virtual_update_register_offsets(ve->context.lrc_reg_state,
+					ve->siblings[0]);
+}
+
+static int virtual_context_pin(struct intel_context *ce)
+{
+	struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
+	int err;
+
+	/* Note: we must use a real engine class for setting up reg state */
+	err = __execlists_context_pin(ce, ve->siblings[0]);
+	if (err)
+		return err;
+
+	virtual_engine_initial_hint(ve);
+	return 0;
+}
+
+static const struct intel_context_ops virtual_context_ops = {
+	.pin = virtual_context_pin,
+	.unpin = execlists_context_unpin,
+
+	.destroy = virtual_context_destroy,
+};
+
+static void virtual_submission_tasklet(unsigned long data)
+{
+	struct virtual_engine * const ve = (struct virtual_engine *)data;
+	unsigned int n;
+	int prio;
+
+	prio = READ_ONCE(ve->base.execlists.queue_priority_hint);
+	if (prio == INT_MIN)
+		return;
+
+	local_irq_disable();
+	for (n = 0; READ_ONCE(ve->request) && n < ve->count; n++) {
+		struct intel_engine_cs *sibling = ve->siblings[n];
+		struct ve_node * const node = &ve->nodes[sibling->id];
+		struct rb_node **parent, *rb;
+		bool first;
+
+		spin_lock(&sibling->timeline.lock);
+
+		if (!RB_EMPTY_NODE(&node->rb)) {
+			/*
+			 * Cheat and avoid rebalancing the tree if we can
+			 * reuse this node in situ.
+			 */
+			first = rb_first_cached(&sibling->execlists.virtual) ==
+				&node->rb;
+			if (prio == node->prio || (prio > node->prio && first))
+				goto submit_engine;
+
+			rb_erase_cached(&node->rb, &sibling->execlists.virtual);
+		}
+
+		rb = NULL;
+		first = true;
+		parent = &sibling->execlists.virtual.rb_root.rb_node;
+		while (*parent) {
+			struct ve_node *other;
+
+			rb = *parent;
+			other = rb_entry(rb, typeof(*other), rb);
+			if (prio > other->prio) {
+				parent = &rb->rb_left;
+			} else {
+				parent = &rb->rb_right;
+				first = false;
+			}
+		}
+
+		rb_link_node(&node->rb, rb, parent);
+		rb_insert_color_cached(&node->rb,
+				       &sibling->execlists.virtual,
+				       first);
+
+submit_engine:
+		GEM_BUG_ON(RB_EMPTY_NODE(&node->rb));
+		node->prio = prio;
+		if (first && prio > sibling->execlists.queue_priority_hint) {
+			sibling->execlists.queue_priority_hint = prio;
+			tasklet_hi_schedule(&sibling->execlists.tasklet);
+		}
+
+		spin_unlock(&sibling->timeline.lock);
+	}
+	local_irq_enable();
+}
+
+static void virtual_submit_request(struct i915_request *request)
+{
+	struct virtual_engine *ve = to_virtual_engine(request->engine);
+
+	GEM_BUG_ON(ve->base.submit_request != virtual_submit_request);
+
+	GEM_BUG_ON(ve->request);
+	ve->base.execlists.queue_priority_hint = rq_prio(request);
+	WRITE_ONCE(ve->request, request);
+
+	tasklet_schedule(&ve->base.execlists.tasklet);
+}
+
+struct intel_engine_cs *
+intel_execlists_create_virtual(struct i915_gem_context *ctx,
+			       struct intel_engine_cs **siblings,
+			       unsigned int count)
+{
+	struct virtual_engine *ve;
+	unsigned int n;
+	int err;
+
+	if (!count)
+		return ERR_PTR(-EINVAL);
+
+	ve = kzalloc(struct_size(ve, siblings, count), GFP_KERNEL);
+	if (!ve)
+		return ERR_PTR(-ENOMEM);
+
+	ve->base.i915 = ctx->i915;
+	ve->base.id = -1;
+	ve->base.class = OTHER_CLASS;
+	ve->base.uabi_class = I915_ENGINE_CLASS_INVALID;
+	ve->base.instance = I915_ENGINE_CLASS_INVALID_VIRTUAL;
+	ve->base.flags = I915_ENGINE_IS_VIRTUAL;
+
+	snprintf(ve->base.name, sizeof(ve->base.name), "virtual");
+
+	err = i915_timeline_init(ctx->i915,
+				 &ve->base.timeline,
+				 ve->base.name,
+				 NULL);
+	if (err)
+		goto err_put;
+	i915_timeline_set_subclass(&ve->base.timeline, TIMELINE_VIRTUAL);
+
+	ve->base.cops = &virtual_context_ops;
+	ve->base.request_alloc = execlists_request_alloc;
+
+	ve->base.schedule = i915_schedule;
+	ve->base.submit_request = virtual_submit_request;
+
+	ve->base.execlists.queue_priority_hint = INT_MIN;
+	tasklet_init(&ve->base.execlists.tasklet,
+		     virtual_submission_tasklet,
+		     (unsigned long)ve);
+
+	intel_context_init(&ve->context, ctx, &ve->base);
+
+	for (n = 0; n < count; n++) {
+		struct intel_engine_cs *sibling = siblings[n];
+
+		GEM_BUG_ON(!is_power_of_2(sibling->mask));
+		if (sibling->mask & ve->base.mask)
+			continue;
+
+		/*
+		 * The virtual engine implementation is tightly coupled to
+		 * the execlists backend -- we push requests directly
+		 * into a tree inside each physical engine. We could support
+		 * layering if we handled cloning of the requests and
+		 * submitting a copy into each backend.
+		 */
+		if (sibling->execlists.tasklet.func !=
+		    execlists_submission_tasklet) {
+			err = -ENODEV;
+			goto err_put;
+		}
+
+		GEM_BUG_ON(RB_EMPTY_NODE(&ve->nodes[sibling->id].rb));
+		RB_CLEAR_NODE(&ve->nodes[sibling->id].rb);
+
+		ve->siblings[ve->count++] = sibling;
+		ve->base.mask |= sibling->mask;
+
+		/*
+		 * All physical engines must be compatible for their emission
+		 * functions (as we build the instructions during request
+		 * construction and do not alter them before submission
+		 * on the physical engine). We use the engine class as a guide
+		 * here, although that could be refined.
+		 */
+		if (ve->base.class != OTHER_CLASS) {
+			if (ve->base.class != sibling->class) {
+				err = -EINVAL;
+				goto err_put;
+			}
+			continue;
+		}
+
+		ve->base.class = sibling->class;
+		snprintf(ve->base.name, sizeof(ve->base.name),
+			 "v%dx%d", ve->base.class, count);
+		ve->base.context_size = sibling->context_size;
+
+		ve->base.emit_bb_start = sibling->emit_bb_start;
+		ve->base.emit_flush = sibling->emit_flush;
+		ve->base.emit_init_breadcrumb = sibling->emit_init_breadcrumb;
+		ve->base.emit_fini_breadcrumb = sibling->emit_fini_breadcrumb;
+		ve->base.emit_fini_breadcrumb_dw =
+			sibling->emit_fini_breadcrumb_dw;
+	}
+
+	/* gracefully replace a degenerate virtual engine */
+	if (ve->count == 1) {
+		struct intel_engine_cs *actual = ve->siblings[0];
+		intel_context_put(&ve->context);
+		return actual;
+	}
+
+	__intel_context_insert(ctx, &ve->base, &ve->context);
+	return &ve->base;
+
+err_put:
+	intel_context_put(&ve->context);
+	return ERR_PTR(err);
+}
+
+struct intel_engine_cs *
+intel_execlists_clone_virtual(struct i915_gem_context *ctx,
+			      struct intel_engine_cs *src)
+{
+	struct virtual_engine *se = to_virtual_engine(src);
+	struct intel_engine_cs *dst;
+
+	dst = intel_execlists_create_virtual(ctx,
+					     se->siblings,
+					     se->count);
+	if (IS_ERR(dst))
+		return dst;
+
+	return dst;
+}
+
+void intel_virtual_engine_destroy(struct intel_engine_cs *engine)
+{
+	struct virtual_engine *ve = to_virtual_engine(engine);
+
+	if (!engine || !intel_engine_is_virtual(engine))
+		return;
+
+	__intel_context_remove(&ve->context);
+	intel_context_put(&ve->context);
+}
+
 void intel_execlists_show_requests(struct intel_engine_cs *engine,
 				   struct drm_printer *m,
 				   void (*show_request)(struct drm_printer *m,
@@ -2954,6 +3483,29 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 		show_request(m, last, "\t\tQ ");
 	}
 
+	last = NULL;
+	count = 0;
+	for (rb = rb_first_cached(&execlists->virtual); rb; rb = rb_next(rb)) {
+		struct virtual_engine *ve =
+			rb_entry(rb, typeof(*ve), nodes[engine->id].rb);
+		struct i915_request *rq = READ_ONCE(ve->request);
+
+		if (rq) {
+			if (count++ < max - 1)
+				show_request(m, rq, "\t\tV ");
+			else
+				last = rq;
+		}
+	}
+	if (last) {
+		if (count > max) {
+			drm_printf(m,
+				   "\t\t...skipping %d virtual requests...\n",
+				   count - max);
+		}
+		show_request(m, last, "\t\tV ");
+	}
+
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index f1aec8a6986f..9d90dc68e02b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -112,6 +112,17 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
 							const char *prefix),
 				   unsigned int max);
 
+struct intel_engine_cs *
+intel_execlists_create_virtual(struct i915_gem_context *ctx,
+			       struct intel_engine_cs **siblings,
+			       unsigned int count);
+
+struct intel_engine_cs *
+intel_execlists_clone_virtual(struct i915_gem_context *ctx,
+			      struct intel_engine_cs *src);
+
+void intel_virtual_engine_destroy(struct intel_engine_cs *engine);
+
 u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu);
 
 #endif /* _INTEL_LRC_H_ */
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 9e871eb0bfb1..6df033960350 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -10,6 +10,7 @@
 
 #include "../i915_selftest.h"
 #include "igt_flush_test.h"
+#include "igt_live_test.h"
 #include "igt_spinner.h"
 #include "i915_random.h"
 
@@ -1057,6 +1058,169 @@ static int live_preempt_smoke(void *arg)
 	return err;
 }
 
+static int nop_virtual_engine(struct drm_i915_private *i915,
+			      struct intel_engine_cs **siblings,
+			      unsigned int nsibling,
+			      unsigned int nctx,
+			      unsigned int flags)
+#define CHAIN BIT(0)
+{
+	IGT_TIMEOUT(end_time);
+	struct i915_request *request[16];
+	struct i915_gem_context *ctx[16];
+	struct intel_engine_cs *ve[16];
+	unsigned long n, prime, nc;
+	struct igt_live_test t;
+	ktime_t times[2] = {};
+	int err;
+
+	GEM_BUG_ON(!nctx || nctx > ARRAY_SIZE(ctx));
+
+	for (n = 0; n < nctx; n++) {
+		ctx[n] = kernel_context(i915);
+		if (!ctx[n])
+			return -ENOMEM;
+
+		ve[n] = intel_execlists_create_virtual(ctx[n],
+						       siblings, nsibling);
+		if (IS_ERR(ve[n]))
+			return PTR_ERR(ve[n]);
+	}
+
+	err = igt_live_test_begin(&t, i915, __func__, ve[0]->name);
+	if (err)
+		goto out;
+
+	for_each_prime_number_from(prime, 1, 8192) {
+		times[1] = ktime_get_raw();
+
+		if (flags & CHAIN) {
+			for (nc = 0; nc < nctx; nc++) {
+				for (n = 0; n < prime; n++) {
+					request[nc] =
+						i915_request_alloc(ve[nc], ctx[nc]);
+					if (IS_ERR(request[nc])) {
+						err = PTR_ERR(request[nc]);
+						goto out;
+					}
+
+					i915_request_add(request[nc]);
+				}
+			}
+		} else {
+			for (n = 0; n < prime; n++) {
+				for (nc = 0; nc < nctx; nc++) {
+					request[nc] =
+						i915_request_alloc(ve[nc], ctx[nc]);
+					if (IS_ERR(request[nc])) {
+						err = PTR_ERR(request[nc]);
+						goto out;
+					}
+
+					i915_request_add(request[nc]);
+				}
+			}
+		}
+
+		for (nc = 0; nc < nctx; nc++) {
+			if (i915_request_wait(request[nc],
+					      I915_WAIT_LOCKED,
+					      HZ / 10) < 0) {
+				pr_err("%s(%s): wait for %llx:%lld timed out\n",
+				       __func__, ve[0]->name,
+				       request[nc]->fence.context,
+				       request[nc]->fence.seqno);
+
+				GEM_TRACE("%s(%s) failed at request %llx:%lld\n",
+					  __func__, ve[0]->name,
+					  request[nc]->fence.context,
+					  request[nc]->fence.seqno);
+				GEM_TRACE_DUMP();
+				i915_gem_set_wedged(i915);
+				break;
+			}
+		}
+
+		times[1] = ktime_sub(ktime_get_raw(), times[1]);
+		if (prime == 1)
+			times[0] = times[1];
+
+		if (__igt_timeout(end_time, NULL))
+			break;
+	}
+
+	err = igt_live_test_end(&t);
+	if (err)
+		goto out;
+
+	pr_info("Requestx%d latencies on %s: 1 = %lluns, %lu = %lluns\n",
+		nctx, ve[0]->name, ktime_to_ns(times[0]),
+		prime, div64_u64(ktime_to_ns(times[1]), prime));
+
+out:
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	for (nc = 0; nc < nctx; nc++) {
+		intel_virtual_engine_destroy(ve[nc]);
+		kernel_context_close(ctx[nc]);
+	}
+	return err;
+}
+
+static int live_virtual_engine(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	unsigned int class, inst;
+	int err = -ENODEV;
+
+	if (USES_GUC_SUBMISSION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	for_each_engine(engine, i915, id) {
+		err = nop_virtual_engine(i915, &engine, 1, 1, 0);
+		if (err) {
+			pr_err("Failed to wrap engine %s: err=%d\n",
+			       engine->name, err);
+			goto out_unlock;
+		}
+	}
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		int nsibling, n;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!i915->engine_class[class][inst])
+				break;
+
+			siblings[nsibling++] = i915->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		for (n = 1; n <= nsibling + 1; n++) {
+			err = nop_virtual_engine(i915, siblings, nsibling,
+						 n, 0);
+			if (err)
+				goto out_unlock;
+		}
+
+		err = nop_virtual_engine(i915, siblings, nsibling, n, CHAIN);
+		if (err)
+			goto out_unlock;
+	}
+
+out_unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -1068,6 +1232,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_chain_preempt),
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
+		SUBTEST(live_virtual_engine),
 	};
 
 	if (!HAS_EXECLISTS(i915))
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 8ef6d60929c6..9c94c037d13b 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -127,6 +127,7 @@ enum drm_i915_gem_engine_class {
 };
 
 #define I915_ENGINE_CLASS_INVALID_NONE -1
+#define I915_ENGINE_CLASS_INVALID_VIRTUAL 0
 
 /**
  * DOC: perf_events exposed by i915 through /sys/bus/event_sources/drivers/i915
@@ -1598,8 +1599,37 @@ struct drm_i915_gem_context_param_sseu {
 	__u32 rsvd;
 };
 
+/*
+ * i915_context_engines_load_balance:
+ *
+ * Enable load balancing across this set of engines.
+ *
+ * Into the I915_EXEC_DEFAULT slot [0], a virtual engine is created that,
+ * when used, will proxy the execbuffer request onto one of the set of
+ * engines in such a way as to distribute the load evenly across the set.
+ *
+ * The set of engines must be compatible (e.g. the same HW class) as they
+ * will share the same logical GPU context and ring.
+ *
+ * To intermix rendering with the virtual engine and direct rendering onto
+ * the backing engines (bypassing the load balancing proxy), the context must
+ * be defined to use a single timeline for all engines.
+ */
+struct i915_context_engines_load_balance {
+	struct i915_user_extension base;
+
+	__u16 engine_index;
+	__u16 mbz16; /* reserved for future use; must be zero */
+	__u32 flags; /* all undefined flags must be zero */
+
+	__u64 engines_mask; /* selection mask of engines[] */
+
+	__u64 mbz64[4]; /* reserved for future use; must be zero */
+};
+
 struct i915_context_param_engines {
 	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
+#define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0
 
 	struct {
 		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
-- 
2.20.1

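For reference, configuring the load-balancing uAPI above from userspace
might look roughly like the sketch below. The wrapper struct, engine
selection, ctx_id/fd and error handling are illustrative assumptions, not
part of the patch; only the field layout follows the uapi change above.

	/* assumes #include <sys/ioctl.h> and <drm/i915_drm.h> */
	struct i915_context_engines_load_balance balance = {
		.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
		.engine_index = 0,	/* fill the I915_EXEC_DEFAULT slot [0] */
		.engines_mask = ~0ull,	/* balance across all engines[] below */
	};
	struct {
		struct i915_context_param_engines base;
		struct {
			__u16 engine_class;
			__u16 engine_instance;
		} engines[2];
	} set = {
		.base.extensions = (__u64)(uintptr_t)&balance,
		.engines = {
			{ I915_ENGINE_CLASS_VIDEO, 0 },
			{ I915_ENGINE_CLASS_VIDEO, 1 },
		},
	};
	struct drm_i915_gem_context_param param = {
		.ctx_id = ctx_id,	/* a previously created context */
		.param = I915_CONTEXT_PARAM_ENGINES,
		.size = sizeof(set),
		.value = (__u64)(uintptr_t)&set,
	};

	ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &param);
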
^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 20/22] drm/i915: Extend execution fence to support a callback
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (17 preceding siblings ...)
  2019-03-18  9:52 ` [PATCH 19/22] drm/i915: Load balancing across a virtual engine Chris Wilson
@ 2019-03-18  9:52 ` Chris Wilson
  2019-03-18  9:52 ` [PATCH 21/22] drm/i915/execlists: Virtual engine bonding Chris Wilson
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:52 UTC (permalink / raw)
  To: intel-gfx

In the next patch, we will want to configure the slave request
depending on which physical engine the master request is executed on.
For this, we introduce a callback from the execute fence to convey this
information.

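As a rough sketch of the intended use (the callback name below is
hypothetical and rq/master are assumed to be i915_requests; only the
signature follows this patch):

	static void slave_configure(struct i915_request *rq,
				    struct dma_fence *signal)
	{
		/* runs once the signaling (master) request begins execution */
	}

	err = i915_request_await_execution(rq, &master->fence, slave_configure);
	if (err)
		return err;
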
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_request.c | 84 +++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_request.h |  4 ++
 2 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 2382339172b4..0a46f8113f5c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -38,6 +38,8 @@ struct execute_cb {
 	struct list_head link;
 	struct irq_work work;
 	struct i915_sw_fence *fence;
+	void (*hook)(struct i915_request *rq, struct dma_fence *signal);
+	struct i915_request *signal;
 };
 
 static struct i915_global_request {
@@ -343,6 +345,17 @@ static void irq_execute_cb(struct irq_work *wrk)
 	kmem_cache_free(global.slab_execute_cbs, cb);
 }
 
+static void irq_execute_cb_hook(struct irq_work *wrk)
+{
+	struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
+
+	cb->hook(container_of(cb->fence, struct i915_request, submit),
+		 &cb->signal->fence);
+	i915_request_put(cb->signal);
+
+	irq_execute_cb(wrk);
+}
+
 static void __notify_execute_cb(struct i915_request *rq)
 {
 	struct execute_cb *cb;
@@ -369,14 +382,19 @@ static void __notify_execute_cb(struct i915_request *rq)
 }
 
 static int
-i915_request_await_execution(struct i915_request *rq,
-			     struct i915_request *signal,
-			     gfp_t gfp)
+__i915_request_await_execution(struct i915_request *rq,
+			       struct i915_request *signal,
+			       void (*hook)(struct i915_request *rq,
+					    struct dma_fence *signal),
+			       gfp_t gfp)
 {
 	struct execute_cb *cb;
 
-	if (i915_request_is_active(signal))
+	if (i915_request_is_active(signal)) {
+		if (hook)
+			hook(rq, &signal->fence);
 		return 0;
+	}
 
 	cb = kmem_cache_alloc(global.slab_execute_cbs, gfp);
 	if (!cb)
@@ -386,8 +404,18 @@ i915_request_await_execution(struct i915_request *rq,
 	i915_sw_fence_await(cb->fence);
 	init_irq_work(&cb->work, irq_execute_cb);
 
+	if (hook) {
+		cb->hook = hook;
+		cb->signal = i915_request_get(signal);
+		cb->work.func = irq_execute_cb_hook;
+	}
+
 	spin_lock_irq(&signal->lock);
 	if (i915_request_is_active(signal)) {
+		if (hook) {
+			hook(rq, &signal->fence);
+			i915_request_put(signal);
+		}
 		i915_sw_fence_complete(cb->fence);
 		kmem_cache_free(global.slab_execute_cbs, cb);
 	} else {
@@ -790,7 +818,7 @@ emit_semaphore_wait(struct i915_request *to,
 		return err;
 
 	/* Only submit our spinner after the signaler is running! */
-	err = i915_request_await_execution(to, from, gfp);
+	err = __i915_request_await_execution(to, from, NULL, gfp);
 	if (err)
 		return err;
 
@@ -910,6 +938,52 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence)
 	return 0;
 }
 
+int
+i915_request_await_execution(struct i915_request *rq,
+			     struct dma_fence *fence,
+			     void (*hook)(struct i915_request *rq,
+					  struct dma_fence *signal))
+{
+	struct dma_fence **child = &fence;
+	unsigned int nchild = 1;
+	int ret;
+
+	if (dma_fence_is_array(fence)) {
+		struct dma_fence_array *array = to_dma_fence_array(fence);
+
+		/* XXX Error for signal-on-any fence arrays */
+
+		child = array->fences;
+		nchild = array->num_fences;
+		GEM_BUG_ON(!nchild);
+	}
+
+	do {
+		fence = *child++;
+		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+			continue;
+
+		/*
+		 * We don't squash repeated fence dependencies here as we
+		 * want to run our callback in all cases.
+		 */
+
+		if (dma_fence_is_i915(fence))
+			ret = __i915_request_await_execution(rq,
+							     to_request(fence),
+							     hook,
+							     I915_FENCE_GFP);
+		else
+			ret = i915_sw_fence_await_dma_fence(&rq->submit, fence,
+							    I915_FENCE_TIMEOUT,
+							    GFP_KERNEL);
+		if (ret < 0)
+			return ret;
+	} while (--nchild);
+
+	return 0;
+}
+
 /**
  * i915_request_await_object - set this request to (async) wait upon a bo
  * @to: request we are wishing to use
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index cd6c130964cd..d4f6b2940130 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -265,6 +265,10 @@ int i915_request_await_object(struct i915_request *to,
 			      bool write);
 int i915_request_await_dma_fence(struct i915_request *rq,
 				 struct dma_fence *fence);
+int i915_request_await_execution(struct i915_request *rq,
+				 struct dma_fence *fence,
+				 void (*hook)(struct i915_request *rq,
+					      struct dma_fence *signal));
 
 void i915_request_add(struct i915_request *rq);
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 21/22] drm/i915/execlists: Virtual engine bonding
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (18 preceding siblings ...)
  2019-03-18  9:52 ` [PATCH 20/22] drm/i915: Extend execution fence to support a callback Chris Wilson
@ 2019-03-18  9:52 ` Chris Wilson
  2019-03-18  9:52 ` [PATCH 22/22] drm/i915: Allow specification of parallel execbuf Chris Wilson
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:52 UTC (permalink / raw)
  To: intel-gfx

Some users require that when a master batch is executed on one particular
engine, a companion batch is run simultaneously on a specific slave
engine. For this purpose, we introduce virtual engine bonding, allowing
maps of master:slaves to be constructed to constrain which physical
engines a virtual engine may select given a fence on a master engine.

For the moment, we continue to ignore the issue of preemption deferring
the master request for later. Ideally, we would then also like to remove
the slave and run something else rather than have it stall the pipeline.
With load balancing, we should be able to move workload around it, but
there is a similar stall on the master pipeline while it waits for
the slave to be executed. At the cost of more latency for the bonded
request, it may be interesting to launch both on their engines in
lockstep. (Bubbles abound.)

Opens: Also what about bonding an engine as its own master? It doesn't
break anything internally, so allow the silliness.

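As a rough userspace sketch, a bond restricting the virtual engine to
siblings[1] whenever the submit-fence master is the render ring might look
like this (engine indices here are illustrative; the field layout follows
the uapi change at the end of the patch):

	struct i915_context_engines_bond bond = {
		.base.name = I915_CONTEXT_ENGINES_EXT_BOND,
		.virtual_index = 0,	/* the virtual engine in engines[] */
		.master_class = I915_ENGINE_CLASS_RENDER,
		.master_instance = 0,
		.sibling_mask = 1ull << 1,	/* only allow siblings[1] */
	};
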
v2: Emancipate the bonds
v3: Couple in delayed scheduling for the selftests
v4: Handle invalid mutually exclusive bonding
v5: Mention what the uapi does
v6: s/nbond/num_bonds/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c       |  50 +++++
 drivers/gpu/drm/i915/i915_request.c           |   1 +
 drivers/gpu/drm/i915/i915_request.h           |   3 +
 drivers/gpu/drm/i915/intel_engine_types.h     |   7 +
 drivers/gpu/drm/i915/intel_lrc.c              | 152 ++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.h              |   4 +
 drivers/gpu/drm/i915/selftests/intel_lrc.c    | 185 ++++++++++++++++++
 drivers/gpu/drm/i915/selftests/lib_sw_fence.c |   3 +
 include/uapi/drm/i915_drm.h                   |  33 ++++
 9 files changed, 438 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b387b71e2cb5..8e3b16462956 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1532,8 +1532,58 @@ set_engines__load_balance(struct i915_user_extension __user *base, void *data)
 	return 0;
 }
 
+static int
+set_engines__bond(struct i915_user_extension __user *base, void *data)
+{
+	struct i915_context_engines_bond __user *ext =
+		container_of_user(base, typeof(*ext), base);
+	const struct set_engines *set = data;
+	unsigned int idx, class, instance;
+	struct intel_engine_cs *master;
+	u64 siblings;
+	int err;
+
+	if (get_user(idx, &ext->virtual_index))
+		return -EFAULT;
+
+	if (idx >= set->num_engines)
+		return -EINVAL;
+
+	idx = array_index_nospec(idx, set->num_engines);
+	if (!set->engines[idx])
+		return -EINVAL;
+
+	/*
+	 * A non-virtual engine has 0 siblings to choose between; the submit
+	 * fence will always be directed to the one engine.
+	 */
+	if (!intel_engine_is_virtual(set->engines[idx]))
+		return 0;
+
+	err = check_user_mbz(&ext->mbz);
+	if (err)
+		return err;
+
+	if (get_user(class, &ext->master_class))
+		return -EFAULT;
+
+	if (get_user(instance, &ext->master_instance))
+		return -EFAULT;
+
+	master = intel_engine_lookup_user(set->ctx->i915, class, instance);
+	if (!master)
+		return -EINVAL;
+
+	if (get_user(siblings, &ext->sibling_mask))
+		return -EFAULT;
+
+	return intel_virtual_engine_attach_bond(set->engines[idx],
+						master, siblings);
+}
+
 static const i915_user_extension_fn set_engines__extensions[] = {
 	[I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE] = set_engines__load_balance,
+	[I915_CONTEXT_ENGINES_EXT_BOND] = set_engines__bond,
 };
 
 static int
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 0a46f8113f5c..2d209519a6d5 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -743,6 +743,7 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	rq->batch = NULL;
 	rq->capture_list = NULL;
 	rq->waitboost = false;
+	INIT_ALL_ENGINES(rq->execution_mask);
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index d4f6b2940130..5bdab6881b13 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -32,6 +32,8 @@
 #include "i915_selftest.h"
 #include "i915_sw_fence.h"
 
+#include "intel_engine_types.h"
+
 #include <uapi/drm/i915_drm.h>
 
 struct drm_file;
@@ -145,6 +147,7 @@ struct i915_request {
 	 */
 	struct i915_sched_node sched;
 	struct i915_dependency dep;
+	intel_engine_mask_t execution_mask;
 
 	/*
 	 * A convenience pointer to the current breadcrumb value stored in
diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
index 66ab1deeb7f5..2f148809f63c 100644
--- a/drivers/gpu/drm/i915/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/intel_engine_types.h
@@ -391,6 +391,13 @@ struct intel_engine_cs {
 	 */
 	void		(*submit_request)(struct i915_request *rq);
 
+	/*
+	 * Called on signaling of a SUBMIT_FENCE, passing along the signaling
+	 * request down to the bonded pairs.
+	 */
+	void            (*bond_execute)(struct i915_request *rq,
+					struct dma_fence *signal);
+
 	/*
 	 * Call when the priority on a request has changed and it and its
 	 * dependencies may need rescheduling. Note the request itself may
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 926ef07f5c38..027c5eff2dce 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -191,6 +191,18 @@ struct virtual_engine {
 		int prio;
 	} nodes[I915_NUM_ENGINES];
 
+	/*
+	 * Keep track of bonded pairs -- restrictions upon our selection
+	 * of the physical engines any particular request may be submitted to.
+	 * If we receive a submit-fence from a master engine, we will only
+	 * use one of the sibling_mask physical engines.
+	 */
+	struct ve_bond {
+		struct intel_engine_cs *master;
+		intel_engine_mask_t sibling_mask;
+	} *bonds;
+	unsigned int num_bonds;
+
 	/* And finally, which physical engines this virtual engine maps onto. */
 	unsigned int count;
 	struct intel_engine_cs *siblings[0];
@@ -818,6 +830,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			continue;
 		}
 
+		if (!(rq->execution_mask & engine->mask)) {
+			/* We peeked too soon! */
+			rb = rb_next(rb);
+			continue;
+		}
+
 		/*
 		 * We track when the HW has completed saving the context image
 		 * (i.e. when we have seen the final CS event switching out of
@@ -912,6 +930,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			rb = rb_first_cached(&execlists->virtual);
 			continue;
 		}
+		GEM_BUG_ON(!(rq->execution_mask & engine->mask));
 
 		if (rq_prio(rq) >= queue_prio(execlists)) {
 			if (last && !can_merge_rq(last, rq)) {
@@ -3154,6 +3173,8 @@ static void virtual_context_destroy(struct kref *kref)
 	if (ve->context.state)
 		__execlists_context_fini(&ve->context);
 
+	kfree(ve->bonds);
+
 	i915_timeline_fini(&ve->base.timeline);
 	kfree(ve);
 }
@@ -3205,9 +3226,30 @@ static const struct intel_context_ops virtual_context_ops = {
 	.destroy = virtual_context_destroy,
 };
 
+static unsigned long virtual_submission_mask(struct virtual_engine *ve)
+{
+	struct i915_request *rq;
+	unsigned long mask;
+
+	rq = READ_ONCE(ve->request);
+	if (!rq)
+		return 0;
+
+	/* The rq is ready for submission; rq->execution_mask is now stable. */
+	mask = rq->execution_mask;
+	if (unlikely(!mask)) {
+		/* Invalid selection, submit to a random engine in error */
+		i915_request_skip(rq, -ENODEV);
+		mask = ve->siblings[0]->mask;
+	}
+
+	return mask;
+}
+
 static void virtual_submission_tasklet(unsigned long data)
 {
 	struct virtual_engine * const ve = (struct virtual_engine *)data;
+	unsigned long mask;
 	unsigned int n;
 	int prio;
 
@@ -3215,6 +3257,12 @@ static void virtual_submission_tasklet(unsigned long data)
 	if (prio == INT_MIN)
 		return;
 
+	rcu_read_lock();
+	mask = virtual_submission_mask(ve);
+	rcu_read_unlock();
+	if (unlikely(!mask))
+		return;
+
 	local_irq_disable();
 	for (n = 0; READ_ONCE(ve->request) && n < ve->count; n++) {
 		struct intel_engine_cs *sibling = ve->siblings[n];
@@ -3222,6 +3270,17 @@ static void virtual_submission_tasklet(unsigned long data)
 		struct rb_node **parent, *rb;
 		bool first;
 
+		if (unlikely(!(mask & sibling->mask))) {
+			if (!RB_EMPTY_NODE(&node->rb)) {
+				spin_lock(&sibling->timeline.lock);
+				rb_erase_cached(&node->rb,
+						&sibling->execlists.virtual);
+				RB_CLEAR_NODE(&node->rb);
+				spin_unlock(&sibling->timeline.lock);
+			}
+			continue;
+		}
+
 		spin_lock(&sibling->timeline.lock);
 
 		if (!RB_EMPTY_NODE(&node->rb)) {
@@ -3284,6 +3343,37 @@ static void virtual_submit_request(struct i915_request *request)
 	tasklet_schedule(&ve->base.execlists.tasklet);
 }
 
+static struct ve_bond *
+virtual_find_bond(struct virtual_engine *ve, struct intel_engine_cs *master)
+{
+	int i;
+
+	for (i = 0; i < ve->num_bonds; i++) {
+		if (ve->bonds[i].master == master)
+			return &ve->bonds[i];
+	}
+
+	return NULL;
+}
+
+static void
+virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
+{
+	struct virtual_engine *ve = to_virtual_engine(rq->engine);
+	struct ve_bond *bond;
+
+	bond = virtual_find_bond(ve, to_request(signal)->engine);
+	if (bond) {
+		intel_engine_mask_t old, new, cmp;
+
+		cmp = READ_ONCE(rq->execution_mask);
+		do {
+			old = cmp;
+			new = cmp & bond->sibling_mask;
+		} while ((cmp = cmpxchg(&rq->execution_mask, old, new)) != old);
+	}
+}
+
 struct intel_engine_cs *
 intel_execlists_create_virtual(struct i915_gem_context *ctx,
 			       struct intel_engine_cs **siblings,
@@ -3322,6 +3412,7 @@ intel_execlists_create_virtual(struct i915_gem_context *ctx,
 
 	ve->base.schedule = i915_schedule;
 	ve->base.submit_request = virtual_submit_request;
+	ve->base.bond_execute = virtual_bond_execute;
 
 	ve->base.execlists.queue_priority_hint = INT_MIN;
 	tasklet_init(&ve->base.execlists.tasklet,
@@ -3412,9 +3503,70 @@ intel_execlists_clone_virtual(struct i915_gem_context *ctx,
 	if (IS_ERR(dst))
 		return dst;
 
+	if (se->num_bonds) {
+		struct virtual_engine *de = to_virtual_engine(dst);
+
+		de->bonds = kmemdup(se->bonds,
+				    sizeof(*se->bonds) * se->num_bonds,
+				    GFP_KERNEL);
+		if (!de->bonds) {
+			intel_virtual_engine_destroy(dst);
+			return ERR_PTR(-ENOMEM);
+		}
+
+		de->num_bonds = se->num_bonds;
+	}
+
 	return dst;
 }
 
+static unsigned long
+virtual_sibling_mask(struct virtual_engine *ve, unsigned long mask)
+{
+	unsigned long emask = 0;
+	int bit;
+
+	for_each_set_bit(bit, &mask, ve->count)
+		emask |= ve->siblings[bit]->mask;
+
+	return emask;
+}
+
+int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
+				     struct intel_engine_cs *master,
+				     unsigned long mask)
+{
+	struct virtual_engine *ve = to_virtual_engine(engine);
+	struct ve_bond *bond;
+
+	if (mask >> ve->count)
+		return -EINVAL;
+
+	mask = virtual_sibling_mask(ve, mask);
+	if (!mask)
+		return -EINVAL;
+
+	bond = virtual_find_bond(ve, master);
+	if (bond) {
+		bond->sibling_mask |= mask;
+		return 0;
+	}
+
+	bond = krealloc(ve->bonds,
+			sizeof(*bond) * (ve->num_bonds + 1),
+			GFP_KERNEL);
+	if (!bond)
+		return -ENOMEM;
+
+	bond[ve->num_bonds].master = master;
+	bond[ve->num_bonds].sibling_mask = mask;
+
+	ve->bonds = bond;
+	ve->num_bonds++;
+
+	return 0;
+}
+
 void intel_virtual_engine_destroy(struct intel_engine_cs *engine)
 {
 	struct virtual_engine *ve = to_virtual_engine(engine);
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 9d90dc68e02b..77b85648045a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -121,6 +121,10 @@ struct intel_engine_cs *
 intel_execlists_clone_virtual(struct i915_gem_context *ctx,
 			      struct intel_engine_cs *src);
 
+int intel_virtual_engine_attach_bond(struct intel_engine_cs *engine,
+				     struct intel_engine_cs *master,
+				     unsigned long mask);
+
 void intel_virtual_engine_destroy(struct intel_engine_cs *engine);
 
 u32 gen8_make_rpcs(struct drm_i915_private *i915, struct intel_sseu *ctx_sseu);
diff --git a/drivers/gpu/drm/i915/selftests/intel_lrc.c b/drivers/gpu/drm/i915/selftests/intel_lrc.c
index 6df033960350..bc8e13f80fb5 100644
--- a/drivers/gpu/drm/i915/selftests/intel_lrc.c
+++ b/drivers/gpu/drm/i915/selftests/intel_lrc.c
@@ -13,6 +13,7 @@
 #include "igt_live_test.h"
 #include "igt_spinner.h"
 #include "i915_random.h"
+#include "lib_sw_fence.h"
 
 #include "mock_context.h"
 
@@ -1221,6 +1222,189 @@ static int live_virtual_engine(void *arg)
 	return err;
 }
 
+static int bond_virtual_engine(struct drm_i915_private *i915,
+			       unsigned int class,
+			       struct intel_engine_cs **siblings,
+			       unsigned int nsibling,
+			       unsigned int flags)
+#define BOND_SCHEDULE BIT(0)
+{
+	struct intel_engine_cs *master;
+	struct i915_gem_context *ctx;
+	struct i915_request *rq[16];
+	enum intel_engine_id id;
+	unsigned long n;
+	int err;
+
+	GEM_BUG_ON(nsibling >= ARRAY_SIZE(rq) - 1);
+
+	ctx = kernel_context(i915);
+	if (!ctx)
+		return -ENOMEM;
+
+	err = 0;
+	rq[0] = ERR_PTR(-ENOMEM);
+	for_each_engine(master, i915, id) {
+		struct i915_sw_fence fence = {};
+
+		if (master->class == class)
+			continue;
+
+		memset_p((void *)rq, ERR_PTR(-EINVAL), ARRAY_SIZE(rq));
+
+		rq[0] = i915_request_alloc(master, ctx);
+		if (IS_ERR(rq[0])) {
+			err = PTR_ERR(rq[0]);
+			goto out;
+		}
+		i915_request_get(rq[0]);
+
+		if (flags & BOND_SCHEDULE) {
+			onstack_fence_init(&fence);
+			err = i915_sw_fence_await_sw_fence_gfp(&rq[0]->submit,
+							       &fence,
+							       GFP_KERNEL);
+		}
+		i915_request_add(rq[0]);
+		if (err < 0)
+			goto out;
+
+		for (n = 0; n < nsibling; n++) {
+			struct intel_engine_cs *engine;
+
+			engine = intel_execlists_create_virtual(ctx,
+								siblings,
+								nsibling);
+			if (IS_ERR(engine)) {
+				err = PTR_ERR(engine);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+
+			err = intel_virtual_engine_attach_bond(engine,
+							       master,
+							       BIT(n));
+			if (err) {
+				intel_virtual_engine_destroy(engine);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+
+			rq[n + 1] = i915_request_alloc(engine, ctx);
+			if (IS_ERR(rq[n + 1])) {
+				err = PTR_ERR(rq[n + 1]);
+				intel_virtual_engine_destroy(engine);
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+			i915_request_get(rq[n + 1]);
+
+			err = i915_request_await_execution(rq[n + 1],
+							   &rq[0]->fence,
+							   engine->bond_execute);
+			i915_request_add(rq[n + 1]);
+			intel_virtual_engine_destroy(engine);
+			if (err < 0) {
+				onstack_fence_fini(&fence);
+				goto out;
+			}
+		}
+		onstack_fence_fini(&fence);
+
+		if (i915_request_wait(rq[0],
+				      I915_WAIT_LOCKED,
+				      HZ / 10) < 0) {
+			pr_err("Master request did not execute (on %s)!\n",
+			       rq[0]->engine->name);
+			err = -EIO;
+			goto out;
+		}
+
+		for (n = 0; n < nsibling; n++) {
+			if (i915_request_wait(rq[n + 1],
+					      I915_WAIT_LOCKED,
+					      MAX_SCHEDULE_TIMEOUT) < 0) {
+				err = -EIO;
+				goto out;
+			}
+
+			if (rq[n + 1]->engine != siblings[n]) {
+				pr_err("Bonded request did not execute on target engine: expected %s, used %s; master was %s\n",
+				       siblings[n]->name,
+				       rq[n + 1]->engine->name,
+				       rq[0]->engine->name);
+				err = -EINVAL;
+				goto out;
+			}
+		}
+
+		for (n = 0; !IS_ERR(rq[n]); n++)
+			i915_request_put(rq[n]);
+		rq[0] = ERR_PTR(-ENOMEM);
+	}
+
+out:
+	for (n = 0; !IS_ERR(rq[n]); n++)
+		i915_request_put(rq[n]);
+	if (igt_flush_test(i915, I915_WAIT_LOCKED))
+		err = -EIO;
+
+	kernel_context_close(ctx);
+	return err;
+}
+
+static int live_virtual_bond(void *arg)
+{
+	static const struct phase {
+		const char *name;
+		unsigned int flags;
+	} phases[] = {
+		{ "", 0 },
+		{ "schedule", BOND_SCHEDULE },
+		{ },
+	};
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+	unsigned int class, inst;
+	int err = 0;
+
+	if (USES_GUC_SUBMISSION(i915))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+		const struct phase *p;
+		int nsibling;
+
+		nsibling = 0;
+		for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+			if (!i915->engine_class[class][inst])
+				break;
+
+			GEM_BUG_ON(nsibling == ARRAY_SIZE(siblings));
+			siblings[nsibling++] = i915->engine_class[class][inst];
+		}
+		if (nsibling < 2)
+			continue;
+
+		for (p = phases; p->name; p++) {
+			err = bond_virtual_engine(i915,
+						  class, siblings, nsibling,
+						  p->flags);
+			if (err) {
+				pr_err("%s(%s): failed class=%d, nsibling=%d, err=%d\n",
+				       __func__, p->name, class, nsibling, err);
+				goto out_unlock;
+			}
+		}
+	}
+
+out_unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
 int intel_execlists_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
@@ -1233,6 +1417,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
 		SUBTEST(live_preempt_hang),
 		SUBTEST(live_preempt_smoke),
 		SUBTEST(live_virtual_engine),
+		SUBTEST(live_virtual_bond),
 	};
 
 	if (!HAS_EXECLISTS(i915))
diff --git a/drivers/gpu/drm/i915/selftests/lib_sw_fence.c b/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
index 2bfa72c1654b..b976c12817c5 100644
--- a/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
+++ b/drivers/gpu/drm/i915/selftests/lib_sw_fence.c
@@ -45,6 +45,9 @@ void __onstack_fence_init(struct i915_sw_fence *fence,
 
 void onstack_fence_fini(struct i915_sw_fence *fence)
 {
+	if (!fence->flags)
+		return;
+
 	i915_sw_fence_commit(fence);
 	i915_sw_fence_fini(fence);
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 9c94c037d13b..0d9ca4fb9edb 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1532,6 +1532,10 @@ struct drm_i915_gem_context_param {
  * sized argument, will revert back to default settings.
  *
  * See struct i915_context_param_engines.
+ *
+ * Extensions:
+ *   i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
+ *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
  */
 #define I915_CONTEXT_PARAM_ENGINES	0xa
 /* Must be kept compact -- no holes and well documented */
@@ -1627,9 +1631,38 @@ struct i915_context_engines_load_balance {
 	__u64 mbz64[4]; /* reserved for future use; must be zero */
 };
 
+/*
+ * i915_context_engines_bond:
+ *
+ * Constructed bonded pairs for execution within a virtual engine.
+ *
+ * All engines are equal, but some are more equal than others. Given
+ * the distribution of resources in the HW, it may be preferable to run
+ * a request on a given subset of engines in parallel to a request on a
+ * specific engine. We enable this selection of engines within a virtual
+ * engine by specifying bonding pairs; for any given master engine we will
+ * only execute on one of the corresponding siblings within the virtual engine.
+ *
+ * To execute a request in parallel on the master engine and a sibling requires
+ * coordination with an I915_EXEC_FENCE_SUBMIT.
+ */
+struct i915_context_engines_bond {
+	struct i915_user_extension base;
+
+	__u16 virtual_index; /* index of virtual engine in ctx->engines[] */
+	__u16 mbz;
+
+	__u16 master_class;
+	__u16 master_instance;
+
+	__u64 sibling_mask; /* bitmask of BIT(sibling_index) wrt the v.engine */
+	__u64 flags; /* all undefined flags must be zero */
+};
+
 struct i915_context_param_engines {
 	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
 #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0
+#define I915_CONTEXT_ENGINES_EXT_BOND 1
 
 	struct {
 		__u16 engine_class; /* see enum drm_i915_gem_engine_class */
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 22/22] drm/i915: Allow specification of parallel execbuf
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (19 preceding siblings ...)
  2019-03-18  9:52 ` [PATCH 21/22] drm/i915/execlists: Virtual engine bonding Chris Wilson
@ 2019-03-18  9:52 ` Chris Wilson
  2019-03-18 17:10 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915: Flush pages on acquisition Patchwork
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18  9:52 UTC (permalink / raw)
  To: intel-gfx

There is a desire to split a task onto two engines and have them run at
the same time, e.g. scanline interleaving to spread the workload evenly.
Through the use of the out-fence from the first execbuf, we can
coordinate secondary execbuf to only become ready simultaneously with
the first, so that with all things idle the second execbufs are executed
in parallel with the first. The key difference here between the new
EXEC_FENCE_SUBMIT and the existing EXEC_FENCE_IN is that the in-fence
waits for the completion of the first request (so that all of its
rendering results are visible to the second execbuf, the more common
userspace fence requirement).

Since we only have a single input fence slot, userspace cannot mix an
in-fence and a submit-fence. It has to use one or the other! This is not
such a harsh requirement, since by virtue of the submit-fence, the
secondary execbuf inherits all of the dependencies from the first
request, and for the application the dependencies should be common
between the primary and secondary execbuf.

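As a rough userspace sketch (gem_execbuf_wr() is a hypothetical helper
around DRM_IOCTL_I915_GEM_EXECBUFFER2_WR, with the out-fence fd assumed to
be returned in the upper 32 bits of rsvd2):

	execbuf[0].flags = engine[0] | I915_EXEC_FENCE_OUT;
	gem_execbuf_wr(fd, &execbuf[0]);
	submit = execbuf[0].rsvd2 >> 32;	/* out-fence of the first */

	execbuf[1].flags = engine[1] | I915_EXEC_FENCE_SUBMIT;
	execbuf[1].rsvd2 = submit;		/* lower 32 bits: submit-fence */
	gem_execbuf_wr(fd, &execbuf[1]);
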
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Testcase: igt/gem_exec_fence/parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c            |  1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 25 +++++++++++++++++++++-
 include/uapi/drm/i915_drm.h                | 17 ++++++++++++++-
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9a0fa3b21e9d..e7fdd9926266 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -421,6 +421,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_CAPTURE:
 	case I915_PARAM_HAS_EXEC_BATCH_FIRST:
 	case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
+	case I915_PARAM_HAS_EXEC_SUBMIT_FENCE:
 		/* For the time being all of these are always true;
 		 * if some supported hardware does not have one of these
 		 * features this value needs to be provided from
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 66b3921cc8bd..3e9a6892a7a9 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2281,6 +2281,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 {
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
+	struct dma_fence *exec_fence = NULL;
 	struct sync_file *out_fence = NULL;
 	intel_wakeref_t wakeref;
 	int out_fence_fd = -1;
@@ -2324,11 +2325,24 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			return -EINVAL;
 	}
 
+	if (args->flags & I915_EXEC_FENCE_SUBMIT) {
+		if (in_fence) {
+			err = -EINVAL;
+			goto err_in_fence;
+		}
+
+		exec_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
+		if (!exec_fence) {
+			err = -EINVAL;
+			goto err_in_fence;
+		}
+	}
+
 	if (args->flags & I915_EXEC_FENCE_OUT) {
 		out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
 		if (out_fence_fd < 0) {
 			err = out_fence_fd;
-			goto err_in_fence;
+			goto err_exec_fence;
 		}
 	}
 
@@ -2460,6 +2474,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			goto err_request;
 	}
 
+	if (exec_fence) {
+		err = i915_request_await_execution(eb.request, exec_fence,
+						   eb.engine->bond_execute);
+		if (err < 0)
+			goto err_request;
+	}
+
 	if (fences) {
 		err = await_fence_array(&eb, fences);
 		if (err)
@@ -2520,6 +2541,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_out_fence:
 	if (out_fence_fd != -1)
 		put_unused_fd(out_fence_fd);
+err_exec_fence:
+	dma_fence_put(exec_fence);
 err_in_fence:
 	dma_fence_put(in_fence);
 	return err;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 0d9ca4fb9edb..08f680dd2b1c 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -593,6 +593,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT	52
 
+/*
+ * Query whether DRM_I915_GEM_EXECBUFFER2 supports coordination of parallel
+ * execution through use of explicit fence support.
+ * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
+ */
+#define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1115,7 +1121,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+/*
+ * Setting I915_EXEC_FENCE_SUBMIT implies that lower_32_bits(rsvd2) represent
+ * a sync_file fd to wait upon (in a nonblocking manner) prior to executing
+ * the batch.
+ *
+ * Returns -EINVAL if the sync_file fd cannot be found.
+ */
+#define I915_EXEC_FENCE_SUBMIT		(1 << 20)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH 02/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
  2019-03-18  9:51 ` [PATCH 02/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h Chris Wilson
@ 2019-03-18 10:21   ` Tvrtko Ursulin
  2019-03-18 10:40     ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 10:21 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 09:51, Chris Wilson wrote:
> We want to use intel_engine_mask_t inside i915_request.h, which means
> extracting it from the general header file mess and placing it inside a
> types.h. A knock on effect is that the compiler wants to warn about
> type-contraction of ALL_ENGINES into intel_engine_mask_t, so prepare
> for the worst.

We can't do:

#define ALL_ENGINES ((intel_engine_mask_t)-1)

to avoid this warning and a lot of the churn?

Regards,

Tvrtko

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile                 |  1 +
>   drivers/gpu/drm/i915/gvt/gvt.h                |  2 +-
>   drivers/gpu/drm/i915/gvt/handlers.c           |  2 +-
>   drivers/gpu/drm/i915/gvt/scheduler.c          |  2 +-
>   drivers/gpu/drm/i915/gvt/vgpu.c               |  6 +-
>   drivers/gpu/drm/i915/i915_drv.h               |  1 -
>   drivers/gpu/drm/i915/i915_reset.c             | 30 +++---
>   drivers/gpu/drm/i915/i915_reset.h             |  6 +-
>   drivers/gpu/drm/i915/i915_scheduler.h         | 86 +---------------
>   drivers/gpu/drm/i915/i915_scheduler_types.h   | 98 +++++++++++++++++++
>   drivers/gpu/drm/i915/i915_timeline.h          |  1 +
>   drivers/gpu/drm/i915/i915_timeline_types.h    |  3 +-
>   drivers/gpu/drm/i915/intel_device_info.h      |  3 +-
>   drivers/gpu/drm/i915/intel_engine_types.h     |  9 +-
>   .../gpu/drm/i915/selftests/i915_gem_context.c |  4 +-
>   .../gpu/drm/i915/selftests/intel_hangcheck.c  |  2 +-
>   .../test_i915_scheduler_types_standalone.c    |  7 ++
>   17 files changed, 147 insertions(+), 116 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/i915_scheduler_types.h
>   create mode 100644 drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 68fecf355471..197b081769b5 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -60,6 +60,7 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
>   i915-$(CONFIG_DRM_I915_WERROR) += \
>   	test_i915_active_types_standalone.o \
>   	test_i915_gem_context_types_standalone.o \
> +	test_i915_scheduler_types_standalone.o \
>   	test_i915_timeline_types_standalone.o \
>   	test_intel_context_types_standalone.o \
>   	test_intel_engine_types_standalone.o \
> diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
> index 8bce09de4b82..c7f373566ecd 100644
> --- a/drivers/gpu/drm/i915/gvt/gvt.h
> +++ b/drivers/gpu/drm/i915/gvt/gvt.h
> @@ -488,7 +488,7 @@ struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
>   void intel_gvt_destroy_vgpu(struct intel_vgpu *vgpu);
>   void intel_gvt_release_vgpu(struct intel_vgpu *vgpu);
>   void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
> -				 unsigned int engine_mask);
> +				 unsigned long engine_mask);
>   void intel_gvt_reset_vgpu(struct intel_vgpu *vgpu);
>   void intel_gvt_activate_vgpu(struct intel_vgpu *vgpu);
>   void intel_gvt_deactivate_vgpu(struct intel_vgpu *vgpu);
> diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
> index b596cb42e24e..a0d981547c9e 100644
> --- a/drivers/gpu/drm/i915/gvt/handlers.c
> +++ b/drivers/gpu/drm/i915/gvt/handlers.c
> @@ -311,7 +311,7 @@ static int mul_force_wake_write(struct intel_vgpu *vgpu,
>   static int gdrst_mmio_write(struct intel_vgpu *vgpu, unsigned int offset,
>   			    void *p_data, unsigned int bytes)
>   {
> -	unsigned int engine_mask = 0;
> +	unsigned long engine_mask = 0;
>   	u32 data;
>   
>   	write_vreg(vgpu, offset, p_data, bytes);
> diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
> index 7550e09939ae..56a9530b4e06 100644
> --- a/drivers/gpu/drm/i915/gvt/scheduler.c
> +++ b/drivers/gpu/drm/i915/gvt/scheduler.c
> @@ -1137,7 +1137,7 @@ void intel_vgpu_clean_submission(struct intel_vgpu *vgpu)
>    *
>    */
>   void intel_vgpu_reset_submission(struct intel_vgpu *vgpu,
> -		unsigned long engine_mask)
> +				 unsigned long engine_mask)
>   {
>   	struct intel_vgpu_submission *s = &vgpu->submission;
>   
> diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
> index 314e40121e47..e734c21e7d06 100644
> --- a/drivers/gpu/drm/i915/gvt/vgpu.c
> +++ b/drivers/gpu/drm/i915/gvt/vgpu.c
> @@ -526,14 +526,14 @@ struct intel_vgpu *intel_gvt_create_vgpu(struct intel_gvt *gvt,
>    * GPU engines. For FLR, engine_mask is ignored.
>    */
>   void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
> -				 unsigned int engine_mask)
> +				 unsigned long engine_mask)
>   {
>   	struct intel_gvt *gvt = vgpu->gvt;
>   	struct intel_gvt_workload_scheduler *scheduler = &gvt->scheduler;
> -	unsigned int resetting_eng = dmlr ? ALL_ENGINES : engine_mask;
> +	unsigned long resetting_eng = dmlr ? ALL_ENGINES : engine_mask;
>   
>   	gvt_dbg_core("------------------------------------------\n");
> -	gvt_dbg_core("resseting vgpu%d, dmlr %d, engine_mask %08x\n",
> +	gvt_dbg_core("resseting vgpu%d, dmlr %d, engine_mask %08lx\n",
>   		     vgpu->id, dmlr, engine_mask);
>   
>   	vgpu->resetting_eng = resetting_eng;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 395aa9d5ba02..86080a6e0f45 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2432,7 +2432,6 @@ static inline unsigned int i915_sg_segment_size(void)
>   #define IS_GEN9_LP(dev_priv)	(IS_GEN(dev_priv, 9) && IS_LP(dev_priv))
>   #define IS_GEN9_BC(dev_priv)	(IS_GEN(dev_priv, 9) && !IS_LP(dev_priv))
>   
> -#define ALL_ENGINES	(~0u)
>   #define HAS_ENGINE(dev_priv, id) (INTEL_INFO(dev_priv)->engine_mask & BIT(id))
>   
>   #define HAS_LLC(dev_priv)	(INTEL_INFO(dev_priv)->has_llc)
> diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
> index 861fe083e383..b8daec7ddc06 100644
> --- a/drivers/gpu/drm/i915/i915_reset.c
> +++ b/drivers/gpu/drm/i915/i915_reset.c
> @@ -144,7 +144,7 @@ static void gen3_stop_engine(struct intel_engine_cs *engine)
>   }
>   
>   static void i915_stop_engines(struct drm_i915_private *i915,
> -			      unsigned int engine_mask)
> +			      unsigned long engine_mask)
>   {
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> @@ -165,7 +165,7 @@ static bool i915_in_reset(struct pci_dev *pdev)
>   }
>   
>   static int i915_do_reset(struct drm_i915_private *i915,
> -			 unsigned int engine_mask,
> +			 unsigned long engine_mask,
>   			 unsigned int retry)
>   {
>   	struct pci_dev *pdev = i915->drm.pdev;
> @@ -194,7 +194,7 @@ static bool g4x_reset_complete(struct pci_dev *pdev)
>   }
>   
>   static int g33_do_reset(struct drm_i915_private *i915,
> -			unsigned int engine_mask,
> +			unsigned long engine_mask,
>   			unsigned int retry)
>   {
>   	struct pci_dev *pdev = i915->drm.pdev;
> @@ -204,7 +204,7 @@ static int g33_do_reset(struct drm_i915_private *i915,
>   }
>   
>   static int g4x_do_reset(struct drm_i915_private *dev_priv,
> -			unsigned int engine_mask,
> +			unsigned long engine_mask,
>   			unsigned int retry)
>   {
>   	struct pci_dev *pdev = dev_priv->drm.pdev;
> @@ -242,7 +242,7 @@ static int g4x_do_reset(struct drm_i915_private *dev_priv,
>   }
>   
>   static int ironlake_do_reset(struct drm_i915_private *dev_priv,
> -			     unsigned int engine_mask,
> +			     unsigned long engine_mask,
>   			     unsigned int retry)
>   {
>   	int ret;
> @@ -299,7 +299,7 @@ static int gen6_hw_domain_reset(struct drm_i915_private *dev_priv,
>   }
>   
>   static int gen6_reset_engines(struct drm_i915_private *i915,
> -			      unsigned int engine_mask,
> +			      unsigned long engine_mask,
>   			      unsigned int retry)
>   {
>   	struct intel_engine_cs *engine;
> @@ -425,7 +425,7 @@ static void gen11_unlock_sfc(struct drm_i915_private *dev_priv,
>   }
>   
>   static int gen11_reset_engines(struct drm_i915_private *i915,
> -			       unsigned int engine_mask,
> +			       unsigned long engine_mask,
>   			       unsigned int retry)
>   {
>   	const u32 hw_engine_mask[] = {
> @@ -492,7 +492,7 @@ static void gen8_engine_reset_cancel(struct intel_engine_cs *engine)
>   }
>   
>   static int gen8_reset_engines(struct drm_i915_private *i915,
> -			      unsigned int engine_mask,
> +			      unsigned long engine_mask,
>   			      unsigned int retry)
>   {
>   	struct intel_engine_cs *engine;
> @@ -533,7 +533,7 @@ static int gen8_reset_engines(struct drm_i915_private *i915,
>   }
>   
>   typedef int (*reset_func)(struct drm_i915_private *,
> -			  unsigned int engine_mask,
> +			  unsigned long engine_mask,
>   			  unsigned int retry);
>   
>   static reset_func intel_get_gpu_reset(struct drm_i915_private *i915)
> @@ -554,7 +554,7 @@ static reset_func intel_get_gpu_reset(struct drm_i915_private *i915)
>   		return NULL;
>   }
>   
> -int intel_gpu_reset(struct drm_i915_private *i915, unsigned int engine_mask)
> +int intel_gpu_reset(struct drm_i915_private *i915, unsigned long engine_mask)
>   {
>   	const int retries = engine_mask == ALL_ENGINES ? RESET_MAX_RETRIES : 1;
>   	reset_func reset;
> @@ -588,7 +588,7 @@ int intel_gpu_reset(struct drm_i915_private *i915, unsigned int engine_mask)
>   		if (retry)
>   			i915_stop_engines(i915, engine_mask);
>   
> -		GEM_TRACE("engine_mask=%x\n", engine_mask);
> +		GEM_TRACE("engine_mask=%lx\n", engine_mask);
>   		preempt_disable();
>   		ret = reset(i915, engine_mask, retry);
>   		preempt_enable();
> @@ -688,7 +688,7 @@ static void gt_revoke(struct drm_i915_private *i915)
>   	revoke_mmaps(i915);
>   }
>   
> -static int gt_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
> +static int gt_reset(struct drm_i915_private *i915, unsigned long stalled_mask)
>   {
>   	struct intel_engine_cs *engine;
>   	enum intel_engine_id id;
> @@ -945,7 +945,7 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
>   	return result;
>   }
>   
> -static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
> +static int do_reset(struct drm_i915_private *i915, unsigned long stalled_mask)
>   {
>   	int err, i;
>   
> @@ -980,7 +980,7 @@ static int do_reset(struct drm_i915_private *i915, unsigned int stalled_mask)
>    *   - re-init display
>    */
>   void i915_reset(struct drm_i915_private *i915,
> -		unsigned int stalled_mask,
> +		unsigned long stalled_mask,
>   		const char *reason)
>   {
>   	struct i915_gpu_error *error = &i915->gpu_error;
> @@ -1222,7 +1222,7 @@ void i915_clear_error_registers(struct drm_i915_private *dev_priv)
>    * of a ring dump etc.).
>    */
>   void i915_handle_error(struct drm_i915_private *i915,
> -		       u32 engine_mask,
> +		       unsigned long engine_mask,
>   		       unsigned long flags,
>   		       const char *fmt, ...)
>   {
> diff --git a/drivers/gpu/drm/i915/i915_reset.h b/drivers/gpu/drm/i915/i915_reset.h
> index 16f2389f656f..6d2bf7e81ac4 100644
> --- a/drivers/gpu/drm/i915/i915_reset.h
> +++ b/drivers/gpu/drm/i915/i915_reset.h
> @@ -17,7 +17,7 @@ struct intel_guc;
>   
>   __printf(4, 5)
>   void i915_handle_error(struct drm_i915_private *i915,
> -		       u32 engine_mask,
> +		       unsigned long engine_mask,
>   		       unsigned long flags,
>   		       const char *fmt, ...);
>   #define I915_ERROR_CAPTURE BIT(0)
> @@ -25,7 +25,7 @@ void i915_handle_error(struct drm_i915_private *i915,
>   void i915_clear_error_registers(struct drm_i915_private *i915);
>   
>   void i915_reset(struct drm_i915_private *i915,
> -		unsigned int stalled_mask,
> +		unsigned long stalled_mask,
>   		const char *reason);
>   int i915_reset_engine(struct intel_engine_cs *engine,
>   		      const char *reason);
> @@ -41,7 +41,7 @@ int i915_terminally_wedged(struct drm_i915_private *i915);
>   bool intel_has_gpu_reset(struct drm_i915_private *i915);
>   bool intel_has_reset_engine(struct drm_i915_private *i915);
>   
> -int intel_gpu_reset(struct drm_i915_private *i915, u32 engine_mask);
> +int intel_gpu_reset(struct drm_i915_private *i915, unsigned long engine_mask);
>   
>   int intel_reset_guc(struct drm_i915_private *i915);
>   
> diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
> index 9a1d257f3d6e..07d243acf553 100644
> --- a/drivers/gpu/drm/i915/i915_scheduler.h
> +++ b/drivers/gpu/drm/i915/i915_scheduler.h
> @@ -8,92 +8,10 @@
>   #define _I915_SCHEDULER_H_
>   
>   #include <linux/bitops.h>
> +#include <linux/list.h>
>   #include <linux/kernel.h>
>   
> -#include <uapi/drm/i915_drm.h>
> -
> -struct drm_i915_private;
> -struct i915_request;
> -struct intel_engine_cs;
> -
> -enum {
> -	I915_PRIORITY_MIN = I915_CONTEXT_MIN_USER_PRIORITY - 1,
> -	I915_PRIORITY_NORMAL = I915_CONTEXT_DEFAULT_PRIORITY,
> -	I915_PRIORITY_MAX = I915_CONTEXT_MAX_USER_PRIORITY + 1,
> -
> -	I915_PRIORITY_INVALID = INT_MIN
> -};
> -
> -#define I915_USER_PRIORITY_SHIFT 3
> -#define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
> -
> -#define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
> -#define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
> -
> -#define I915_PRIORITY_WAIT		((u8)BIT(0))
> -#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
> -#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
> -
> -#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
> -
> -struct i915_sched_attr {
> -	/**
> -	 * @priority: execution and service priority
> -	 *
> -	 * All clients are equal, but some are more equal than others!
> -	 *
> -	 * Requests from a context with a greater (more positive) value of
> -	 * @priority will be executed before those with a lower @priority
> -	 * value, forming a simple QoS.
> -	 *
> -	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
> -	 */
> -	int priority;
> -};
> -
> -/*
> - * "People assume that time is a strict progression of cause to effect, but
> - * actually, from a nonlinear, non-subjective viewpoint, it's more like a big
> - * ball of wibbly-wobbly, timey-wimey ... stuff." -The Doctor, 2015
> - *
> - * Requests exist in a complex web of interdependencies. Each request
> - * has to wait for some other request to complete before it is ready to be run
> - * (e.g. we have to wait until the pixels have been rendering into a texture
> - * before we can copy from it). We track the readiness of a request in terms
> - * of fences, but we also need to keep the dependency tree for the lifetime
> - * of the request (beyond the life of an individual fence). We use the tree
> - * at various points to reorder the requests whilst keeping the requests
> - * in order with respect to their various dependencies.
> - *
> - * There is no active component to the "scheduler". As we know the dependency
> - * DAG of each request, we are able to insert it into a sorted queue when it
> - * is ready, and are able to reorder its portion of the graph to accommodate
> - * dynamic priority changes.
> - */
> -struct i915_sched_node {
> -	struct list_head signalers_list; /* those before us, we depend upon */
> -	struct list_head waiters_list; /* those after us, they depend upon us */
> -	struct list_head link;
> -	struct i915_sched_attr attr;
> -	unsigned int flags;
> -#define I915_SCHED_HAS_SEMAPHORE	BIT(0)
> -};
> -
> -struct i915_dependency {
> -	struct i915_sched_node *signaler;
> -	struct list_head signal_link;
> -	struct list_head wait_link;
> -	struct list_head dfs_link;
> -	unsigned long flags;
> -#define I915_DEPENDENCY_ALLOC BIT(0)
> -};
> -
> -struct i915_priolist {
> -	struct list_head requests[I915_PRIORITY_COUNT];
> -	struct rb_node node;
> -	unsigned long used;
> -	int priority;
> -};
> +#include "i915_scheduler_types.h"
>   
>   #define priolist_for_each_request(it, plist, idx) \
>   	for (idx = 0; idx < ARRAY_SIZE((plist)->requests); idx++) \
> diff --git a/drivers/gpu/drm/i915/i915_scheduler_types.h b/drivers/gpu/drm/i915/i915_scheduler_types.h
> new file mode 100644
> index 000000000000..5c94b3eb5c81
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_scheduler_types.h
> @@ -0,0 +1,98 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2018 Intel Corporation
> + */
> +
> +#ifndef _I915_SCHEDULER_TYPES_H_
> +#define _I915_SCHEDULER_TYPES_H_
> +
> +#include <linux/list.h>
> +#include <linux/rbtree.h>
> +
> +#include <uapi/drm/i915_drm.h>
> +
> +struct drm_i915_private;
> +struct i915_request;
> +struct intel_engine_cs;
> +
> +enum {
> +	I915_PRIORITY_MIN = I915_CONTEXT_MIN_USER_PRIORITY - 1,
> +	I915_PRIORITY_NORMAL = I915_CONTEXT_DEFAULT_PRIORITY,
> +	I915_PRIORITY_MAX = I915_CONTEXT_MAX_USER_PRIORITY + 1,
> +
> +	I915_PRIORITY_INVALID = INT_MIN
> +};
> +
> +#define I915_USER_PRIORITY_SHIFT 3
> +#define I915_USER_PRIORITY(x) ((x) << I915_USER_PRIORITY_SHIFT)
> +
> +#define I915_PRIORITY_COUNT BIT(I915_USER_PRIORITY_SHIFT)
> +#define I915_PRIORITY_MASK (I915_PRIORITY_COUNT - 1)
> +
> +#define I915_PRIORITY_WAIT		((u8)BIT(0))
> +#define I915_PRIORITY_NEWCLIENT		((u8)BIT(1))
> +#define I915_PRIORITY_NOSEMAPHORE	((u8)BIT(2))
> +
> +#define __NO_PREEMPTION (I915_PRIORITY_WAIT)
> +
> +struct i915_sched_attr {
> +	/**
> +	 * @priority: execution and service priority
> +	 *
> +	 * All clients are equal, but some are more equal than others!
> +	 *
> +	 * Requests from a context with a greater (more positive) value of
> +	 * @priority will be executed before those with a lower @priority
> +	 * value, forming a simple QoS.
> +	 *
> +	 * The &drm_i915_private.kernel_context is assigned the lowest priority.
> +	 */
> +	int priority;
> +};
> +
> +/*
> + * "People assume that time is a strict progression of cause to effect, but
> + * actually, from a nonlinear, non-subjective viewpoint, it's more like a big
> + * ball of wibbly-wobbly, timey-wimey ... stuff." -The Doctor, 2015
> + *
> + * Requests exist in a complex web of interdependencies. Each request
> + * has to wait for some other request to complete before it is ready to be run
> + * (e.g. we have to wait until the pixels have been rendering into a texture
> + * before we can copy from it). We track the readiness of a request in terms
> + * of fences, but we also need to keep the dependency tree for the lifetime
> + * of the request (beyond the life of an individual fence). We use the tree
> + * at various points to reorder the requests whilst keeping the requests
> + * in order with respect to their various dependencies.
> + *
> + * There is no active component to the "scheduler". As we know the dependency
> + * DAG of each request, we are able to insert it into a sorted queue when it
> + * is ready, and are able to reorder its portion of the graph to accommodate
> + * dynamic priority changes.
> + */
> +struct i915_sched_node {
> +	struct list_head signalers_list; /* those before us, we depend upon */
> +	struct list_head waiters_list; /* those after us, they depend upon us */
> +	struct list_head link;
> +	struct i915_sched_attr attr;
> +	unsigned int flags;
> +#define I915_SCHED_HAS_SEMAPHORE	BIT(0)
> +};
> +
> +struct i915_dependency {
> +	struct i915_sched_node *signaler;
> +	struct list_head signal_link;
> +	struct list_head wait_link;
> +	struct list_head dfs_link;
> +	unsigned long flags;
> +#define I915_DEPENDENCY_ALLOC BIT(0)
> +};
> +
> +struct i915_priolist {
> +	struct list_head requests[I915_PRIORITY_COUNT];
> +	struct rb_node node;
> +	unsigned long used;
> +	int priority;
> +};
> +
> +#endif /* _I915_SCHEDULER_TYPES_H_ */
> diff --git a/drivers/gpu/drm/i915/i915_timeline.h b/drivers/gpu/drm/i915/i915_timeline.h
> index 9126c8206490..454aa72aee18 100644
> --- a/drivers/gpu/drm/i915/i915_timeline.h
> +++ b/drivers/gpu/drm/i915/i915_timeline.h
> @@ -27,6 +27,7 @@
>   
>   #include <linux/lockdep.h>
>   
> +#include "i915_active.h"
>   #include "i915_syncmap.h"
>   #include "i915_timeline_types.h"
>   
> diff --git a/drivers/gpu/drm/i915/i915_timeline_types.h b/drivers/gpu/drm/i915/i915_timeline_types.h
> index 8ff146dc05ba..d42053544d7c 100644
> --- a/drivers/gpu/drm/i915/i915_timeline_types.h
> +++ b/drivers/gpu/drm/i915/i915_timeline_types.h
> @@ -9,9 +9,10 @@
>   
>   #include <linux/list.h>
>   #include <linux/kref.h>
> +#include <linux/mutex.h>
>   #include <linux/types.h>
>   
> -#include "i915_active.h"
> +#include "i915_active_types.h"
>   
>   struct drm_i915_private;
>   struct i915_vma;
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h
> index 6234570a9b17..d20c33a10c11 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -27,6 +27,7 @@
>   
>   #include <uapi/drm/i915_drm.h>
>   
> +#include "intel_engine_types.h"
>   #include "intel_display.h"
>   
>   struct drm_printer;
> @@ -149,8 +150,6 @@ struct sseu_dev_info {
>   	u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES];
>   };
>   
> -typedef u8 intel_engine_mask_t;
> -
>   struct intel_device_info {
>   	u16 gen_mask;
>   
> diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
> index b0aa1f0d4e47..79a166b9a81b 100644
> --- a/drivers/gpu/drm/i915/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/intel_engine_types.h
> @@ -12,8 +12,10 @@
>   #include <linux/list.h>
>   #include <linux/types.h>
>   
> +#include "i915_gem.h"
> +#include "i915_scheduler_types.h"
> +#include "i915_selftest.h"
>   #include "i915_timeline_types.h"
> -#include "intel_device_info.h"
>   #include "intel_workarounds_types.h"
>   
>   #include "i915_gem_batch_pool.h"
> @@ -24,11 +26,16 @@
>   
>   #define I915_CMD_HASH_ORDER 9
>   
> +struct dma_fence;
>   struct drm_i915_reg_table;
>   struct i915_gem_context;
>   struct i915_request;
>   struct i915_sched_attr;
>   
> +typedef u8 intel_engine_mask_t;
> +#define ALL_ENGINES	(~0ul)
> +#define INIT_ALL_ENGINES(x) (x) = (intel_engine_mask_t)(ALL_ENGINES)
> +
>   struct intel_hw_status_page {
>   	struct i915_vma *vma;
>   	u32 *addr;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index 0759a90c0d5a..f18c78ebff07 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -1466,7 +1466,7 @@ static int igt_vm_isolation(void *arg)
>   }
>   
>   static __maybe_unused const char *
> -__engine_name(struct drm_i915_private *i915, unsigned int engines)
> +__engine_name(struct drm_i915_private *i915, unsigned long engines)
>   {
>   	struct intel_engine_cs *engine;
>   	unsigned int tmp;
> @@ -1482,7 +1482,7 @@ __engine_name(struct drm_i915_private *i915, unsigned int engines)
>   
>   static int __igt_switch_to_kernel_context(struct drm_i915_private *i915,
>   					  struct i915_gem_context *ctx,
> -					  unsigned int engines)
> +					  unsigned long engines)
>   {
>   	struct intel_engine_cs *engine;
>   	unsigned int tmp;
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index 76b4fa150f2e..05a7b9b9a1de 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -1124,7 +1124,7 @@ static int igt_reset_engines(void *arg)
>   	return 0;
>   }
>   
> -static u32 fake_hangcheck(struct drm_i915_private *i915, u32 mask)
> +static u32 fake_hangcheck(struct drm_i915_private *i915, unsigned long mask)
>   {
>   	u32 count = i915_reset_count(&i915->gpu_error);
>   
> diff --git a/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c b/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c
> new file mode 100644
> index 000000000000..8afa2c3719fb
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c
> @@ -0,0 +1,7 @@
> +/*
> + * SPDX-License-Identifier: MIT
> + *
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_scheduler_types.h"
> 

* Re: [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring
  2019-03-18  9:51 ` [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring Chris Wilson
@ 2019-03-18 10:31   ` Tvrtko Ursulin
  2019-03-18 10:37     ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 10:31 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 09:51, Chris Wilson wrote:
> As the final request on a ring may hold the reference to this ring (via
> retiring the last pinned context), we may find ourselves chasing a
> dangling pointer on completion of the list.
> 
> A quick solution is to hold a reference to the ring itself as we retire
> along it so that we only free it after we stop dereferencing it.

Is there a guilty commit to reference as Fixes: ?


> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_request.c          |  6 +++++-
>   drivers/gpu/drm/i915/intel_engine_types.h    |  2 ++
>   drivers/gpu/drm/i915/intel_lrc.c             |  4 ++--
>   drivers/gpu/drm/i915/intel_ringbuffer.c      |  9 +++++----
>   drivers/gpu/drm/i915/intel_ringbuffer.h      | 13 ++++++++++++-
>   drivers/gpu/drm/i915/selftests/mock_engine.c |  1 +
>   6 files changed, 27 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 9533a85cb0b3..0a3d94517d0a 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915)
>   	if (!i915->gt.active_requests)
>   		return;
>   
> -	list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
> +	list_for_each_entry_safe(ring, tmp,
> +				 &i915->gt.active_rings, active_link) {
> +		intel_ring_get(ring); /* last rq holds reference! */
>   		ring_retire_requests(ring);
> +		intel_ring_put(ring);
> +	}

Where does it chase a dangling pointer? It used the safe iterator already.

>   }
>   
>   #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h
> index 79a166b9a81b..549fdfca17aa 100644
> --- a/drivers/gpu/drm/i915/intel_engine_types.h
> +++ b/drivers/gpu/drm/i915/intel_engine_types.h
> @@ -9,6 +9,7 @@
>   
>   #include <linux/hashtable.h>
>   #include <linux/irq_work.h>
> +#include <linux/kref.h>
>   #include <linux/list.h>
>   #include <linux/types.h>
>   
> @@ -58,6 +59,7 @@ struct intel_engine_hangcheck {
>   };
>   
>   struct intel_ring {
> +	struct kref ref;
>   	struct i915_vma *vma;
>   	void *vaddr;
>   
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index aa50f03ba812..d3f1fe06d013 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1236,7 +1236,7 @@ static void execlists_submit_request(struct i915_request *request)
>   
>   static void __execlists_context_fini(struct intel_context *ce)
>   {
> -	intel_ring_free(ce->ring);
> +	intel_ring_put(ce->ring);
>   
>   	GEM_BUG_ON(i915_gem_object_is_active(ce->state->obj));
>   	i915_gem_object_put(ce->state->obj);
> @@ -2867,7 +2867,7 @@ static int execlists_context_deferred_alloc(struct intel_context *ce,
>   	return 0;
>   
>   error_ring_free:
> -	intel_ring_free(ring);
> +	intel_ring_put(ring);
>   error_deref_obj:
>   	i915_gem_object_put(ctx_obj);
>   	return ret;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 746fe570466c..45a54fadc482 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1302,6 +1302,7 @@ intel_engine_create_ring(struct intel_engine_cs *engine,
>   	if (!ring)
>   		return ERR_PTR(-ENOMEM);
>   
> +	kref_init(&ring->ref);
>   	INIT_LIST_HEAD(&ring->request_list);
>   	ring->timeline = i915_timeline_get(timeline);
>   
> @@ -1326,9 +1327,9 @@ intel_engine_create_ring(struct intel_engine_cs *engine,
>   	return ring;
>   }
>   
> -void
> -intel_ring_free(struct intel_ring *ring)
> +void intel_ring_free(struct kref *ref)
>   {
> +	struct intel_ring *ring = container_of(ref, typeof(*ring), ref);
>   	struct drm_i915_gem_object *obj = ring->vma->obj;
>   
>   	i915_vma_close(ring->vma);
> @@ -1571,7 +1572,7 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
>   err_unpin:
>   	intel_ring_unpin(ring);
>   err_ring:
> -	intel_ring_free(ring);
> +	intel_ring_put(ring);
>   err:
>   	intel_engine_cleanup_common(engine);
>   	return err;
> @@ -1585,7 +1586,7 @@ void intel_engine_cleanup(struct intel_engine_cs *engine)
>   		(I915_READ_MODE(engine) & MODE_IDLE) == 0);
>   
>   	intel_ring_unpin(engine->buffer);
> -	intel_ring_free(engine->buffer);
> +	intel_ring_put(engine->buffer);
>   
>   	if (engine->cleanup)
>   		engine->cleanup(engine);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index e612bdca9fd9..a57489fcb302 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -231,7 +231,18 @@ int intel_ring_pin(struct intel_ring *ring);
>   void intel_ring_reset(struct intel_ring *ring, u32 tail);
>   unsigned int intel_ring_update_space(struct intel_ring *ring);
>   void intel_ring_unpin(struct intel_ring *ring);
> -void intel_ring_free(struct intel_ring *ring);
> +void intel_ring_free(struct kref *ref);
> +
> +static inline struct intel_ring *intel_ring_get(struct intel_ring *ring)
> +{
> +	kref_get(&ring->ref);
> +	return ring;
> +}
> +
> +static inline void intel_ring_put(struct intel_ring *ring)
> +{
> +	kref_put(&ring->ref, intel_ring_free);
> +}
>   
>   void intel_engine_stop(struct intel_engine_cs *engine);
>   void intel_engine_cleanup(struct intel_engine_cs *engine);
> diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
> index f6d120e05ee4..881450c694e9 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_engine.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
> @@ -57,6 +57,7 @@ static struct intel_ring *mock_ring(struct intel_engine_cs *engine)
>   		return NULL;
>   	}
>   
> +	kref_init(&ring->base.ref);
>   	ring->base.size = sz;
>   	ring->base.effective_size = sz;
>   	ring->base.vaddr = (void *)(ring + 1);
> 

Regards,

Tvrtko

* Re: [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring
  2019-03-18 10:31   ` Tvrtko Ursulin
@ 2019-03-18 10:37     ` Chris Wilson
  2019-03-18 10:46       ` Tvrtko Ursulin
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 10:37 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 10:31:57)
> 
> On 18/03/2019 09:51, Chris Wilson wrote:
> > As the final request on a ring may hold the reference to this ring (via
> > retiring the last pinned context), we may find ourselves chasing a
> > dangling pointer on completion of the list.
> > 
> > A quick solution is to hold a reference to the ring itself as we retire
> > along it so that we only free it after we stop dereferencing it.
> 
> Is there a guilty commit to reference as Fixes: ?

It only becomes a problem with veng as we gain an immediate free path,
whereas at the moment, context frees are deferred until they can acquire
the struct_mutex. We cannot hit this path at the moment, but that we had
to use the safe iterator implies that we were aware that the ring itself
could disappear. If you wanted to pin it on something,

References: b887d6154624 ("drm/i915: Retire requests along rings")

> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/i915_request.c          |  6 +++++-
> >   drivers/gpu/drm/i915/intel_engine_types.h    |  2 ++
> >   drivers/gpu/drm/i915/intel_lrc.c             |  4 ++--
> >   drivers/gpu/drm/i915/intel_ringbuffer.c      |  9 +++++----
> >   drivers/gpu/drm/i915/intel_ringbuffer.h      | 13 ++++++++++++-
> >   drivers/gpu/drm/i915/selftests/mock_engine.c |  1 +
> >   6 files changed, 27 insertions(+), 8 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index 9533a85cb0b3..0a3d94517d0a 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915)
> >       if (!i915->gt.active_requests)
> >               return;
> >   
> > -     list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
> > +     list_for_each_entry_safe(ring, tmp,
> > +                              &i915->gt.active_rings, active_link) {
> > +             intel_ring_get(ring); /* last rq holds reference! */
> >               ring_retire_requests(ring);
> > +             intel_ring_put(ring);
> > +     }
> 
> Where does it chase a dangling pointer? It used the safe iterator already.

Inside ring_retire_requests(); the use of _safe here actually implies we
met this problem already :)
-Chris

* Re: [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link
  2019-03-18  9:51 ` [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link Chris Wilson
@ 2019-03-18 10:39   ` Tvrtko Ursulin
  2019-03-18 10:45     ` Chris Wilson
  2019-03-18 10:54   ` Chris Wilson
  1 sibling, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 10:39 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 09:51, Chris Wilson wrote:
> On unpinning the intel_context, we remove it from the active list
> inside the GEM context. This list is supposed to be guarded by the GEM
> context mutex, so remember to take it!

Fixes: ?

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_context.c         | 15 +++++++++++----
>   drivers/gpu/drm/i915/intel_lrc.c             |  3 ---
>   drivers/gpu/drm/i915/intel_ringbuffer.c      |  3 ---
>   drivers/gpu/drm/i915/selftests/mock_engine.c |  2 --
>   4 files changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
> index 5a16c9bb2778..0ab894a058f6 100644
> --- a/drivers/gpu/drm/i915/intel_context.c
> +++ b/drivers/gpu/drm/i915/intel_context.c
> @@ -165,13 +165,13 @@ intel_context_pin(struct i915_gem_context *ctx,
>   		if (err)
>   			goto err;
>   
> +		i915_gem_context_get(ctx);
> +		GEM_BUG_ON(ce->gem_context != ctx);
> +
>   		mutex_lock(&ctx->mutex);
>   		list_add(&ce->active_link, &ctx->active_engines);
>   		mutex_unlock(&ctx->mutex);
>   
> -		i915_gem_context_get(ctx);
> -		GEM_BUG_ON(ce->gem_context != ctx);
> -
>   		smp_mb__before_atomic(); /* flush pin before it is visible */
>   	}
>   
> @@ -194,9 +194,16 @@ void intel_context_unpin(struct intel_context *ce)
>   	/* We may be called from inside intel_context_pin() to evict another */
>   	mutex_lock_nested(&ce->pin_mutex, SINGLE_DEPTH_NESTING);

Hm is the nested annotation and comment correct? Looking at it now, the 
allocations happen outside pin_mutex.

Regards,

Tvrtko

>   
> -	if (likely(atomic_dec_and_test(&ce->pin_count)))
> +	if (likely(atomic_dec_and_test(&ce->pin_count))) {
>   		ce->ops->unpin(ce);
>   
> +		mutex_lock(&ce->gem_context->mutex);
> +		list_del(&ce->active_link);
> +		mutex_unlock(&ce->gem_context->mutex);
> +
> +		i915_gem_context_put(ce->gem_context);
> +	}
> +
>   	mutex_unlock(&ce->pin_mutex);
>   }
>   
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index d3f1fe06d013..13f5545fc1d2 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1306,9 +1306,6 @@ static void execlists_context_unpin(struct intel_context *ce)
>   
>   	i915_gem_object_unpin_map(ce->state->obj);
>   	__context_unpin(ce->state);
> -
> -	list_del(&ce->active_link);
> -	i915_gem_context_put(ce->gem_context);
>   }
>   
>   static void
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 45a54fadc482..6d60bc258feb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1415,9 +1415,6 @@ static void ring_context_unpin(struct intel_context *ce)
>   {
>   	__context_unpin_ppgtt(ce->gem_context);
>   	__context_unpin(ce);
> -
> -	list_del(&ce->active_link);
> -	i915_gem_context_put(ce->gem_context);
>   }
>   
>   static struct i915_vma *
> diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
> index 881450c694e9..7641b74ada98 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_engine.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
> @@ -126,8 +126,6 @@ static void hw_delay_complete(struct timer_list *t)
>   static void mock_context_unpin(struct intel_context *ce)
>   {
>   	mock_timeline_unpin(ce->ring->timeline);
> -	list_del(&ce->active_link);
> -	i915_gem_context_put(ce->gem_context);
>   }
>   
>   static void mock_context_destroy(struct intel_context *ce)
> 

* Re: [PATCH 02/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
  2019-03-18 10:21   ` Tvrtko Ursulin
@ 2019-03-18 10:40     ` Chris Wilson
  2019-03-18 10:48       ` Tvrtko Ursulin
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 10:40 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 10:21:41)
> 
> On 18/03/2019 09:51, Chris Wilson wrote:
> > We want to use intel_engine_mask_t inside i915_request.h, which means
> > extracting it from the general header file mess and placing it inside a
> > types.h. A knock-on effect is that the compiler wants to warn about
> > type-contraction of ALL_ENGINES into intel_engine_mask_t, so prepare
> > for the worst.
> 
> We can't do:
> 
> #define ALL_ENGINES ((intel_engine_mask_t)-1)
> 
> to avoid this warning and a lot of the churn?

The churn is a lot of type fixing which needs to be done at some point.
I'm not keen on passing the contracted intel_engine_mask_t, and
ALL_ENGINES is not all bits set.
-Chris

* Re: [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link
  2019-03-18 10:39   ` Tvrtko Ursulin
@ 2019-03-18 10:45     ` Chris Wilson
  2019-03-18 10:50       ` Tvrtko Ursulin
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 10:45 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 10:39:44)
> 
> On 18/03/2019 09:51, Chris Wilson wrote:
> > On unpinning the intel_context, we remove it from the active list
> > inside the GEM context. This list is supposed to be guarded by the GEM
> > context mutex, so remember to take it!
> 
> Fixes: ?

It is not broken yet :)

> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/i915/intel_context.c         | 15 +++++++++++----
> >   drivers/gpu/drm/i915/intel_lrc.c             |  3 ---
> >   drivers/gpu/drm/i915/intel_ringbuffer.c      |  3 ---
> >   drivers/gpu/drm/i915/selftests/mock_engine.c |  2 --
> >   4 files changed, 11 insertions(+), 12 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
> > index 5a16c9bb2778..0ab894a058f6 100644
> > --- a/drivers/gpu/drm/i915/intel_context.c
> > +++ b/drivers/gpu/drm/i915/intel_context.c
> > @@ -165,13 +165,13 @@ intel_context_pin(struct i915_gem_context *ctx,
> >               if (err)
> >                       goto err;
> >   
> > +             i915_gem_context_get(ctx);
> > +             GEM_BUG_ON(ce->gem_context != ctx);
> > +
> >               mutex_lock(&ctx->mutex);
> >               list_add(&ce->active_link, &ctx->active_engines);
> >               mutex_unlock(&ctx->mutex);
> >   
> > -             i915_gem_context_get(ctx);
> > -             GEM_BUG_ON(ce->gem_context != ctx);
> > -
> >               smp_mb__before_atomic(); /* flush pin before it is visible */
> >       }
> >   
> > @@ -194,9 +194,16 @@ void intel_context_unpin(struct intel_context *ce)
> >       /* We may be called from inside intel_context_pin() to evict another */
> >       mutex_lock_nested(&ce->pin_mutex, SINGLE_DEPTH_NESTING);
> 
> Hm is the nested annotation and comment correct?

Yes. pin -> vma bind -> evict -> retire requests -> unpin someone else.
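
In lock terms, a minimal sketch of that recursion (hypothetical names, for
illustration only):

	/* intel_context_pin(outer): the pin_mutex of one context is held... */
	mutex_lock(&outer->pin_mutex);

	/* ...the vma bind needs space, so we evict and retire requests,
	 * which can land in intel_context_unpin() for another context.
	 * Same lock class, second acquisition: lockdep needs the nested
	 * annotation to avoid reporting a false deadlock. */
	mutex_lock_nested(&inner->pin_mutex, SINGLE_DEPTH_NESTING);
	mutex_unlock(&inner->pin_mutex);

	mutex_unlock(&outer->pin_mutex);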
-Chris

* Re: [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring
  2019-03-18 10:37     ` Chris Wilson
@ 2019-03-18 10:46       ` Tvrtko Ursulin
  2019-03-18 10:56         ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 10:46 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 10:37, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-03-18 10:31:57)
>>
>> On 18/03/2019 09:51, Chris Wilson wrote:
>>> As the final request on a ring may hold the reference to this ring (via
>>> retiring the last pinned context), we may find ourselves chasing a
>>> dangling pointer on completion of the list.
>>>
>>> A quick solution is to hold a reference to the ring itself as we retire
>>> along it so that we only free it after we stop dereferencing it.
>>
>> Is there a guilty commit to reference as Fixes: ?
> 
> It only becomes a problem with veng as we gain an immediate free path,
> whereas at the moment, context frees are deferred until they can acquire
> the struct_mutex. We cannot hit this path at the moment, but that we had
> to use the safe iterator implies that we were aware that the ring itself
> could disappear. If you wanted to pin it on something,
> 
> References: b887d6154624 ("drm/i915: Retire requests along rings")
> 
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/i915_request.c          |  6 +++++-
>>>    drivers/gpu/drm/i915/intel_engine_types.h    |  2 ++
>>>    drivers/gpu/drm/i915/intel_lrc.c             |  4 ++--
>>>    drivers/gpu/drm/i915/intel_ringbuffer.c      |  9 +++++----
>>>    drivers/gpu/drm/i915/intel_ringbuffer.h      | 13 ++++++++++++-
>>>    drivers/gpu/drm/i915/selftests/mock_engine.c |  1 +
>>>    6 files changed, 27 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>> index 9533a85cb0b3..0a3d94517d0a 100644
>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>> @@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915)
>>>        if (!i915->gt.active_requests)
>>>                return;
>>>    
>>> -     list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
>>> +     list_for_each_entry_safe(ring, tmp,
>>> +                              &i915->gt.active_rings, active_link) {
>>> +             intel_ring_get(ring); /* last rq holds reference! */
>>>                ring_retire_requests(ring);
>>> +             intel_ring_put(ring);
>>> +     }
>>
>> Where does it chase a dangling pointer? It used the safe iterator already.
> 
> Inside ring_retire_requests(); the use of _safe here actually implies we
> met this problem already :)

I get it, the issue is during the ring->request_list iteration inside 
ring_retire_requests. How about moving the ring pinning in there so it is 
clearer?
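
Something along these lines, perhaps (a rough, untested sketch reusing the
helpers this patch adds):

static void ring_retire_requests(struct intel_ring *ring)
{
	struct i915_request *rq, *rn;

	/* The final retired request may drop the last reference to the
	 * ring, so keep the ring alive for the duration of the walk. */
	intel_ring_get(ring);

	list_for_each_entry_safe(rq, rn, &ring->request_list, ring_link) {
		if (!i915_request_completed(rq))
			break;

		i915_request_retire(rq);
	}

	intel_ring_put(ring);
}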

Regards,

Tvrtko

* Re: [PATCH 02/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
  2019-03-18 10:40     ` Chris Wilson
@ 2019-03-18 10:48       ` Tvrtko Ursulin
  2019-03-18 13:57         ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 10:48 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 10:40, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-03-18 10:21:41)
>>
>> On 18/03/2019 09:51, Chris Wilson wrote:
>>> We want to use intel_engine_mask_t inside i915_request.h, which means
>>> extracting it from the general header file mess and placing it inside a
>>> types.h. A knock-on effect is that the compiler wants to warn about
>>> type-contraction of ALL_ENGINES into intel_engine_mask_t, so prepare
>>> for the worst.
>>
>> We can't do:
>>
>> #define ALL_ENGINES ((intel_engine_mask_t)-1)
>>
>> to avoid this warning and a lot of the churn?
> 
> The churn is a lot of type fixing which needs to be done at some point.
> I'm not keen on passing the contracted intel_engine_mask_t, and
> ALL_ENGINES is not all bits set.

It is all bits set in intel_engine_mask_t. ;) I forget what your 
argument was against using it in function arguments. Perhaps because it 
is pointless... I regret adding this typedef even more now.

Regards,

Tvrtko

* Re: [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link
  2019-03-18 10:45     ` Chris Wilson
@ 2019-03-18 10:50       ` Tvrtko Ursulin
  0 siblings, 0 replies; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 10:50 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 10:45, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-03-18 10:39:44)
>>
>> On 18/03/2019 09:51, Chris Wilson wrote:
>>> On unpinning the intel_context, we remove it from the active list
>>> inside the GEM context. This list is supposed to be guarded by the GEM
>>> context mutex, so remember to take it!
>>
>> Fixes: ?
> 
> It is not broken yet :)
> 
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/i915/intel_context.c         | 15 +++++++++++----
>>>    drivers/gpu/drm/i915/intel_lrc.c             |  3 ---
>>>    drivers/gpu/drm/i915/intel_ringbuffer.c      |  3 ---
>>>    drivers/gpu/drm/i915/selftests/mock_engine.c |  2 --
>>>    4 files changed, 11 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
>>> index 5a16c9bb2778..0ab894a058f6 100644
>>> --- a/drivers/gpu/drm/i915/intel_context.c
>>> +++ b/drivers/gpu/drm/i915/intel_context.c
>>> @@ -165,13 +165,13 @@ intel_context_pin(struct i915_gem_context *ctx,
>>>                if (err)
>>>                        goto err;
>>>    
>>> +             i915_gem_context_get(ctx);
>>> +             GEM_BUG_ON(ce->gem_context != ctx);
>>> +
>>>                mutex_lock(&ctx->mutex);
>>>                list_add(&ce->active_link, &ctx->active_engines);
>>>                mutex_unlock(&ctx->mutex);
>>>    
>>> -             i915_gem_context_get(ctx);
>>> -             GEM_BUG_ON(ce->gem_context != ctx);
>>> -
>>>                smp_mb__before_atomic(); /* flush pin before it is visible */
>>>        }
>>>    
>>> @@ -194,9 +194,16 @@ void intel_context_unpin(struct intel_context *ce)
>>>        /* We may be called from inside intel_context_pin() to evict another */
>>>        mutex_lock_nested(&ce->pin_mutex, SINGLE_DEPTH_NESTING);
>>
>> Hm is the nested annotation and comment correct?
> 
> Yes. pin -> vma bind -> evict -> retire requests -> unpin someone else.

Yep, the ops->pin.. I missed it.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


* Re: [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link
  2019-03-18  9:51 ` [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link Chris Wilson
  2019-03-18 10:39   ` Tvrtko Ursulin
@ 2019-03-18 10:54   ` Chris Wilson
  1 sibling, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 10:54 UTC (permalink / raw)
  To: intel-gfx

Quoting Chris Wilson (2019-03-18 09:51:47)
> On unpinning the intel_context, we remove it from the active list
> inside the GEM context. This list is supposed to be guarded by the GEM
> context mutex, so remember to take it!
> 

Fixes: 7e3d9a59410d ("drm/i915: Track active engines within a context")
-Chris

* Re: [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring
  2019-03-18 10:46       ` Tvrtko Ursulin
@ 2019-03-18 10:56         ` Chris Wilson
  2019-03-18 13:25           ` Tvrtko Ursulin
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 10:56 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 10:46:28)
> 
> On 18/03/2019 10:37, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-03-18 10:31:57)
> >>
> >> On 18/03/2019 09:51, Chris Wilson wrote:
> >>> As the final request on a ring may hold the reference to this ring (via
> >>> retiring the last pinned context), we may find ourselves chasing a
> >>> dangling pointer on completion of the list.
> >>>
> >>> A quick solution is to hold a reference to the ring itself as we retire
> >>> along it so that we only free it after we stop dereferencing it.
> >>
> >> Is there a guilty commit to reference as Fixes: ?
> > 
> > It only becomes a problem with veng as we gain an immediate free path,
> > whereas at the moment, context frees are deferred until they can acquire
> > the struct_mutex. We cannot hit this path at the moment, but that we had
> > to use the safe iterator implies that we were aware that the ring itself
> > could disappear. If you wanted to pin it on something,
> > 
> > References: b887d6154624 ("drm/i915: Retire requests along rings")
> > 
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_request.c          |  6 +++++-
> >>>    drivers/gpu/drm/i915/intel_engine_types.h    |  2 ++
> >>>    drivers/gpu/drm/i915/intel_lrc.c             |  4 ++--
> >>>    drivers/gpu/drm/i915/intel_ringbuffer.c      |  9 +++++----
> >>>    drivers/gpu/drm/i915/intel_ringbuffer.h      | 13 ++++++++++++-
> >>>    drivers/gpu/drm/i915/selftests/mock_engine.c |  1 +
> >>>    6 files changed, 27 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>> index 9533a85cb0b3..0a3d94517d0a 100644
> >>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>> @@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915)
> >>>        if (!i915->gt.active_requests)
> >>>                return;
> >>>    
> >>> -     list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
> >>> +     list_for_each_entry_safe(ring, tmp,
> >>> +                              &i915->gt.active_rings, active_link) {
> >>> +             intel_ring_get(ring); /* last rq holds reference! */
> >>>                ring_retire_requests(ring);
> >>> +             intel_ring_put(ring);
> >>> +     }
> >>
> >> Where does it chase a dangling pointer? It used the safe iterator already.
> > 
> > Inside ring_retire_requests(); the use of _safe here actually implies we
> > met this problem already :)
> 
> I get it, the issue is during ring->request_list iteration in 
> ring_retire_requests. How about move ring pinning in there so it is clearer?

It's only this path that is affected. That maybe becomes clearer when we
retire along timelines and not rings, which is the patchset I wanted to
land to fix this, but I figured this was a tiny fix for now.
-Chris

* Re: [PATCH 06/22] drm/i915: Hold a reference to the active HW context
  2019-03-18  9:51 ` [PATCH 06/22] drm/i915: Hold a reference to the active HW context Chris Wilson
@ 2019-03-18 12:54   ` Tvrtko Ursulin
  2019-03-18 12:56     ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 12:54 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 09:51, Chris Wilson wrote:
> For virtual engines, we need to keep the HW context alive while it
> remains in use. For regular HW contexts, they are created and kept alive
> until the end of the GEM context. For simplicity, generalise the
> requirements and keep an active reference to each HW context.

Is there a functional effect from this patch? Later with veng added?

Regards,

Tvrtko


> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c      |  2 +-
>   drivers/gpu/drm/i915/intel_context.c         |  6 ++++++
>   drivers/gpu/drm/i915/intel_context.h         | 11 +++++++++++
>   drivers/gpu/drm/i915/intel_context_types.h   |  6 +++++-
>   drivers/gpu/drm/i915/intel_lrc.c             |  4 +++-
>   drivers/gpu/drm/i915/intel_ringbuffer.c      |  4 +++-
>   drivers/gpu/drm/i915/selftests/mock_engine.c |  7 ++++++-
>   7 files changed, 35 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 21208a865380..d776d43707e0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -232,7 +232,7 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
>   	i915_ppgtt_put(ctx->ppgtt);
>   
>   	rbtree_postorder_for_each_entry_safe(it, n, &ctx->hw_contexts, node)
> -		it->ops->destroy(it);
> +		intel_context_put(it);
>   
>   	kfree(ctx->name);
>   	put_pid(ctx->pid);
> diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c
> index 0ab894a058f6..8931e0fee873 100644
> --- a/drivers/gpu/drm/i915/intel_context.c
> +++ b/drivers/gpu/drm/i915/intel_context.c
> @@ -172,6 +172,7 @@ intel_context_pin(struct i915_gem_context *ctx,
>   		list_add(&ce->active_link, &ctx->active_engines);
>   		mutex_unlock(&ctx->mutex);
>   
> +		intel_context_get(ce);
>   		smp_mb__before_atomic(); /* flush pin before it is visible */
>   	}
>   
> @@ -192,6 +193,7 @@ void intel_context_unpin(struct intel_context *ce)
>   		return;
>   
>   	/* We may be called from inside intel_context_pin() to evict another */
> +	intel_context_get(ce);
>   	mutex_lock_nested(&ce->pin_mutex, SINGLE_DEPTH_NESTING);
>   
>   	if (likely(atomic_dec_and_test(&ce->pin_count))) {
> @@ -202,9 +204,11 @@ void intel_context_unpin(struct intel_context *ce)
>   		mutex_unlock(&ce->gem_context->mutex);
>   
>   		i915_gem_context_put(ce->gem_context);
> +		intel_context_put(ce);
>   	}
>   
>   	mutex_unlock(&ce->pin_mutex);
> +	intel_context_put(ce);
>   }
>   
>   static void intel_context_retire(struct i915_active_request *active,
> @@ -221,6 +225,8 @@ intel_context_init(struct intel_context *ce,
>   		   struct i915_gem_context *ctx,
>   		   struct intel_engine_cs *engine)
>   {
> +	kref_init(&ce->ref);
> +
>   	ce->gem_context = ctx;
>   	ce->engine = engine;
>   	ce->ops = engine->cops;
> diff --git a/drivers/gpu/drm/i915/intel_context.h b/drivers/gpu/drm/i915/intel_context.h
> index 9546d932406a..ebc861b1a49e 100644
> --- a/drivers/gpu/drm/i915/intel_context.h
> +++ b/drivers/gpu/drm/i915/intel_context.h
> @@ -73,4 +73,15 @@ static inline void __intel_context_pin(struct intel_context *ce)
>   
>   void intel_context_unpin(struct intel_context *ce);
>   
> +static inline struct intel_context *intel_context_get(struct intel_context *ce)
> +{
> +	kref_get(&ce->ref);
> +	return ce;
> +}
> +
> +static inline void intel_context_put(struct intel_context *ce)
> +{
> +	kref_put(&ce->ref, ce->ops->destroy);
> +}
> +
>   #endif /* __INTEL_CONTEXT_H__ */
> diff --git a/drivers/gpu/drm/i915/intel_context_types.h b/drivers/gpu/drm/i915/intel_context_types.h
> index 6dc9b4b9067b..624729a35875 100644
> --- a/drivers/gpu/drm/i915/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/intel_context_types.h
> @@ -7,6 +7,7 @@
>   #ifndef __INTEL_CONTEXT_TYPES__
>   #define __INTEL_CONTEXT_TYPES__
>   
> +#include <linux/kref.h>
>   #include <linux/list.h>
>   #include <linux/mutex.h>
>   #include <linux/rbtree.h>
> @@ -22,7 +23,8 @@ struct intel_ring;
>   struct intel_context_ops {
>   	int (*pin)(struct intel_context *ce);
>   	void (*unpin)(struct intel_context *ce);
> -	void (*destroy)(struct intel_context *ce);
> +
> +	void (*destroy)(struct kref *kref);
>   };
>   
>   /*
> @@ -36,6 +38,8 @@ struct intel_sseu {
>   };
>   
>   struct intel_context {
> +	struct kref ref;
> +
>   	struct i915_gem_context *gem_context;
>   	struct intel_engine_cs *engine;
>   	struct intel_engine_cs *active;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 13f5545fc1d2..fbf67105f040 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1242,8 +1242,10 @@ static void __execlists_context_fini(struct intel_context *ce)
>   	i915_gem_object_put(ce->state->obj);
>   }
>   
> -static void execlists_context_destroy(struct intel_context *ce)
> +static void execlists_context_destroy(struct kref *kref)
>   {
> +	struct intel_context *ce = container_of(kref, typeof(*ce), ref);
> +
>   	GEM_BUG_ON(intel_context_is_pinned(ce));
>   
>   	if (ce->state)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 6d60bc258feb..35fdebd67e5f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1345,8 +1345,10 @@ static void __ring_context_fini(struct intel_context *ce)
>   	i915_gem_object_put(ce->state->obj);
>   }
>   
> -static void ring_context_destroy(struct intel_context *ce)
> +static void ring_context_destroy(struct kref *ref)
>   {
> +	struct intel_context *ce = container_of(ref, typeof(*ce), ref);
> +
>   	GEM_BUG_ON(intel_context_is_pinned(ce));
>   
>   	if (ce->state)
> diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
> index 7641b74ada98..639d36eb904a 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_engine.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
> @@ -128,12 +128,16 @@ static void mock_context_unpin(struct intel_context *ce)
>   	mock_timeline_unpin(ce->ring->timeline);
>   }
>   
> -static void mock_context_destroy(struct intel_context *ce)
> +static void mock_context_destroy(struct kref *ref)
>   {
> +	struct intel_context *ce = container_of(ref, typeof(*ce), ref);
> +
>   	GEM_BUG_ON(intel_context_is_pinned(ce));
>   
>   	if (ce->ring)
>   		mock_ring_free(ce->ring);
> +
> +	intel_context_free(ce);
>   }
>   
>   static int mock_context_pin(struct intel_context *ce)
> @@ -151,6 +155,7 @@ static int mock_context_pin(struct intel_context *ce)
>   static const struct intel_context_ops mock_context_ops = {
>   	.pin = mock_context_pin,
>   	.unpin = mock_context_unpin,
> +
>   	.destroy = mock_context_destroy,
>   };
>   
> 

* Re: [PATCH 06/22] drm/i915: Hold a reference to the active HW context
  2019-03-18 12:54   ` Tvrtko Ursulin
@ 2019-03-18 12:56     ` Chris Wilson
  2019-03-18 12:57       ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 12:56 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 12:54:00)
> 
> On 18/03/2019 09:51, Chris Wilson wrote:
> > For virtual engines, we need to keep the HW context alive while it
> > remains in use. For regular HW contexts, they are created and kept alive
> > until the end of the GEM context. For simplicity, generalise the
> > requirements and keep an active reference to each HW context.
> 
> Is there a functional effect from this patch? Later with veng added?

If by functional do you mean prevents the code from eating itself on
use-after-free after the engines are freed, then yes.
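
For the record, the hazard is roughly this (a simplified sketch of the
assumed sequence, not verbatim driver code):

	struct intel_context *ce;

	ce = intel_context_pin(ctx, engine);
	rq->hw_context = ce;		/* raw pointer; no reference held
					 * before this patch */
	...
	intel_context_unpin(ce);	/* retire drops the pin */
	i915_gem_context_put(ctx);	/* with veng, teardown can now
					 * free ce immediately... */

	engine->submit_request(rq);	/* ...while the backend still
					 * chases rq->hw_context: UAF */

With the kref, the pin itself holds a reference, so ce stays alive
until the final intel_context_put().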
-Chris

* Re: [PATCH 06/22] drm/i915: Hold a reference to the active HW context
  2019-03-18 12:56     ` Chris Wilson
@ 2019-03-18 12:57       ` Chris Wilson
  2019-03-18 13:29         ` Tvrtko Ursulin
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 12:57 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2019-03-18 12:56:12)
> Quoting Tvrtko Ursulin (2019-03-18 12:54:00)
> > 
> > On 18/03/2019 09:51, Chris Wilson wrote:
> > > For virtual engines, we need to keep the HW context alive while it
> > > remains in use. For regular HW contexts, they are created and kept alive
> > > until the end of the GEM context. For simplicity, generalise the
> > > requirements and keep an active reference to each HW context.
> > 
> > Is there a functional effect from this patch? Later with veng added?
> 
> If by functional do you mean prevents the code from eating itself on
> use-after-free after the engines are freed, then yes.

A variation of this used to be inside the veng patch, but that only
applied itself to veng. After the discussion there, I felt it would be
more obvious if it was applied as a standalone patch by generalising the
requirements to all HW contexts.
-Chris

* Re: [PATCH 07/22] drm/i915: Stop needlessly acquiring wakeref for debugfs/drop_caches_set
  2019-03-18  9:51 ` [PATCH 07/22] drm/i915: Stop needlessly acquiring wakeref for debugfs/drop_caches_set Chris Wilson
@ 2019-03-18 13:08   ` Tvrtko Ursulin
  0 siblings, 0 replies; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 13:08 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Yokoyama


On 18/03/2019 09:51, Chris Wilson wrote:
> We only need to acquire a wakeref for ourselves for a few operations, as
> most either already acquire their own wakeref or imply a wakeref. In
> particular, it is i915_gem_set_wedged() that needed us to present it
> with a wakeref, which is incongruous with its "use anywhere" ability.
> 
> Suggested-by: Yokoyama, Caz <caz.yokoyama@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Yokoyama, Caz <caz.yokoyama@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c | 12 ++++--------
>   drivers/gpu/drm/i915/i915_reset.c   |  4 +++-
>   2 files changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 6a90558de213..08683dca7775 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -3888,12 +3888,9 @@ static int
>   i915_drop_caches_set(void *data, u64 val)
>   {
>   	struct drm_i915_private *i915 = data;
> -	intel_wakeref_t wakeref;
> -	int ret = 0;
>   
>   	DRM_DEBUG("Dropping caches: 0x%08llx [0x%08llx]\n",
>   		  val, val & DROP_ALL);
> -	wakeref = intel_runtime_pm_get(i915);
>   
>   	if (val & DROP_RESET_ACTIVE &&
>   	    wait_for(intel_engines_are_idle(i915), I915_IDLE_ENGINES_TIMEOUT))
> @@ -3902,9 +3899,11 @@ i915_drop_caches_set(void *data, u64 val)
>   	/* No need to check and wait for gpu resets, only libdrm auto-restarts
>   	 * on ioctls on -EAGAIN. */
>   	if (val & (DROP_ACTIVE | DROP_RETIRE | DROP_RESET_SEQNO)) {
> +		int ret;
> +
>   		ret = mutex_lock_interruptible(&i915->drm.struct_mutex);
>   		if (ret)
> -			goto out;
> +			return ret;
>   
>   		if (val & DROP_ACTIVE)
>   			ret = i915_gem_wait_for_idle(i915,
> @@ -3943,10 +3942,7 @@ i915_drop_caches_set(void *data, u64 val)
>   	if (val & DROP_FREED)
>   		i915_gem_drain_freed_objects(i915);
>   
> -out:
> -	intel_runtime_pm_put(i915, wakeref);
> -
> -	return ret;
> +	return 0;
>   }
>   
>   DEFINE_SIMPLE_ATTRIBUTE(i915_drop_caches_fops,
> diff --git a/drivers/gpu/drm/i915/i915_reset.c b/drivers/gpu/drm/i915/i915_reset.c
> index b8daec7ddc06..e61bfa0fc4e0 100644
> --- a/drivers/gpu/drm/i915/i915_reset.c
> +++ b/drivers/gpu/drm/i915/i915_reset.c
> @@ -863,9 +863,11 @@ static void __i915_gem_set_wedged(struct drm_i915_private *i915)
>   void i915_gem_set_wedged(struct drm_i915_private *i915)
>   {
>   	struct i915_gpu_error *error = &i915->gpu_error;
> +	intel_wakeref_t wakeref;
>   
>   	mutex_lock(&error->wedge_mutex);
> -	__i915_gem_set_wedged(i915);
> +	with_intel_runtime_pm(i915, wakeref)
> +		__i915_gem_set_wedged(i915);
>   	mutex_unlock(&error->wedge_mutex);
>   }
>   
> 

The statement that all paths will take a wakeref looks true.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 09/22] drm/i915: Switch to use HWS indices rather than addresses
  2019-03-18  9:51 ` [PATCH 09/22] drm/i915: Switch to use HWS indices rather than addresses Chris Wilson
@ 2019-03-18 13:21   ` Tvrtko Ursulin
  0 siblings, 0 replies; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 13:21 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 09:51, Chris Wilson wrote:
> If we use the STORE_DATA_INDEX function we can use a fixed offset and
> avoid having to look up the engine HWS address. A step closer to being
> able to emit the final breadcrumb during request_add rather than later
> in the submission interrupt handler.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/intel_guc_submission.c |  3 ++-
>   drivers/gpu/drm/i915/intel_lrc.c            | 17 +++++++----------
>   drivers/gpu/drm/i915/intel_ringbuffer.c     | 16 ++++++----------
>   drivers/gpu/drm/i915/intel_ringbuffer.h     |  4 ++--
>   4 files changed, 17 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
> index 4a5727233419..c4ad73980988 100644
> --- a/drivers/gpu/drm/i915/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/intel_guc_submission.c
> @@ -583,7 +583,8 @@ static void inject_preempt_context(struct work_struct *work)
>   		} else {
>   			cs = gen8_emit_ggtt_write(cs,
>   						  GUC_PREEMPT_FINISHED,
> -						  addr);
> +						  addr,
> +						  0);
>   			*cs++ = MI_NOOP;
>   			*cs++ = MI_NOOP;
>   		}
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index fbf67105f040..7e0c20a2d733 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -173,12 +173,6 @@ static void execlists_init_reg_state(u32 *reg_state,
>   				     struct intel_engine_cs *engine,
>   				     struct intel_ring *ring);
>   
> -static inline u32 intel_hws_hangcheck_address(struct intel_engine_cs *engine)
> -{
> -	return (i915_ggtt_offset(engine->status_page.vma) +
> -		I915_GEM_HWS_HANGCHECK_ADDR);
> -}
> -
>   static inline struct i915_priolist *to_priolist(struct rb_node *rb)
>   {
>   	return rb_entry(rb, struct i915_priolist, node);
> @@ -2213,11 +2207,14 @@ static u32 *gen8_emit_fini_breadcrumb(struct i915_request *request, u32 *cs)
>   {
>   	cs = gen8_emit_ggtt_write(cs,
>   				  request->fence.seqno,
> -				  request->timeline->hwsp_offset);
> +				  request->timeline->hwsp_offset,
> +				  0);
>   
>   	cs = gen8_emit_ggtt_write(cs,
>   				  intel_engine_next_hangcheck_seqno(request->engine),
> -				  intel_hws_hangcheck_address(request->engine));
> +				  I915_GEM_HWS_HANGCHECK_ADDR,
> +				  MI_FLUSH_DW_STORE_INDEX);
> +
>   
>   	*cs++ = MI_USER_INTERRUPT;
>   	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> @@ -2241,8 +2238,8 @@ static u32 *gen8_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs)
>   
>   	cs = gen8_emit_ggtt_write_rcs(cs,
>   				      intel_engine_next_hangcheck_seqno(request->engine),
> -				      intel_hws_hangcheck_address(request->engine),
> -				      0);
> +				      I915_GEM_HWS_HANGCHECK_ADDR,
> +				      PIPE_CONTROL_STORE_DATA_INDEX);
>   
>   	*cs++ = MI_USER_INTERRUPT;
>   	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 35fdebd67e5f..0310d5d53bf9 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -43,12 +43,6 @@
>    */
>   #define LEGACY_REQUEST_SIZE 200
>   
> -static inline u32 hws_hangcheck_address(struct intel_engine_cs *engine)
> -{
> -	return (i915_ggtt_offset(engine->status_page.vma) +
> -		I915_GEM_HWS_HANGCHECK_ADDR);
> -}
> -
>   unsigned int intel_ring_update_space(struct intel_ring *ring)
>   {
>   	unsigned int space;
> @@ -317,8 +311,8 @@ static u32 *gen6_rcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
>   	*cs++ = rq->fence.seqno;
>   
>   	*cs++ = GFX_OP_PIPE_CONTROL(4);
> -	*cs++ = PIPE_CONTROL_QW_WRITE;
> -	*cs++ = hws_hangcheck_address(rq->engine) | PIPE_CONTROL_GLOBAL_GTT;
> +	*cs++ = PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_STORE_DATA_INDEX;
> +	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR | PIPE_CONTROL_GLOBAL_GTT;
>   	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
>   
>   	*cs++ = MI_USER_INTERRUPT;
> @@ -423,8 +417,10 @@ static u32 *gen7_rcs_emit_breadcrumb(struct i915_request *rq, u32 *cs)
>   	*cs++ = rq->fence.seqno;
>   
>   	*cs++ = GFX_OP_PIPE_CONTROL(4);
> -	*cs++ = PIPE_CONTROL_QW_WRITE | PIPE_CONTROL_GLOBAL_GTT_IVB;
> -	*cs++ = hws_hangcheck_address(rq->engine);
> +	*cs++ = (PIPE_CONTROL_QW_WRITE |
> +		 PIPE_CONTROL_STORE_DATA_INDEX |
> +		 PIPE_CONTROL_GLOBAL_GTT_IVB);
> +	*cs++ = I915_GEM_HWS_HANGCHECK_ADDR;
>   	*cs++ = intel_engine_next_hangcheck_seqno(rq->engine);
>   
>   	*cs++ = MI_USER_INTERRUPT;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index a57489fcb302..a02c92dac5da 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -419,14 +419,14 @@ gen8_emit_ggtt_write_rcs(u32 *cs, u32 value, u32 gtt_offset, u32 flags)
>   }
>   
>   static inline u32 *
> -gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset)
> +gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset, u32 flags)
>   {
>   	/* w/a: bit 5 needs to be zero for MI_FLUSH_DW address. */
>   	GEM_BUG_ON(gtt_offset & (1 << 5));
>   	/* Offset should be aligned to 8 bytes for both (QW/DW) write types */
>   	GEM_BUG_ON(!IS_ALIGNED(gtt_offset, 8));
>   
> -	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW;
> +	*cs++ = (MI_FLUSH_DW + 1) | MI_FLUSH_DW_OP_STOREDW | flags;
>   	*cs++ = gtt_offset | MI_FLUSH_DW_USE_GTT;
>   	*cs++ = 0;
>   	*cs++ = value;
> 
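
To spell out the trick (a rough illustration, assuming the usual HWSP
semantics, not code from the patch): with the STORE_INDEX /
STORE_DATA_INDEX flag set, the HW treats the "address" dword as an
offset from that engine's own status page, so

	/* before: absolute GGTT address, needs the per-engine lookup */
	addr = i915_ggtt_offset(engine->status_page.vma) +
	       I915_GEM_HWS_HANGCHECK_ADDR;

	/* after: the fixed offset alone suffices */
	addr = I915_GEM_HWS_HANGCHECK_ADDR;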

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* Re: [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring
  2019-03-18 10:56         ` Chris Wilson
@ 2019-03-18 13:25           ` Tvrtko Ursulin
  0 siblings, 0 replies; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 13:25 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 10:56, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-03-18 10:46:28)
>>
>> On 18/03/2019 10:37, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2019-03-18 10:31:57)
>>>>
>>>> On 18/03/2019 09:51, Chris Wilson wrote:
>>>>> As the final request on a ring may hold the reference to this ring (via
>>>>> retiring the last pinned context), we may find ourselves chasing a
>>>>> dangling pointer on completion of the list.
>>>>>
>>>>> A quick solution is to hold a reference to the ring itself as we retire
>>>>> along it so that we only free it after we stop dereferencing it.
>>>>
>>>> Is there a guilty commit to reference as Fixes: ?
>>>
>>> It only becomes a problem with veng as we gain an immediate free path,
>>> whereas at the moment, context frees are deferred until they can acquire
>>> the struct_mutex. We cannot hit this path at the moment, but that we had
>>> to use the safe iterator implies that we were aware that the ring itself
>>> could disappear. If you wanted to pin it on something,
>>>
>>> References: b887d6154624 ("drm/i915: Retire requests along rings")
>>>
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> ---
>>>>>     drivers/gpu/drm/i915/i915_request.c          |  6 +++++-
>>>>>     drivers/gpu/drm/i915/intel_engine_types.h    |  2 ++
>>>>>     drivers/gpu/drm/i915/intel_lrc.c             |  4 ++--
>>>>>     drivers/gpu/drm/i915/intel_ringbuffer.c      |  9 +++++----
>>>>>     drivers/gpu/drm/i915/intel_ringbuffer.h      | 13 ++++++++++++-
>>>>>     drivers/gpu/drm/i915/selftests/mock_engine.c |  1 +
>>>>>     6 files changed, 27 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
>>>>> index 9533a85cb0b3..0a3d94517d0a 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_request.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_request.c
>>>>> @@ -1332,8 +1332,12 @@ void i915_retire_requests(struct drm_i915_private *i915)
>>>>>         if (!i915->gt.active_requests)
>>>>>                 return;
>>>>>     
>>>>> -     list_for_each_entry_safe(ring, tmp, &i915->gt.active_rings, active_link)
>>>>> +     list_for_each_entry_safe(ring, tmp,
>>>>> +                              &i915->gt.active_rings, active_link) {
>>>>> +             intel_ring_get(ring); /* last rq holds reference! */
>>>>>                 ring_retire_requests(ring);
>>>>> +             intel_ring_put(ring);
>>>>> +     }
>>>>
>>>> Where does it chase a dangling pointer? It used the safe iterator already.
>>>
>>> Inside ring_retire_requests(); the use of _safe here actually implies we
>>> met this problem already :)
>>
>> I get it, the issue is during ring->request_list iteration in
>> ring_retire_requests. How about moving ring pinning in there so it is clearer?
> 
> It's only this path that is affected. That maye becomes clearer when we
> retire along timelines and not rings, which is the patchset I wanted to
> land to fix this, but figured this was a tiny fix for now.
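
For reference, the walk at risk is roughly this (a simplified sketch of
ring_retire_requests(), details elided):

	static void ring_retire_requests(struct intel_ring *ring)
	{
		struct i915_request *rq, *rn;

		list_for_each_entry_safe(rq, rn,
					 &ring->request_list, ring_link) {
			if (!i915_request_completed(rq))
				break;

			/*
			 * Retiring the last request may drop the final
			 * context pin, and with it the last reference to
			 * @ring -- so without the caller's
			 * intel_ring_get/put the next step through the
			 * list dereferences freed memory.
			 */
			i915_request_retire(rq);
		}
	}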

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

I failed to find a short and clear suggestion for making the comment 
more precise.

Regards,

Tvrtko

* Re: [PATCH 06/22] drm/i915: Hold a reference to the active HW context
  2019-03-18 12:57       ` Chris Wilson
@ 2019-03-18 13:29         ` Tvrtko Ursulin
  0 siblings, 0 replies; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 13:29 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 12:57, Chris Wilson wrote:
> Quoting Chris Wilson (2019-03-18 12:56:12)
>> Quoting Tvrtko Ursulin (2019-03-18 12:54:00)
>>>
>>> On 18/03/2019 09:51, Chris Wilson wrote:
>>>> For virtual engines, we need to keep the HW context alive while it
>>>> remains in use. For regular HW contexts, they are created and kept alive
>>>> until the end of the GEM context. For simplicity, generalise the
>>>> requirements and keep an active reference to each HW context.
>>>
>>> Is there a functional effect from this patch? Later with veng added?
>>
>> If by functional do you mean prevents the code from eating itself on
>> use-after-free after the engines are freed, then yes.
> 
> A variation of this used to be inside the veng patch, but that only
> applied itself to veng. After the discussion there, I felt it would be
> more obvious if it was applied as a standalone patch by generalising the
> requirements to all HW contexts.

Yep. Guess my previous review was too sloppy.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


* Re: [PATCH 02/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
  2019-03-18 10:48       ` Tvrtko Ursulin
@ 2019-03-18 13:57         ` Chris Wilson
  0 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 13:57 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 10:48:42)
> 
> On 18/03/2019 10:40, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-03-18 10:21:41)
> >>
> >> On 18/03/2019 09:51, Chris Wilson wrote:
> >>> We want to use intel_engine_mask_t inside i915_request.h, which means
> >>> extracting it from the general header file mess and placing it inside a
> >>> types.h. A knock-on effect is that the compiler wants to warn about
> >>> type-contraction of ALL_ENGINES into intel_engine_mask_t, so prepare
> >>> for the worst.
> >>
> >> We can't do:
> >>
> >> #define ALL_ENGINES ((intel_engine_mask_t)-1)
> >>
> >> to avoid this warning and a lot of the churn?
> > 
> > The churn is a lot of type fixing which needs to be done at some point.
> > I'm not keen on passing the contracted intel_engine_mask_t, and
> > ALL_ENGINES is not all bits set.
> 
> It is all bits set in intel_engine_mask_t. ;) I forgot what your
> argument was against using it in function arguments. Perhaps because it
> is pointless... I regret adding this typedef even more now.

I made it all intel_engine_mask_t,


add/remove: 0/0 grow/shrink: 16/24 up/down: 111/-95 (16)
Function                                     old     new   delta
workload_thread                             4244    4274     +30
gen6_reset_engines                           200     217     +17
intel_gpu_reset                              632     646     +14
__igt_switch_to_kernel_context               399     412     +13
workload_thread.cold                        1426    1436     +10
intel_gvt_reset_vgpu_locked                  316     321      +5
__igt_switch_to_kernel_context.cold          126     131      +5
ring_mode_mmio_write                         202     206      +4
init_execlist                                 13      17      +4
intel_vgpu_select_submission_ops             210     212      +2
intel_vgpu_reset_submission                   49      51      +2
reset_execlist                               210     211      +1
live_gpu_reset_gt_engine_workarounds.part     175     176      +1
i915_wedged_set                              131     132      +1
i915_gem_init                               1527    1528      +1
clean_execlist                               130     131      +1
i915_capture_error_state                     337     336      -1
context_barrier_task.constprop               369     368      -1
intel_vgpu_clean_submission                  144     142      -2
intel_gvt_release_vgpu                        60      58      -2
intel_engines_sanitize                       115     113      -2
igt_wedged_reset                             101      99      -2
igt_switch_to_kernel_context                 240     238      -2
igt_reset_wait                               340     338      -2
igt_reset_nop                                442     440      -2
igt_global_reset                              85      83      -2
i915_gem_resume                              207     205      -2
i915_drop_caches_set                         572     570      -2
__i915_gem_set_wedged.part                   354     352      -2
ring_request_alloc                          1887    1884      -3
igt_atomic_reset                             407     404      -3
i915_gem_switch_to_kernel_context            629     626      -3
gen8_reset_engines                           927     924      -3
gen6_alloc_va_range                          641     638      -3
pd_vma_bind                                  245     241      -4
i915_reset                                   903     896      -7
i915_handle_error                            711     704      -7
context_barrier_inject_fault                   8       1      -7
mock_context_barrier                         429     415     -14
gdrst_mmio_write                             164     147     -17

so bizarre gcc is bizarre.

Would you rather see this with universal intel_engine_mask_t?
-Chris

* Re: [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace
  2019-03-18  9:51 ` [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace Chris Wilson
@ 2019-03-18 16:22   ` Tvrtko Ursulin
  2019-03-18 16:30     ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 16:22 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 09:51, Chris Wilson wrote:
> In later patches, it became apparent that userspace can see a partially
> constructed GEM context and begin using it before it was ready, to much
> hilarity. Close this window of opportunity by lifting the registration of
> the context with userspace (the insertion of the context into the filp's
> idr) to the very end of the CONTEXT_CREATE ioctl.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c       | 143 +++++++++++-------
>   drivers/gpu/drm/i915/i915_gem_gtt.c           |   7 +-
>   drivers/gpu/drm/i915/i915_gem_gtt.h           |   8 +-
>   drivers/gpu/drm/i915/selftests/huge_pages.c   |   2 +-
>   .../gpu/drm/i915/selftests/i915_gem_context.c |  12 +-
>   drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   2 +-
>   drivers/gpu/drm/i915/selftests/mock_context.c |  17 ++-
>   7 files changed, 116 insertions(+), 75 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index d776d43707e0..5df3d423ec6c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -337,15 +337,13 @@ static u32 default_desc_template(const struct drm_i915_private *i915,
>   }
>   
>   static struct i915_gem_context *
> -__create_hw_context(struct drm_i915_private *dev_priv,
> -		    struct drm_i915_file_private *file_priv)
> +__create_context(struct drm_i915_private *dev_priv)
>   {
>   	struct i915_gem_context *ctx;
> -	int ret;
>   	int i;
>   
>   	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
> -	if (ctx == NULL)
> +	if (!ctx)
>   		return ERR_PTR(-ENOMEM);
>   
>   	kref_init(&ctx->ref);
> @@ -362,29 +360,6 @@ __create_hw_context(struct drm_i915_private *dev_priv,
>   	INIT_LIST_HEAD(&ctx->handles_list);
>   	INIT_LIST_HEAD(&ctx->hw_id_link);
>   
> -	/* Default context will never have a file_priv */
> -	ret = DEFAULT_CONTEXT_HANDLE;
> -	if (file_priv) {
> -		ret = idr_alloc(&file_priv->context_idr, ctx,
> -				DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
> -		if (ret < 0)
> -			goto err_lut;
> -	}
> -	ctx->user_handle = ret;
> -
> -	ctx->file_priv = file_priv;
> -	if (file_priv) {
> -		ctx->pid = get_task_pid(current, PIDTYPE_PID);
> -		ctx->name = kasprintf(GFP_KERNEL, "%s[%d]/%x",
> -				      current->comm,
> -				      pid_nr(ctx->pid),
> -				      ctx->user_handle);
> -		if (!ctx->name) {
> -			ret = -ENOMEM;
> -			goto err_pid;
> -		}
> -	}
> -
>   	/* NB: Mark all slices as needing a remap so that when the context first
>   	 * loads it will restore whatever remap state already exists. If there
>   	 * is no remap info, it will be a NOP. */
> @@ -401,25 +376,10 @@ __create_hw_context(struct drm_i915_private *dev_priv,
>   		ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
>   
>   	return ctx;
> -
> -err_pid:
> -	put_pid(ctx->pid);
> -	idr_remove(&file_priv->context_idr, ctx->user_handle);
> -err_lut:
> -	context_close(ctx);
> -	return ERR_PTR(ret);
> -}
> -
> -static void __destroy_hw_context(struct i915_gem_context *ctx,
> -				 struct drm_i915_file_private *file_priv)
> -{
> -	idr_remove(&file_priv->context_idr, ctx->user_handle);
> -	context_close(ctx);
>   }
>   
>   static struct i915_gem_context *
> -i915_gem_create_context(struct drm_i915_private *dev_priv,
> -			struct drm_i915_file_private *file_priv)
> +i915_gem_create_context(struct drm_i915_private *dev_priv)
>   {
>   	struct i915_gem_context *ctx;
>   
> @@ -428,18 +388,18 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>   	/* Reap the most stale context */
>   	contexts_free_first(dev_priv);
>   
> -	ctx = __create_hw_context(dev_priv, file_priv);
> +	ctx = __create_context(dev_priv);
>   	if (IS_ERR(ctx))
>   		return ctx;
>   
>   	if (HAS_FULL_PPGTT(dev_priv)) {
>   		struct i915_hw_ppgtt *ppgtt;
>   
> -		ppgtt = i915_ppgtt_create(dev_priv, file_priv);
> +		ppgtt = i915_ppgtt_create(dev_priv);
>   		if (IS_ERR(ppgtt)) {
>   			DRM_DEBUG_DRIVER("PPGTT setup failed (%ld)\n",
>   					 PTR_ERR(ppgtt));
> -			__destroy_hw_context(ctx, file_priv);
> +			context_close(ctx);
>   			return ERR_CAST(ppgtt);
>   		}
>   
> @@ -475,7 +435,7 @@ i915_gem_context_create_gvt(struct drm_device *dev)
>   	if (ret)
>   		return ERR_PTR(ret);
>   
> -	ctx = i915_gem_create_context(to_i915(dev), NULL);
> +	ctx = i915_gem_create_context(to_i915(dev));
>   	if (IS_ERR(ctx))
>   		goto out;
>   
> @@ -511,7 +471,7 @@ i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio)
>   	struct i915_gem_context *ctx;
>   	int err;
>   
> -	ctx = i915_gem_create_context(i915, NULL);
> +	ctx = i915_gem_create_context(i915);
>   	if (IS_ERR(ctx))
>   		return ctx;
>   
> @@ -625,25 +585,79 @@ static int context_idr_cleanup(int id, void *p, void *data)
>   	return 0;
>   }
>   
> +static int gem_context_register(struct i915_gem_context *ctx,
> +				struct drm_i915_file_private *fpriv)
> +{
> +	int ret;
> +

Assert struct mutex for now? Without it userspace can still see a not 
fully initialized ctx. It is kind of two arguments, but good for 
documentation nevertheless, I think.
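(A lockdep_assert_held(&ctx->i915->drm.struct_mutex) at the top would
do, I presume.)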

> +	ctx->pid = get_task_pid(current, PIDTYPE_PID);
> +
> +	if (ctx->ppgtt)
> +		ctx->ppgtt->vm.file = fpriv;

Thinking about i915_vm_set_file(vm, fpriv), not sure.

> +
> +	/* And (nearly) finally expose ourselves to userspace via the idr */
> +	ret = idr_alloc(&fpriv->context_idr, ctx,
> +			DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
> +	if (ret < 0)
> +		goto err_pid;
> +
> +	ctx->file_priv = fpriv;
> +	ctx->user_handle = ret;
> +
> +	ctx->name = kasprintf(GFP_KERNEL, "%s[%d]/%x",
> +			      current->comm,
> +			      pid_nr(ctx->pid),
> +			      ctx->user_handle);
> +	if (!ctx->name) {
> +		ret = -ENOMEM;
> +		goto err_idr;
> +	}
> +
> +	return 0;
> +
> +err_idr:
> +	idr_remove(&fpriv->context_idr, ctx->user_handle);
> +	ctx->file_priv = NULL;
> +err_pid:
> +	put_pid(ctx->pid);
> +	ctx->pid = NULL;

To avoid this, and in the spirit of the patch, I think it would be 
better to just store everything in locals and assign to ctx members once 
passed the point of failure.
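
Something along these lines, say (an untested sketch of the idea,
keeping the names from the patch):

	static int gem_context_register(struct i915_gem_context *ctx,
					struct drm_i915_file_private *fpriv)
	{
		struct pid *pid = get_task_pid(current, PIDTYPE_PID);
		char *name;
		int id;

		id = idr_alloc(&fpriv->context_idr, ctx,
			       DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
		if (id < 0) {
			put_pid(pid);
			return id;
		}

		name = kasprintf(GFP_KERNEL, "%s[%d]/%x",
				 current->comm, pid_nr(pid), id);
		if (!name) {
			idr_remove(&fpriv->context_idr, id);
			put_pid(pid);
			return -ENOMEM;
		}

		/* Past the last point of failure; only now touch ctx. */
		if (ctx->ppgtt)
			ctx->ppgtt->vm.file = fpriv;
		ctx->file_priv = fpriv;
		ctx->user_handle = id;
		ctx->pid = pid;
		ctx->name = name;

		return 0;
	}

(The idr insert still isn't last, but the ctx members are no longer
left half-assigned on failure.)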

> +	return ret;
> +}
> +
>   int i915_gem_context_open(struct drm_i915_private *i915,
>   			  struct drm_file *file)
>   {
>   	struct drm_i915_file_private *file_priv = file->driver_priv;
>   	struct i915_gem_context *ctx;
> +	int err;
>   
>   	idr_init(&file_priv->context_idr);
>   
>   	mutex_lock(&i915->drm.struct_mutex);
> -	ctx = i915_gem_create_context(i915, file_priv);
> -	mutex_unlock(&i915->drm.struct_mutex);
> +
> +	ctx = i915_gem_create_context(i915);
>   	if (IS_ERR(ctx)) {
> -		idr_destroy(&file_priv->context_idr);
> -		return PTR_ERR(ctx);
> +		err = PTR_ERR(ctx);
> +		goto err;
>   	}
>   
> +	err = gem_context_register(ctx, file_priv);
> +	if (err)
> +		goto err_ctx;
> +
> +	GEM_BUG_ON(ctx->user_handle != DEFAULT_CONTEXT_HANDLE);
>   	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
>   
> +	mutex_unlock(&i915->drm.struct_mutex);
> +
>   	return 0;
> +
> +err_ctx:
> +	context_close(ctx);
> +err:
> +	mutex_unlock(&i915->drm.struct_mutex);
> +	idr_destroy(&file_priv->context_idr);
> +	return PTR_ERR(ctx);
>   }
>   
>   void i915_gem_context_close(struct drm_file *file)
> @@ -835,17 +849,28 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>   	if (ret)
>   		return ret;
>   
> -	ctx = i915_gem_create_context(i915, file_priv);
> -	mutex_unlock(&dev->struct_mutex);
> -	if (IS_ERR(ctx))
> -		return PTR_ERR(ctx);
> +	ctx = i915_gem_create_context(i915);
> +	if (IS_ERR(ctx)) {
> +		ret = PTR_ERR(ctx);
> +		goto err_unlock;
> +	}
>   
> -	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
> +	ret = gem_context_register(ctx, file_priv);
> +	if (ret)
> +		goto err_ctx;
> +
> +	mutex_unlock(&dev->struct_mutex);
>   
>   	args->ctx_id = ctx->user_handle;
>   	DRM_DEBUG("HW context %d created\n", args->ctx_id);
>   
>   	return 0;
> +
> +err_ctx:
> +	context_close(ctx);
> +err_unlock:
> +	mutex_unlock(&dev->struct_mutex);
> +	return ret;
>   }
>   
>   int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> @@ -870,7 +895,9 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>   	if (ret)
>   		goto out;
>   
> -	__destroy_hw_context(ctx, file_priv);
> +	idr_remove(&file_priv->context_idr, ctx->user_handle);
> +	context_close(ctx);
> +
>   	mutex_unlock(&dev->struct_mutex);
>   
>   out:
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index b8055c8d4e71..b9e0e3a00223 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -2069,8 +2069,7 @@ __hw_ppgtt_create(struct drm_i915_private *i915)
>   }
>   
>   struct i915_hw_ppgtt *
> -i915_ppgtt_create(struct drm_i915_private *i915,
> -		  struct drm_i915_file_private *fpriv)
> +i915_ppgtt_create(struct drm_i915_private *i915)
>   {
>   	struct i915_hw_ppgtt *ppgtt;
>   
> @@ -2078,8 +2077,6 @@ i915_ppgtt_create(struct drm_i915_private *i915,
>   	if (IS_ERR(ppgtt))
>   		return ppgtt;
>   
> -	ppgtt->vm.file = fpriv;
> -
>   	trace_i915_ppgtt_create(&ppgtt->vm);
>   
>   	return ppgtt;
> @@ -2657,7 +2654,7 @@ int i915_gem_init_aliasing_ppgtt(struct drm_i915_private *i915)
>   	struct i915_hw_ppgtt *ppgtt;
>   	int err;
>   
> -	ppgtt = i915_ppgtt_create(i915, ERR_PTR(-EPERM));
> +	ppgtt = i915_ppgtt_create(i915);

vm.file changes to NULL now, but after some grepping I did not find that 
it matters.

>   	if (IS_ERR(ppgtt))
>   		return PTR_ERR(ppgtt);
>   
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
> index 35f21a2ae36c..b76ab4c2a0e6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.h
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
> @@ -603,15 +603,17 @@ int i915_gem_init_ggtt(struct drm_i915_private *dev_priv);
>   void i915_ggtt_cleanup_hw(struct drm_i915_private *dev_priv);
>   
>   int i915_ppgtt_init_hw(struct drm_i915_private *dev_priv);
> -void i915_ppgtt_release(struct kref *kref);
> -struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_i915_private *dev_priv,
> -					struct drm_i915_file_private *fpriv);
> +
> +struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_i915_private *dev_priv);
>   void i915_ppgtt_close(struct i915_address_space *vm);
> +void i915_ppgtt_release(struct kref *kref);
> +
>   static inline void i915_ppgtt_get(struct i915_hw_ppgtt *ppgtt)
>   {
>   	if (ppgtt)
>   		kref_get(&ppgtt->ref);
>   }
> +
>   static inline void i915_ppgtt_put(struct i915_hw_ppgtt *ppgtt)
>   {
>   	if (ppgtt)
> diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
> index 218cfc361de3..c5c8ba6c059f 100644
> --- a/drivers/gpu/drm/i915/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
> @@ -1710,7 +1710,7 @@ int i915_gem_huge_page_mock_selftests(void)
>   	mkwrite_device_info(dev_priv)->ppgtt_size = 48;
>   
>   	mutex_lock(&dev_priv->drm.struct_mutex);
> -	ppgtt = i915_ppgtt_create(dev_priv, ERR_PTR(-ENODEV));
> +	ppgtt = i915_ppgtt_create(dev_priv);
>   	if (IS_ERR(ppgtt)) {
>   		err = PTR_ERR(ppgtt);
>   		goto out_unlock;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> index f18c78ebff07..4dc96e28d89f 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
> @@ -76,7 +76,7 @@ static int live_nop_switch(void *arg)
>   	}
>   
>   	for (n = 0; n < nctx; n++) {
> -		ctx[n] = i915_gem_create_context(i915, file->driver_priv);
> +		ctx[n] = live_context(i915, file);
>   		if (IS_ERR(ctx[n])) {
>   			err = PTR_ERR(ctx[n]);
>   			goto out_unlock;
> @@ -514,7 +514,7 @@ static int igt_ctx_exec(void *arg)
>   		struct i915_gem_context *ctx;
>   		unsigned int id;
>   
> -		ctx = i915_gem_create_context(i915, file->driver_priv);
> +		ctx = live_context(i915, file);
>   		if (IS_ERR(ctx)) {
>   			err = PTR_ERR(ctx);
>   			goto out_unlock;
> @@ -960,7 +960,7 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
>   
>   	mutex_lock(&i915->drm.struct_mutex);
>   
> -	ctx = i915_gem_create_context(i915, file->driver_priv);
> +	ctx = live_context(i915, file);
>   	if (IS_ERR(ctx)) {
>   		ret = PTR_ERR(ctx);
>   		goto out_unlock;
> @@ -1070,7 +1070,7 @@ static int igt_ctx_readonly(void *arg)
>   	if (err)
>   		goto out_unlock;
>   
> -	ctx = i915_gem_create_context(i915, file->driver_priv);
> +	ctx = live_context(i915, file);
>   	if (IS_ERR(ctx)) {
>   		err = PTR_ERR(ctx);
>   		goto out_unlock;
> @@ -1390,13 +1390,13 @@ static int igt_vm_isolation(void *arg)
>   	if (err)
>   		goto out_unlock;
>   
> -	ctx_a = i915_gem_create_context(i915, file->driver_priv);
> +	ctx_a = live_context(i915, file);
>   	if (IS_ERR(ctx_a)) {
>   		err = PTR_ERR(ctx_a);
>   		goto out_unlock;
>   	}
>   
> -	ctx_b = i915_gem_create_context(i915, file->driver_priv);
> +	ctx_b = live_context(i915, file);
>   	if (IS_ERR(ctx_b)) {
>   		err = PTR_ERR(ctx_b);
>   		goto out_unlock;
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> index 826fd51c331e..01084f6b4fb7 100644
> --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
> @@ -1010,7 +1010,7 @@ static int exercise_ppgtt(struct drm_i915_private *dev_priv,
>   		return PTR_ERR(file);
>   
>   	mutex_lock(&dev_priv->drm.struct_mutex);
> -	ppgtt = i915_ppgtt_create(dev_priv, file->driver_priv);
> +	ppgtt = i915_ppgtt_create(dev_priv);
>   	if (IS_ERR(ppgtt)) {
>   		err = PTR_ERR(ppgtt);
>   		goto out_unlock;
> diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
> index 8efa6892c6cd..1cc8be732435 100644
> --- a/drivers/gpu/drm/i915/selftests/mock_context.c
> +++ b/drivers/gpu/drm/i915/selftests/mock_context.c
> @@ -88,9 +88,24 @@ void mock_init_contexts(struct drm_i915_private *i915)
>   struct i915_gem_context *
>   live_context(struct drm_i915_private *i915, struct drm_file *file)

Live context, mock context identity crisis. :) Nothing to act upon, just 
amusing myself...

>   {
> +	struct i915_gem_context *ctx;
> +	int err;
> +
>   	lockdep_assert_held(&i915->drm.struct_mutex);
>   
> -	return i915_gem_create_context(i915, file->driver_priv);
> +	ctx = i915_gem_create_context(i915);
> +	if (IS_ERR(ctx))
> +		return ctx;
> +
> +	err = gem_context_register(ctx, file->driver_priv);
> +	if (err)
> +		goto err_ctx;
> +
> +	return ctx;
> +
> +err_ctx:
> +	i915_gem_context_put(ctx);
> +	return ERR_PTR(err);

Never too early for onion unwind, yes? :)

>   }
>   
>   struct i915_gem_context *
> 

Looks good. Cleanup of gem_context_register to use locals would make me 
completely happy.

Regards,

Tvrtko

* Re: [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr
  2019-03-18  9:51 ` [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr Chris Wilson
@ 2019-03-18 16:28   ` Tvrtko Ursulin
  2019-03-18 16:35     ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 16:28 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 09:51, Chris Wilson wrote:
> Define a mutex for the exclusive use of interacting with the per-file
> context-idr, that was previously guarded by struct_mutex. This allows us
> to reduce the coverage of struct_mutex, with a view to removing the last
> bits coordinating GEM context later. (In the short term, we avoid taking
> struct_mutex while using the extended constructor functions, preventing
> some nasty recursion.)
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         |  2 ++
>   drivers/gpu/drm/i915/i915_gem_context.c | 43 +++++++++++--------------
>   2 files changed, 21 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 86080a6e0f45..90389333dd47 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -216,7 +216,9 @@ struct drm_i915_file_private {
>    */
>   #define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20)
>   	} mm;
> +
>   	struct idr context_idr;
> +	struct mutex context_lock; /* guards context_idr */

context_idr_lock then?

>   
>   	unsigned int bsd_engine;
>   
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 5df3d423ec6c..94c466d4b29e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -579,9 +579,7 @@ void i915_gem_contexts_fini(struct drm_i915_private *i915)
>   
>   static int context_idr_cleanup(int id, void *p, void *data)
>   {
> -	struct i915_gem_context *ctx = p;
> -
> -	context_close(ctx);
> +	context_close(p);
>   	return 0;
>   }
>   
> @@ -596,8 +594,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
>   		ctx->ppgtt->vm.file = fpriv;
>   
>   	/* And (nearly) finally expose ourselves to userspace via the idr */
> +	mutex_lock(&fpriv->context_lock);
>   	ret = idr_alloc(&fpriv->context_idr, ctx,
>   			DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
> +	mutex_unlock(&fpriv->context_lock);
>   	if (ret < 0)
>   		goto err_pid;
>   
> @@ -616,7 +616,9 @@ static int gem_context_register(struct i915_gem_context *ctx,
>   	return 0;
>   
>   err_idr:
> +	mutex_lock(&fpriv->context_lock);
>   	idr_remove(&fpriv->context_idr, ctx->user_handle);
> +	mutex_unlock(&fpriv->context_lock);
>   	ctx->file_priv = NULL;
>   err_pid:
>   	put_pid(ctx->pid);
> @@ -632,10 +634,11 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>   	int err;
>   
>   	idr_init(&file_priv->context_idr);
> +	mutex_init(&file_priv->context_lock);
>   
>   	mutex_lock(&i915->drm.struct_mutex);
> -
>   	ctx = i915_gem_create_context(i915);
> +	mutex_unlock(&i915->drm.struct_mutex);
>   	if (IS_ERR(ctx)) {
>   		err = PTR_ERR(ctx);
>   		goto err;
> @@ -648,14 +651,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>   	GEM_BUG_ON(ctx->user_handle != DEFAULT_CONTEXT_HANDLE);
>   	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
>   
> -	mutex_unlock(&i915->drm.struct_mutex);
> -
>   	return 0;
>   
>   err_ctx:
> +	mutex_lock(&i915->drm.struct_mutex);
>   	context_close(ctx);
> -err:
>   	mutex_unlock(&i915->drm.struct_mutex);
> +err:
> +	mutex_destroy(&file_priv->context_lock);
>   	idr_destroy(&file_priv->context_idr);
>   	return PTR_ERR(ctx);
>   }
> @@ -668,6 +671,7 @@ void i915_gem_context_close(struct drm_file *file)
>   
>   	idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
>   	idr_destroy(&file_priv->context_idr);
> +	mutex_destroy(&file_priv->context_lock);
>   }
>   
>   static struct i915_request *
> @@ -850,25 +854,22 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>   		return ret;
>   
>   	ctx = i915_gem_create_context(i915);
> -	if (IS_ERR(ctx)) {
> -		ret = PTR_ERR(ctx);
> -		goto err_unlock;
> -	}
> +	mutex_unlock(&dev->struct_mutex);
> +	if (IS_ERR(ctx))
> +		return PTR_ERR(ctx);
>   
>   	ret = gem_context_register(ctx, file_priv);
>   	if (ret)
>   		goto err_ctx;
>   
> -	mutex_unlock(&dev->struct_mutex);
> -
>   	args->ctx_id = ctx->user_handle;
>   	DRM_DEBUG("HW context %d created\n", args->ctx_id);
>   
>   	return 0;
>   
>   err_ctx:
> +	mutex_lock(&dev->struct_mutex);
>   	context_close(ctx);
> -err_unlock:
>   	mutex_unlock(&dev->struct_mutex);
>   	return ret;
>   }
> @@ -879,7 +880,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>   	struct drm_i915_gem_context_destroy *args = data;
>   	struct drm_i915_file_private *file_priv = file->driver_priv;
>   	struct i915_gem_context *ctx;
> -	int ret;
>   
>   	if (args->pad != 0)
>   		return -EINVAL;
> @@ -887,21 +887,16 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>   	if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
>   		return -ENOENT;
>   
> -	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> +	mutex_lock(&file_priv->context_lock);
> +	ctx = idr_remove(&file_priv->context_idr, args->ctx_id);
> +	mutex_unlock(&file_priv->context_lock);
>   	if (!ctx)
>   		return -ENOENT;
>   
> -	ret = mutex_lock_interruptible(&dev->struct_mutex);
> -	if (ret)
> -		goto out;
> -
> -	idr_remove(&file_priv->context_idr, ctx->user_handle);
> +	mutex_lock(&dev->struct_mutex);

I'd keep this one interruptible. Hm, bummer, there were more of them before...

I mean, for the new mutex we can probably get away with not bothering, 
since it guards so little; but for struct mutex, since you are touching 
those lines anyway, what do you think about making it interruptible in 
all ioctl paths?

>   	context_close(ctx);
> -
>   	mutex_unlock(&dev->struct_mutex);
>   
> -out:
> -	i915_gem_context_put(ctx);
>   	return 0;
>   }
>   
> 

Regards,

Tvrtko

* Re: [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace
  2019-03-18 16:22   ` Tvrtko Ursulin
@ 2019-03-18 16:30     ` Chris Wilson
  2019-03-18 16:32       ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 16:30 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 16:22:12)
> 
> On 18/03/2019 09:51, Chris Wilson wrote:
> > +static int gem_context_register(struct i915_gem_context *ctx,
> > +                             struct drm_i915_file_private *fpriv)
> > +{
> > +     int ret;
> > +
> 
> Assert struct mutex for now? Without it userspace can still see a not
> fully initialized ctx. It is kind of two arguments, but good for
> documentation nevertheless, I think.

The goal is to fix that now. And we can't hold struct_mutex across the
extensions, as we want to wrap ctx_setparam, which expects to be able to
take struct_mutex. So it has to be registered outside of struct_mutex in
this or the next patch.

> > +     ctx->pid = get_task_pid(current, PIDTYPE_PID);
> > +
> > +     if (ctx->ppgtt)
> > +             ctx->ppgtt->vm.file = fpriv;
> 
> Thinking about i915_vm_set_file(vm, fpriv), not sure.

Nah, longer term this needs a better fix since the ppgtt is shared. (I'm
just ignoring it for now, since they all must have the same fpriv.)

> > +
> > +     /* And (nearly) finally expose ourselves to userspace via the idr */
> > +     ret = idr_alloc(&fpriv->context_idr, ctx,
> > +                     DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
> > +     if (ret < 0)
> > +             goto err_pid;
> > +
> > +     ctx->file_priv = fpriv;
> > +     ctx->user_handle = ret;
> > +
> > +     ctx->name = kasprintf(GFP_KERNEL, "%s[%d]/%x",
> > +                           current->comm,
> > +                           pid_nr(ctx->pid),
> > +                           ctx->user_handle);
> > +     if (!ctx->name) {
> > +             ret = -ENOMEM;
> > +             goto err_idr;
> > +     }
> > +
> > +     return 0;
> > +
> > +err_idr:
> > +     idr_remove(&fpriv->context_idr, ctx->user_handle);
> > +     ctx->file_priv = NULL;
> > +err_pid:
> > +     put_pid(ctx->pid);
> > +     ctx->pid = NULL;
> 
> To avoid this, and in the spirit of the patch, I think it would be 
> better to just store everything in locals and assign to ctx members once 
> passed the point of failure.

Ah. Yes. That should solve the problem of that idr_alloc() not being
last as it needs to be.
-Chris

* Re: [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace
  2019-03-18 16:30     ` Chris Wilson
@ 2019-03-18 16:32       ` Chris Wilson
  2019-03-18 16:46         ` Tvrtko Ursulin
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 16:32 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2019-03-18 16:30:58)
> Quoting Tvrtko Ursulin (2019-03-18 16:22:12)
> > 
> > On 18/03/2019 09:51, Chris Wilson wrote:
> > > +static int gem_context_register(struct i915_gem_context *ctx,
> > > +                             struct drm_i915_file_private *fpriv)
> > > +{
> > > +     int ret;
> > > +
> > 
> > Assert struct mutex for now? Without it userspace can still see a not
> > fully initialized ctx. It is kind of two arguments, but good for
> > documentation nevertheless, I think.
> 
> The goal is to fix that now. And we can't hold struct_mutex across the
> extensions, as we want to wrap ctx_setparam, which expects to be able to
> take struct_mutex. So it has to be registered outside of struct_mutex in
> this or the next patch.

Waitasec... It is the very next patch that drops the struct_mutex. (I
was thinking it was removed in this patch.)
-Chris

* Re: [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr
  2019-03-18 16:28   ` Tvrtko Ursulin
@ 2019-03-18 16:35     ` Chris Wilson
  2019-03-18 16:45       ` Tvrtko Ursulin
  0 siblings, 1 reply; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 16:35 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 16:28:35)
> 
> On 18/03/2019 09:51, Chris Wilson wrote:
> > Define a mutex for the exclusive use of interacting with the per-file
> > context-idr, that was previously guarded by struct_mutex. This allows us
> > to reduce the coverage of struct_mutex, with a view to removing the last
> > bits coordinating GEM context later. (In the short term, we avoid taking
> > struct_mutex while using the extended constructor functions, preventing
> > some nasty recursion.)
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.h         |  2 ++
> >   drivers/gpu/drm/i915/i915_gem_context.c | 43 +++++++++++--------------
> >   2 files changed, 21 insertions(+), 24 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 86080a6e0f45..90389333dd47 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -216,7 +216,9 @@ struct drm_i915_file_private {
> >    */
> >   #define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20)
> >       } mm;
> > +
> >       struct idr context_idr;
> > +     struct mutex context_lock; /* guards context_idr */
> 
> context_idr_lock then?
> 
> >   
> >       unsigned int bsd_engine;
> >   
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 5df3d423ec6c..94c466d4b29e 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -579,9 +579,7 @@ void i915_gem_contexts_fini(struct drm_i915_private *i915)
> >   
> >   static int context_idr_cleanup(int id, void *p, void *data)
> >   {
> > -     struct i915_gem_context *ctx = p;
> > -
> > -     context_close(ctx);
> > +     context_close(p);
> >       return 0;
> >   }
> >   
> > @@ -596,8 +594,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
> >               ctx->ppgtt->vm.file = fpriv;
> >   
> >       /* And (nearly) finally expose ourselves to userspace via the idr */
> > +     mutex_lock(&fpriv->context_lock);
> >       ret = idr_alloc(&fpriv->context_idr, ctx,
> >                       DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
> > +     mutex_unlock(&fpriv->context_lock);
> >       if (ret < 0)
> >               goto err_pid;
> >   
> > @@ -616,7 +616,9 @@ static int gem_context_register(struct i915_gem_context *ctx,
> >       return 0;
> >   
> >   err_idr:
> > +     mutex_lock(&fpriv->context_lock);
> >       idr_remove(&fpriv->context_idr, ctx->user_handle);
> > +     mutex_unlock(&fpriv->context_lock);
> >       ctx->file_priv = NULL;
> >   err_pid:
> >       put_pid(ctx->pid);
> > @@ -632,10 +634,11 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >       int err;
> >   
> >       idr_init(&file_priv->context_idr);
> > +     mutex_init(&file_priv->context_lock);
> >   
> >       mutex_lock(&i915->drm.struct_mutex);
> > -
> >       ctx = i915_gem_create_context(i915);
> > +     mutex_unlock(&i915->drm.struct_mutex);
> >       if (IS_ERR(ctx)) {
> >               err = PTR_ERR(ctx);
> >               goto err;
> > @@ -648,14 +651,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >       GEM_BUG_ON(ctx->user_handle != DEFAULT_CONTEXT_HANDLE);
> >       GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
> >   
> > -     mutex_unlock(&i915->drm.struct_mutex);
> > -
> >       return 0;
> >   
> >   err_ctx:
> > +     mutex_lock(&i915->drm.struct_mutex);
> >       context_close(ctx);
> > -err:
> >       mutex_unlock(&i915->drm.struct_mutex);
> > +err:
> > +     mutex_destroy(&file_priv->context_lock);
> >       idr_destroy(&file_priv->context_idr);
> >       return PTR_ERR(ctx);
> >   }
> > @@ -668,6 +671,7 @@ void i915_gem_context_close(struct drm_file *file)
> >   
> >       idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
> >       idr_destroy(&file_priv->context_idr);
> > +     mutex_destroy(&file_priv->context_lock);
> >   }
> >   
> >   static struct i915_request *
> > @@ -850,25 +854,22 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >               return ret;
> >   
> >       ctx = i915_gem_create_context(i915);
> > -     if (IS_ERR(ctx)) {
> > -             ret = PTR_ERR(ctx);
> > -             goto err_unlock;
> > -     }
> > +     mutex_unlock(&dev->struct_mutex);
> > +     if (IS_ERR(ctx))
> > +             return PTR_ERR(ctx);
> >   
> >       ret = gem_context_register(ctx, file_priv);
> >       if (ret)
> >               goto err_ctx;
> >   
> > -     mutex_unlock(&dev->struct_mutex);
> > -
> >       args->ctx_id = ctx->user_handle;
> >       DRM_DEBUG("HW context %d created\n", args->ctx_id);
> >   
> >       return 0;
> >   
> >   err_ctx:
> > +     mutex_lock(&dev->struct_mutex);
> >       context_close(ctx);
> > -err_unlock:
> >       mutex_unlock(&dev->struct_mutex);
> >       return ret;
> >   }
> > @@ -879,7 +880,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >       struct drm_i915_gem_context_destroy *args = data;
> >       struct drm_i915_file_private *file_priv = file->driver_priv;
> >       struct i915_gem_context *ctx;
> > -     int ret;
> >   
> >       if (args->pad != 0)
> >               return -EINVAL;
> > @@ -887,21 +887,16 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >       if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
> >               return -ENOENT;
> >   
> > -     ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> > +     mutex_lock(&file_priv->context_lock);
> > +     ctx = idr_remove(&file_priv->context_idr, args->ctx_id);
> > +     mutex_unlock(&file_priv->context_lock);
> >       if (!ctx)
> >               return -ENOENT;
> >   
> > -     ret = mutex_lock_interruptible(&dev->struct_mutex);
> > -     if (ret)
> > -             goto out;
> > -
> > -     idr_remove(&file_priv->context_idr, ctx->user_handle);
> > +     mutex_lock(&dev->struct_mutex);
> 
> I'd keep this one interruptible. Hm, bummer, there were more of them before...

At this point, interrupt handling becomes problematic, as we have to
then re-insert the ctx_id into the idr and that may have already been
claimed elsewhere.
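
That is, the unwind would need to be something like (sketch only):

	if (mutex_lock_interruptible(&dev->struct_mutex)) {
		/* hand the id back so that userspace may retry... */
		mutex_lock(&file_priv->context_lock);
		err = idr_alloc(&file_priv->context_idr, ctx,
				args->ctx_id, args->ctx_id + 1,
				GFP_KERNEL);
		mutex_unlock(&file_priv->context_lock);
		/* ...except the id may have been reallocated meanwhile */
		return err < 0 ? err : -EINTR;
	}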
 
> I mean, for the new mutex we can probably get away with not bothering,
> since it guards so little; but for struct mutex, since you are touching
> those lines anyway, what do you think about making it interruptible in
> all ioctl paths?

Not practical imo; but we won't need struct_mutex here in about another 20
patches :)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr
  2019-03-18 16:35     ` Chris Wilson
@ 2019-03-18 16:45       ` Tvrtko Ursulin
  2019-03-18 21:10         ` Chris Wilson
  0 siblings, 1 reply; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 16:45 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 16:35, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2019-03-18 16:28:35)
>>
>> On 18/03/2019 09:51, Chris Wilson wrote:
>>> Define a mutex for the exclusive use of interacting with the per-file
>>> context-idr, that was previously guarded by struct_mutex. This allows us
>>> to reduce the coverage of struct_mutex, with a view to removing the last
>>> bits coordinating GEM context later. (In the short term, we avoid taking
>>> struct_mutex while using the extended constructor functions, preventing
>>> some nasty recursion.)
>>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/i915_drv.h         |  2 ++
>>>    drivers/gpu/drm/i915/i915_gem_context.c | 43 +++++++++++--------------
>>>    2 files changed, 21 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 86080a6e0f45..90389333dd47 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -216,7 +216,9 @@ struct drm_i915_file_private {
>>>     */
>>>    #define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20)
>>>        } mm;
>>> +
>>>        struct idr context_idr;
>>> +     struct mutex context_lock; /* guards context_idr */
>>
>> context_idr_lock then?
>>
>>>    
>>>        unsigned int bsd_engine;
>>>    
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>>> index 5df3d423ec6c..94c466d4b29e 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>>> @@ -579,9 +579,7 @@ void i915_gem_contexts_fini(struct drm_i915_private *i915)
>>>    
>>>    static int context_idr_cleanup(int id, void *p, void *data)
>>>    {
>>> -     struct i915_gem_context *ctx = p;
>>> -
>>> -     context_close(ctx);
>>> +     context_close(p);
>>>        return 0;
>>>    }
>>>    
>>> @@ -596,8 +594,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
>>>                ctx->ppgtt->vm.file = fpriv;
>>>    
>>>        /* And (nearly) finally expose ourselves to userspace via the idr */
>>> +     mutex_lock(&fpriv->context_lock);
>>>        ret = idr_alloc(&fpriv->context_idr, ctx,
>>>                        DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
>>> +     mutex_unlock(&fpriv->context_lock);
>>>        if (ret < 0)
>>>                goto err_pid;
>>>    
>>> @@ -616,7 +616,9 @@ static int gem_context_register(struct i915_gem_context *ctx,
>>>        return 0;
>>>    
>>>    err_idr:
>>> +     mutex_lock(&fpriv->context_lock);
>>>        idr_remove(&fpriv->context_idr, ctx->user_handle);
>>> +     mutex_unlock(&fpriv->context_lock);
>>>        ctx->file_priv = NULL;
>>>    err_pid:
>>>        put_pid(ctx->pid);
>>> @@ -632,10 +634,11 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>>>        int err;
>>>    
>>>        idr_init(&file_priv->context_idr);
>>> +     mutex_init(&file_priv->context_lock);
>>>    
>>>        mutex_lock(&i915->drm.struct_mutex);
>>> -
>>>        ctx = i915_gem_create_context(i915);
>>> +     mutex_unlock(&i915->drm.struct_mutex);
>>>        if (IS_ERR(ctx)) {
>>>                err = PTR_ERR(ctx);
>>>                goto err;
>>> @@ -648,14 +651,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
>>>        GEM_BUG_ON(ctx->user_handle != DEFAULT_CONTEXT_HANDLE);
>>>        GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
>>>    
>>> -     mutex_unlock(&i915->drm.struct_mutex);
>>> -
>>>        return 0;
>>>    
>>>    err_ctx:
>>> +     mutex_lock(&i915->drm.struct_mutex);
>>>        context_close(ctx);
>>> -err:
>>>        mutex_unlock(&i915->drm.struct_mutex);
>>> +err:
>>> +     mutex_destroy(&file_priv->context_lock);
>>>        idr_destroy(&file_priv->context_idr);
>>>        return PTR_ERR(ctx);
>>>    }
>>> @@ -668,6 +671,7 @@ void i915_gem_context_close(struct drm_file *file)
>>>    
>>>        idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
>>>        idr_destroy(&file_priv->context_idr);
>>> +     mutex_destroy(&file_priv->context_lock);
>>>    }
>>>    
>>>    static struct i915_request *
>>> @@ -850,25 +854,22 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
>>>                return ret;
>>>    
>>>        ctx = i915_gem_create_context(i915);
>>> -     if (IS_ERR(ctx)) {
>>> -             ret = PTR_ERR(ctx);
>>> -             goto err_unlock;
>>> -     }
>>> +     mutex_unlock(&dev->struct_mutex);
>>> +     if (IS_ERR(ctx))
>>> +             return PTR_ERR(ctx);
>>>    
>>>        ret = gem_context_register(ctx, file_priv);
>>>        if (ret)
>>>                goto err_ctx;
>>>    
>>> -     mutex_unlock(&dev->struct_mutex);
>>> -
>>>        args->ctx_id = ctx->user_handle;
>>>        DRM_DEBUG("HW context %d created\n", args->ctx_id);
>>>    
>>>        return 0;
>>>    
>>>    err_ctx:
>>> +     mutex_lock(&dev->struct_mutex);
>>>        context_close(ctx);
>>> -err_unlock:
>>>        mutex_unlock(&dev->struct_mutex);
>>>        return ret;
>>>    }
>>> @@ -879,7 +880,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>>>        struct drm_i915_gem_context_destroy *args = data;
>>>        struct drm_i915_file_private *file_priv = file->driver_priv;
>>>        struct i915_gem_context *ctx;
>>> -     int ret;
>>>    
>>>        if (args->pad != 0)
>>>                return -EINVAL;
>>> @@ -887,21 +887,16 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
>>>        if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
>>>                return -ENOENT;
>>>    
>>> -     ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
>>> +     mutex_lock(&file_priv->context_lock);
>>> +     ctx = idr_remove(&file_priv->context_idr, args->ctx_id);
>>> +     mutex_unlock(&file_priv->context_lock);
>>>        if (!ctx)
>>>                return -ENOENT;
>>>    
>>> -     ret = mutex_lock_interruptible(&dev->struct_mutex);
>>> -     if (ret)
>>> -             goto out;
>>> -
>>> -     idr_remove(&file_priv->context_idr, ctx->user_handle);
>>> +     mutex_lock(&dev->struct_mutex);
>>
>> I'd keep this one interruptible. Hm bummer, there were more of them before..
> 
> At this point, interrupt handling becomes problematic, as we have to
> then re-insert the ctx_id into the idr and that may have already been
> claimed elsewhere.

Ugh, bad.. Can we have struct_mutex nest under the context_idr_lock?

Regards,

Tvrtko

>> I mean for the new mutex we can probably get away without bothering, since
>> it guards so little, but for struct_mutex, since you are touching those
>> lines anyway, what do you think about making it interruptible in all ioctl
>> paths?
> 
> Not practical imo; but we won't need struct_mutex here in about another 20
> patches :)
> -Chris
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace
  2019-03-18 16:32       ` Chris Wilson
@ 2019-03-18 16:46         ` Tvrtko Ursulin
  0 siblings, 0 replies; 53+ messages in thread
From: Tvrtko Ursulin @ 2019-03-18 16:46 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 18/03/2019 16:32, Chris Wilson wrote:
> Quoting Chris Wilson (2019-03-18 16:30:58)
>> Quoting Tvrtko Ursulin (2019-03-18 16:22:12)
>>>
>>> On 18/03/2019 09:51, Chris Wilson wrote:
>>>> +static int gem_context_register(struct i915_gem_context *ctx,
>>>> +                             struct drm_i915_file_private *fpriv)
>>>> +{
>>>> +     int ret;
>>>> +
>>>
>>> Assert struct_mutex for now? Without it userspace can still see a not
>>> yet fully initialized ctx. It is kind of two arguments in one, but good
>>> for documentation nevertheless, I think.
>>
>> The goal is that we need to fix that now. And we can't hold struct_mutex
>> across the extensions, as we want to wrap ctx_setparam which expects to
>> be able to take struct_mutex. So it has to be registered outside of
>> struct_mutex in this or the next patch.
> 
> Waitasec... It is the very next patch that drops the struct_mutex. (I
> was thinking it was removed in this patch.)

So in the next patch you update the assert to ctx_idr_lock. Too evil? :))
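
Roughly (just a sketch, using the context_idr_lock name from my earlier
comment; nothing like this is in the series as posted):

	/* patch 10, at the top of gem_context_register(): */
	lockdep_assert_held(&ctx->i915->drm.struct_mutex);

	/* patch 11, once the idr gets its own lock: */
	lockdep_assert_held(&fpriv->context_idr_lock);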

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 53+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915: Flush pages on acquisition
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (20 preceding siblings ...)
  2019-03-18  9:52 ` [PATCH 22/22] drm/i915: Allow specification of parallel execbuf Chris Wilson
@ 2019-03-18 17:10 ` Patchwork
  2019-03-18 17:20 ` ✗ Fi.CI.SPARSE: " Patchwork
  2019-03-18 17:36 ` ✗ Fi.CI.BAT: failure " Patchwork
  23 siblings, 0 replies; 53+ messages in thread
From: Patchwork @ 2019-03-18 17:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915: Flush pages on acquisition
URL   : https://patchwork.freedesktop.org/series/58122/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
db0c17ac314c drm/i915: Flush pages on acquisition
2f0c66c9e084 drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
-:372: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#372: 
new file mode 100644

-:377: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#377: FILE: drivers/gpu/drm/i915/i915_scheduler_types.h:1:
+/*

-:378: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#378: FILE: drivers/gpu/drm/i915/i915_scheduler_types.h:2:
+ * SPDX-License-Identifier: MIT

-:552: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#552: FILE: drivers/gpu/drm/i915/intel_engine_types.h:37:
+#define INIT_ALL_ENGINES(x) (x) = (intel_engine_mask_t)(ALL_ENGINES)

-:598: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#598: FILE: drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c:1:
+/*

-:599: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#599: FILE: drivers/gpu/drm/i915/test_i915_scheduler_types_standalone.c:2:
+ * SPDX-License-Identifier: MIT

total: 1 errors, 5 warnings, 0 checks, 482 lines checked
760f0e1843fb drm/i915: Hold a ref to the ring while retiring
8e76f5ceca89 drm/i915: Lock the gem_context->active_list while dropping the link
9660fb957595 drm/i915: Hold a reference to the active HW context
81e4c86f7858 drm/i915/selftests: Provide stub reset functions
77e9c6071ae2 drm/i915: Switch to use HWS indices rather than addresses
7c8295cb5045 drm/i915: Separate GEM context construction and registration to userspace
6c22100b87d7 drm/i915: Introduce a mutex for file_priv->context_idr
3efbc1e0b934 drm/i915: Introduce the i915_user_extension_method
-:72: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#72: 
new file mode 100644

-:77: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#77: FILE: drivers/gpu/drm/i915/i915_user_extensions.c:1:
+/*

-:78: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#78: FILE: drivers/gpu/drm/i915/i915_user_extensions.c:2:
+ * SPDX-License-Identifier: MIT

-:144: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#144: FILE: drivers/gpu/drm/i915/i915_user_extensions.h:1:
+/*

-:145: WARNING:SPDX_LICENSE_TAG: Misplaced SPDX-License-Identifier tag - use line 1 instead
#145: FILE: drivers/gpu/drm/i915/i915_user_extensions.h:2:
+ * SPDX-License-Identifier: MIT

-:178: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'ptr' - possible side-effects?
#178: FILE: drivers/gpu/drm/i915/i915_utils.h:114:
+#define container_of_user(ptr, type, member) ({				\
+	void __user *__mptr = (void __user *)(ptr);			\
+	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
+			 !__same_type(*(ptr), void),			\
+			 "pointer type mismatch in container_of()");	\
+	((type __user *)(__mptr - offsetof(type, member))); })

-:178: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible side-effects?
#178: FILE: drivers/gpu/drm/i915/i915_utils.h:114:
+#define container_of_user(ptr, type, member) ({				\
+	void __user *__mptr = (void __user *)(ptr);			\
+	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
+			 !__same_type(*(ptr), void),			\
+			 "pointer type mismatch in container_of()");	\
+	((type __user *)(__mptr - offsetof(type, member))); })

-:178: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'member' may be better as '(member)' to avoid precedence issues
#178: FILE: drivers/gpu/drm/i915/i915_utils.h:114:
+#define container_of_user(ptr, type, member) ({				\
+	void __user *__mptr = (void __user *)(ptr);			\
+	BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) &&	\
+			 !__same_type(*(ptr), void),			\
+			 "pointer type mismatch in container_of()");	\
+	((type __user *)(__mptr - offsetof(type, member))); })

-:198: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'U' - possible side-effects?
#198: FILE: drivers/gpu/drm/i915/i915_utils.h:134:
+#define check_user_mbz(U) ({						\
+	typeof(*(U)) mbz__;						\
+	get_user(mbz__, (U)) ? -EFAULT : mbz__ ? -EINVAL : 0;		\
+})

total: 0 errors, 5 warnings, 4 checks, 153 lines checked
1a5d1344d0c2 drm/i915: Create/destroy VM (ppGTT) for use with contexts
-:48: CHECK:UNCOMMENTED_DEFINITION: struct mutex definition without comment
#48: FILE: drivers/gpu/drm/i915/i915_drv.h:223:
+	struct mutex vm_lock;

-:699: WARNING:LINE_SPACING: Missing a blank line after declarations
#699: FILE: drivers/gpu/drm/i915/selftests/i915_gem_context.c:504:
+		struct drm_file *file;
+		IGT_TIMEOUT(end_time);

-:761: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#761: FILE: drivers/gpu/drm/i915/selftests/i915_gem_context.c:566:
+		ncontexts = dw = 0;

-:836: WARNING:LINE_SPACING: Missing a blank line after declarations
#836: FILE: drivers/gpu/drm/i915/selftests/i915_gem_context.c:634:
+		struct i915_gem_context *ctx = NULL;
+		IGT_TIMEOUT(end_time);

-:912: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#912: FILE: drivers/gpu/drm/i915/selftests/i915_gem_context.c:693:
+		ncontexts = dw = 0;

-:1067: WARNING:LONG_LINE: line over 100 characters
#1067: FILE: include/uapi/drm/i915_drm.h:407:
+#define DRM_IOCTL_I915_GEM_VM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_VM_CREATE, struct drm_i915_gem_vm_control)

-:1068: WARNING:LONG_LINE: line over 100 characters
#1068: FILE: include/uapi/drm/i915_drm.h:408:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

-:1068: WARNING:SPACING: space prohibited between function name and open parenthesis '('
#1068: FILE: include/uapi/drm/i915_drm.h:408:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

-:1068: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#1068: FILE: include/uapi/drm/i915_drm.h:408:
+#define DRM_IOCTL_I915_GEM_VM_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_VM_DESTROY, struct drm_i915_gem_vm_control)

total: 1 errors, 5 warnings, 3 checks, 1009 lines checked
c81c021e9a0a drm/i915: Extend CONTEXT_CREATE to set parameters upon construction
-:28: WARNING:LONG_LINE: line over 100 characters
#28: FILE: drivers/gpu/drm/i915/i915_drv.c:3113:
+	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE_EXT, i915_gem_context_create_ioctl, DRM_RENDER_ALLOW),

-:543: WARNING:LONG_LINE: line over 100 characters
#543: FILE: include/uapi/drm/i915_drm.h:397:
+#define DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create_ext)

-:543: WARNING:SPACING: space prohibited between function name and open parenthesis '('
#543: FILE: include/uapi/drm/i915_drm.h:397:
+#define DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create_ext)

-:543: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#543: FILE: include/uapi/drm/i915_drm.h:397:
+#define DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create_ext)

total: 1 errors, 3 warnings, 0 checks, 703 lines checked
b488cdbdc9b1 drm/i915: Allow contexts to share a single timeline across all engines
af709feee39c drm/i915: Allow userspace to clone contexts on creation
-:132: ERROR:BRACKET_SPACE: space prohibited before open square bracket '['
#132: FILE: drivers/gpu/drm/i915/i915_gem_context.c:1623:
+#define MAP(x, y) [ilog2(I915_CONTEXT_CLONE_##x)] = y

total: 1 errors, 0 warnings, 0 checks, 182 lines checked
58a9ba87fabc drm/i915: Allow a context to define its set of engines
-:482: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#482: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

-:482: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'member' - possible side-effects?
#482: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

-:482: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'member' may be better as '(member)' to avoid precedence issues
#482: FILE: drivers/gpu/drm/i915/i915_utils.h:107:
+#define check_struct_size(p, member, n, sz) \
+	likely(__check_struct_size(sizeof(*(p)), \
+				   sizeof(*(p)->member) + __must_be_array((p)->member), \
+				   n, sz))

total: 0 errors, 0 warnings, 3 checks, 490 lines checked
caf36e6e753f drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[]
0709f6f09616 drm/i915: Load balancing across a virtual engine
-:958: WARNING:LINE_SPACING: Missing a blank line after declarations
#958: FILE: drivers/gpu/drm/i915/intel_lrc.c:3390:
+		struct intel_engine_cs *actual = ve->siblings[0];
+		intel_context_put(&ve->context);

total: 0 errors, 1 warnings, 0 checks, 1159 lines checked
8a4e035b1017 drm/i915: Extend execution fence to support a callback
f5b2cb4a5233 drm/i915/execlists: Virtual engine bonding
a218c6d2e690 drm/i915: Allow specification of parallel execbuf

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 53+ messages in thread

* ✗ Fi.CI.SPARSE: warning for series starting with [01/22] drm/i915: Flush pages on acquisition
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (21 preceding siblings ...)
  2019-03-18 17:10 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915: Flush pages on acquisition Patchwork
@ 2019-03-18 17:20 ` Patchwork
  2019-03-18 17:36 ` ✗ Fi.CI.BAT: failure " Patchwork
  23 siblings, 0 replies; 53+ messages in thread
From: Patchwork @ 2019-03-18 17:20 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915: Flush pages on acquisition
URL   : https://patchwork.freedesktop.org/series/58122/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Flush pages on acquisition
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3558:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3566:16: warning: expression using sizeof(void)

Commit: drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3566:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3565:16: warning: expression using sizeof(void)

Commit: drm/i915: Hold a ref to the ring while retiring
Okay!

Commit: drm/i915: Lock the gem_context->active_list while dropping the link
Okay!

Commit: drm/i915: Hold a reference to the active HW context
Okay!

Commit: drm/i915/selftests: Provide stub reset functions
Okay!

Commit: drm/i915: Switch to use HWS indices rather than addresses
Okay!

Commit: drm/i915: Separate GEM context construction and registration to userspace
Okay!

Commit: drm/i915: Introduce a mutex for file_priv->context_idr
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3565:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3567:16: warning: expression using sizeof(void)

Commit: drm/i915: Introduce the i915_user_extension_method
Okay!

Commit: drm/i915: Create/destroy VM (ppGTT) for use with contexts
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3567:16: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/selftests/i915_gem_context.c:1132:25: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/selftests/i915_gem_context.c:1132:25: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3570:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_gem_context.c:1265:25: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_gem_context.c:1265:25: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/selftests/i915_gem_context.c:565:25: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/selftests/i915_gem_context.c:565:25: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_gem_context.c:569:33: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_gem_context.c:569:33: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_gem_context.c:696:33: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/i915_gem_context.c:696:33: warning: expression using sizeof(void)

Commit: drm/i915: Extend CONTEXT_CREATE to set parameters upon construction
Okay!

Commit: drm/i915: Allow contexts to share a single timeline across all engines
Okay!

Commit: drm/i915: Allow userspace to clone contexts on creation
+drivers/gpu/drm/i915/i915_gem_context.c:1624:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1625:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1626:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1627:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1628:17: error: bad integer constant expression
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1265:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:1265:25: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:452:16: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:569:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:569:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:696:33: warning: expression using sizeof(void)
-drivers/gpu/drm/i915/selftests/i915_gem_context.c:696:33: warning: expression using sizeof(void)
-./include/linux/slab.h:664:13: warning: call with no type!

Commit: drm/i915: Allow a context to define its set of engines
-O:drivers/gpu/drm/i915/i915_gem_context.c:1624:17: error: bad integer constant expression
-O:drivers/gpu/drm/i915/i915_gem_context.c:1625:17: error: bad integer constant expression
-O:drivers/gpu/drm/i915/i915_gem_context.c:1626:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1833:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1834:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1835:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_gem_context.c:1836:17: error: bad integer constant expression
+drivers/gpu/drm/i915/i915_utils.h:84:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:84:13: error: undefined identifier '__builtin_mul_overflow'
+drivers/gpu/drm/i915/i915_utils.h:84:13:    got void
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: incorrect type in conditional
+drivers/gpu/drm/i915/i915_utils.h:87:13: error: undefined identifier '__builtin_add_overflow'
+drivers/gpu/drm/i915/i915_utils.h:87:13:    got void
+./include/linux/slab.h:664:13: error: not a function <noident>

Commit: drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[]
Okay!

Commit: drm/i915: Load balancing across a virtual engine
+./include/linux/overflow.h:285:13: error: incorrect type in conditional
+./include/linux/overflow.h:285:13: error: undefined identifier '__builtin_mul_overflow'
+./include/linux/overflow.h:285:13:    got void
+./include/linux/overflow.h:285:13: warning: call with no type!
+./include/linux/overflow.h:287:13: error: incorrect type in conditional
+./include/linux/overflow.h:287:13: error: undefined identifier '__builtin_add_overflow'
+./include/linux/overflow.h:287:13:    got void
+./include/linux/overflow.h:287:13: warning: call with no type!

Commit: drm/i915: Extend execution fence to support a callback
Okay!

Commit: drm/i915/execlists: Virtual engine bonding
Okay!

Commit: drm/i915: Allow specification of parallel execbuf
Okay!

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 53+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [01/22] drm/i915: Flush pages on acquisition
  2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
                   ` (22 preceding siblings ...)
  2019-03-18 17:20 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-03-18 17:36 ` Patchwork
  23 siblings, 0 replies; 53+ messages in thread
From: Patchwork @ 2019-03-18 17:36 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/22] drm/i915: Flush pages on acquisition
URL   : https://patchwork.freedesktop.org/series/58122/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_5767 -> Patchwork_12496
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_12496 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_12496, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/58122/revisions/1/mbox/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_12496:

### IGT changes ###

#### Possible regressions ####

  * igt@i915_selftest@live_contexts:
    - fi-bdw-gvtdvm:      PASS -> DMESG-FAIL

  
Known issues
------------

  Here are the changes found in Patchwork_12496 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@kms_busy@basic-flip-a:
    - fi-gdg-551:         PASS -> FAIL [fdo#103182] +1

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-byt-clapper:     PASS -> FAIL [fdo#103191] / [fdo#107362]

  * igt@prime_vgem@basic-fence-flip:
    - fi-gdg-551:         PASS -> DMESG-FAIL [fdo#103182]

  
#### Warnings ####

  * igt@i915_selftest@live_contexts:
    - fi-icl-u3:          INCOMPLETE [fdo#108569] -> DMESG-FAIL [fdo#108569]

  
  [fdo#103182]: https://bugs.freedesktop.org/show_bug.cgi?id=103182
  [fdo#103191]: https://bugs.freedesktop.org/show_bug.cgi?id=103191
  [fdo#107362]: https://bugs.freedesktop.org/show_bug.cgi?id=107362
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569


Participating hosts (49 -> 41)
------------------------------

  Missing    (8): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-icl-y fi-bdw-samus 


Build changes
-------------

    * Linux: CI_DRM_5767 -> Patchwork_12496

  CI_DRM_5767: 289bd1852756ddd2779c32cd13ae10e7bf44faca @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4888: 71ad19eb8fe4f0eecae3bf063e107293b90b9abc @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12496: a218c6d2e690b4a5be54f8e59e10f74adc55b78c @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

a218c6d2e690 drm/i915: Allow specification of parallel execbuf
f5b2cb4a5233 drm/i915/execlists: Virtual engine bonding
8a4e035b1017 drm/i915: Extend execution fence to support a callback
0709f6f09616 drm/i915: Load balancing across a virtual engine
caf36e6e753f drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[]
58a9ba87fabc drm/i915: Allow a context to define its set of engines
af709feee39c drm/i915: Allow userspace to clone contexts on creation
b488cdbdc9b1 drm/i915: Allow contexts to share a single timeline across all engines
c81c021e9a0a drm/i915: Extend CONTEXT_CREATE to set parameters upon construction
1a5d1344d0c2 drm/i915: Create/destroy VM (ppGTT) for use with contexts
3efbc1e0b934 drm/i915: Introduce the i915_user_extension_method
6c22100b87d7 drm/i915: Introduce a mutex for file_priv->context_idr
7c8295cb5045 drm/i915: Separate GEM context construction and registration to userspace
77e9c6071ae2 drm/i915: Switch to use HWS indices rather than addresses
81e4c86f7858 drm/i915/selftests: Provide stub reset functions
9660fb957595 drm/i915: Hold a reference to the active HW context
8e76f5ceca89 drm/i915: Lock the gem_context->active_list while dropping the link
760f0e1843fb drm/i915: Hold a ref to the ring while retiring
2f0c66c9e084 drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h
db0c17ac314c drm/i915: Flush pages on acquisition

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12496/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr
  2019-03-18 16:45       ` Tvrtko Ursulin
@ 2019-03-18 21:10         ` Chris Wilson
  0 siblings, 0 replies; 53+ messages in thread
From: Chris Wilson @ 2019-03-18 21:10 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2019-03-18 16:45:41)
> 
> On 18/03/2019 16:35, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-03-18 16:28:35)
> >>
> >> On 18/03/2019 09:51, Chris Wilson wrote:
> >>> Define a mutex for the exclusive use of interacting with the per-file
> >>> context-idr, that was previously guarded by struct_mutex. This allows us
> >>> to reduce the coverage of struct_mutex, with a view to removing the last
> >>> bits coordinating GEM context later. (In the short term, we avoid taking
> >>> struct_mutex while using the extended constructor functions, preventing
> >>> some nasty recursion.)
> >>>
> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >>> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_drv.h         |  2 ++
> >>>    drivers/gpu/drm/i915/i915_gem_context.c | 43 +++++++++++--------------
> >>>    2 files changed, 21 insertions(+), 24 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> >>> index 86080a6e0f45..90389333dd47 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -216,7 +216,9 @@ struct drm_i915_file_private {
> >>>     */
> >>>    #define DRM_I915_THROTTLE_JIFFIES msecs_to_jiffies(20)
> >>>        } mm;
> >>> +
> >>>        struct idr context_idr;
> >>> +     struct mutex context_lock; /* guards context_idr */
> >>
> >> context_idr_lock then?
> >>
> >>>    
> >>>        unsigned int bsd_engine;
> >>>    
> >>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> >>> index 5df3d423ec6c..94c466d4b29e 100644
> >>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> >>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> >>> @@ -579,9 +579,7 @@ void i915_gem_contexts_fini(struct drm_i915_private *i915)
> >>>    
> >>>    static int context_idr_cleanup(int id, void *p, void *data)
> >>>    {
> >>> -     struct i915_gem_context *ctx = p;
> >>> -
> >>> -     context_close(ctx);
> >>> +     context_close(p);
> >>>        return 0;
> >>>    }
> >>>    
> >>> @@ -596,8 +594,10 @@ static int gem_context_register(struct i915_gem_context *ctx,
> >>>                ctx->ppgtt->vm.file = fpriv;
> >>>    
> >>>        /* And (nearly) finally expose ourselves to userspace via the idr */
> >>> +     mutex_lock(&fpriv->context_lock);
> >>>        ret = idr_alloc(&fpriv->context_idr, ctx,
> >>>                        DEFAULT_CONTEXT_HANDLE, 0, GFP_KERNEL);
> >>> +     mutex_unlock(&fpriv->context_lock);
> >>>        if (ret < 0)
> >>>                goto err_pid;
> >>>    
> >>> @@ -616,7 +616,9 @@ static int gem_context_register(struct i915_gem_context *ctx,
> >>>        return 0;
> >>>    
> >>>    err_idr:
> >>> +     mutex_lock(&fpriv->context_lock);
> >>>        idr_remove(&fpriv->context_idr, ctx->user_handle);
> >>> +     mutex_unlock(&fpriv->context_lock);
> >>>        ctx->file_priv = NULL;
> >>>    err_pid:
> >>>        put_pid(ctx->pid);
> >>> @@ -632,10 +634,11 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >>>        int err;
> >>>    
> >>>        idr_init(&file_priv->context_idr);
> >>> +     mutex_init(&file_priv->context_lock);
> >>>    
> >>>        mutex_lock(&i915->drm.struct_mutex);
> >>> -
> >>>        ctx = i915_gem_create_context(i915);
> >>> +     mutex_unlock(&i915->drm.struct_mutex);
> >>>        if (IS_ERR(ctx)) {
> >>>                err = PTR_ERR(ctx);
> >>>                goto err;
> >>> @@ -648,14 +651,14 @@ int i915_gem_context_open(struct drm_i915_private *i915,
> >>>        GEM_BUG_ON(ctx->user_handle != DEFAULT_CONTEXT_HANDLE);
> >>>        GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
> >>>    
> >>> -     mutex_unlock(&i915->drm.struct_mutex);
> >>> -
> >>>        return 0;
> >>>    
> >>>    err_ctx:
> >>> +     mutex_lock(&i915->drm.struct_mutex);
> >>>        context_close(ctx);
> >>> -err:
> >>>        mutex_unlock(&i915->drm.struct_mutex);
> >>> +err:
> >>> +     mutex_destroy(&file_priv->context_lock);
> >>>        idr_destroy(&file_priv->context_idr);
> >>>        return PTR_ERR(ctx);
> >>>    }
> >>> @@ -668,6 +671,7 @@ void i915_gem_context_close(struct drm_file *file)
> >>>    
> >>>        idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
> >>>        idr_destroy(&file_priv->context_idr);
> >>> +     mutex_destroy(&file_priv->context_lock);
> >>>    }
> >>>    
> >>>    static struct i915_request *
> >>> @@ -850,25 +854,22 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
> >>>                return ret;
> >>>    
> >>>        ctx = i915_gem_create_context(i915);
> >>> -     if (IS_ERR(ctx)) {
> >>> -             ret = PTR_ERR(ctx);
> >>> -             goto err_unlock;
> >>> -     }
> >>> +     mutex_unlock(&dev->struct_mutex);
> >>> +     if (IS_ERR(ctx))
> >>> +             return PTR_ERR(ctx);
> >>>    
> >>>        ret = gem_context_register(ctx, file_priv);
> >>>        if (ret)
> >>>                goto err_ctx;
> >>>    
> >>> -     mutex_unlock(&dev->struct_mutex);
> >>> -
> >>>        args->ctx_id = ctx->user_handle;
> >>>        DRM_DEBUG("HW context %d created\n", args->ctx_id);
> >>>    
> >>>        return 0;
> >>>    
> >>>    err_ctx:
> >>> +     mutex_lock(&dev->struct_mutex);
> >>>        context_close(ctx);
> >>> -err_unlock:
> >>>        mutex_unlock(&dev->struct_mutex);
> >>>        return ret;
> >>>    }
> >>> @@ -879,7 +880,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >>>        struct drm_i915_gem_context_destroy *args = data;
> >>>        struct drm_i915_file_private *file_priv = file->driver_priv;
> >>>        struct i915_gem_context *ctx;
> >>> -     int ret;
> >>>    
> >>>        if (args->pad != 0)
> >>>                return -EINVAL;
> >>> @@ -887,21 +887,16 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
> >>>        if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
> >>>                return -ENOENT;
> >>>    
> >>> -     ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
> >>> +     mutex_lock(&file_priv->context_lock);
> >>> +     ctx = idr_remove(&file_priv->context_idr, args->ctx_id);
> >>> +     mutex_unlock(&file_priv->context_lock);
> >>>        if (!ctx)
> >>>                return -ENOENT;
> >>>    
> >>> -     ret = mutex_lock_interruptible(&dev->struct_mutex);
> >>> -     if (ret)
> >>> -             goto out;
> >>> -
> >>> -     idr_remove(&file_priv->context_idr, ctx->user_handle);
> >>> +     mutex_lock(&dev->struct_mutex);
> >>
> >> I'd keep this one interruptible. Hm bummer, there were more of them before..
> > 
> > At this point, interrupt handling becomes problematic, as we have to
> > then re-insert the ctx_id into the idr and that may have already been
> > claimed elsewhere.
> 
> Ugh, bad.. Can we have struct_mutex nest under the context_idr_lock?

The general rule is to keep struct_mutex as the outer lock. If you take
struct_mutex inside another lock, that lock picks up all of
struct_mutex's bad habits in lockdep's eyes, and you suddenly find
yourself in a mighty world of pain.
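
To illustrate (just a sketch of the failure mode, none of this is code
from the series): as soon as one path takes the new lock first and
struct_mutex second, while everything else takes struct_mutex first,
lockdep sees an ABBA cycle:

	/* hypothetical destroy path: new lock outermost */
	mutex_lock(&file_priv->context_lock);
	mutex_lock(&dev->struct_mutex);

	/* every existing path: struct_mutex outermost */
	mutex_lock(&dev->struct_mutex);
	mutex_lock(&file_priv->context_lock);

and on top of that, context_lock inherits every dependency chain lockdep
has already recorded against struct_mutex.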

Removing the requirement of struct_mutex around close isn't a huge
problem after
https://patchwork.freedesktop.org/patch/291947/?series=57942&rev=2 as
that gives us the locking we need to serialise the lut handles (in fact,
the struct_mutex requirement drops out of that patch, and we can
simply do it then.)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 03/22] drm/i915: Sanity check mmap length against object size
  2019-03-18  9:51 ` [PATCH 03/22] drm/i915: Sanity check mmap length against object size Chris Wilson
@ 2019-03-25  0:38   ` Sasha Levin
  0 siblings, 0 replies; 53+ messages in thread
From: Sasha Levin @ 2019-03-25  0:38 UTC (permalink / raw)
  To: Sasha Levin, Chris Wilson, intel-gfx; +Cc: stable

Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.0.3, v4.19.30, v4.14.107, v4.9.164, v4.4.176, v3.18.136.

v5.0.3: Failed to apply! Possible dependencies:
    739f3abdbfcf ("drm/i915: small isolated c99 types to kernel types switch")
    ebfb6977801d ("drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set")

v4.19.30: Failed to apply! Possible dependencies:
    739f3abdbfcf ("drm/i915: small isolated c99 types to kernel types switch")
    ebfb6977801d ("drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set")
    f28ec6f4ea48 ("drm/i915: Constify power well descriptors")

v4.14.107: Failed to apply! Possible dependencies:
    0d6fc92a73e0 ("drm/i915: Separate RPS and RC6 handling for VLV")
    274b2462a049 ("drm/i915: Object w/o backing storage is banned by -ENXIO")
    3e8ddd9e5071 ("drm/i915: Nuke some bogus tabs from the pcode defines")
    48469eced282 ("drm/i915: Use cdclk_state->voltage on CNL")
    5161d058dff4 ("drm/i915: Fix BXT lane latency optimal setting with MST")
    53e9bf5e8159 ("drm/i915: Adjust system agent voltage on CNL if required by DDI ports")
    61843f0e6212 ("drm/i915: Name the IPS_PCODE_CONTROL bit")
    739f3abdbfcf ("drm/i915: small isolated c99 types to kernel types switch")
    960e54652cee ("drm/i915: Separate RPS and RC6 handling for gen6+")
    9f817501bd7f ("drm/i915: Move rps.hw_lock to dev_priv and s/hw_lock/pcu_lock")
    d305e0614601 ("drm/i915: Track minimum acceptable cdclk instead of "minimum dotclock"")
    d46b00dc38c8 ("drm/i915: Separate RPS and RC6 handling for CHV")

v4.9.164: Failed to apply! Possible dependencies:
    0e70447605f4 ("drm/i915: Move common code out of i915_gpu_error.c")
    1b36595ffb35 ("drm/i915: Show RING registers through debugfs")
    3b3f1650b1ca ("drm/i915: Allocate intel_engine_cs structure only for the enabled engines")
    9c870d03674f ("drm/i915: Use RPM as the barrier for controlling user mmap access")
    bb6dc8d96b68 ("drm/i915: Implement pread without struct-mutex")
    d636951ec01b ("drm/i915: Cleanup instdone collection")
    f0cd518206e1 ("drm/i915: Use lockless object free")
    f9e613728090 ("drm/i915: Try to print INSTDONE bits for all slice/subslice")

v4.4.176: Failed to apply! Possible dependencies:
    03ac0642f67a ("drm/i915: Wrap drm_gem_object_lookup in i915_gem_object_lookup")
    1b5708ffb103 ("drm/amdgpu: export amd_powerplay_func to amdgpu and other ip block")
    1ea863fd736e ("drm/amdgpu: keep the prefered/allowed domains in the BO")
    1f7371b2a5fa ("drm/amd/powerplay: add basic powerplay framework")
    288912cb95d1 ("drm/amdgpu: use $(src) in Makefile (v2)")
    2a7d9bdabec2 ("drm/amdgpu: cleanup amdgpu_cs_parser_relocs")
    2f4b9400336e ("drm/amdgpu: clean up hw semaphore support in driver")
    36409d122cb8 ("drm/amdgpu: cleanup amdgpu_cs_list_validate")
    3a2c788d95a2 ("drm/amdgpu: share struct amdgpu_pm_state_type with powerplay module")
    3af76f23a45b ("drm/amdgpu: export fan control functions to amdgpu")
    3c0eea6c35d9 ("drm/amdgpu: put VM page tables directly into duplicates list")
    4ff37a83f19d ("drm/amdgpu: fix VM faults caused by vm_grab_id() v4")
    56467ebfb254 ("drm/amdgpu: split VM PD and PT handling during CS")
    636ce25c3001 ("drm/amdgpu: cleanup bo list bucket handling")
    758ac17f963f ("drm/amdgpu: fix and cleanup user fence handling v2")
    8d0a7cea824a ("drm/amdgpu: grab VMID before submitting job v5")
    a8ad0bd84f98 ("drm: Remove unused drm_device from drm_gem_object_lookup()")
    be86c606b50a ("drm/amdgpu: cleanup amdgpu_sync_rings V2")
    c5637837ba5d ("drm/amdgpu: keep vm in job instead of ib (v2)")
    cc325d191347 ("drm/amdgpu: check userptrs mm earlier")
    d8e0cae64550 ("drm/amdgpu: validate duplicates first")
    e61710c59dd2 ("drm/amdgpu: support per device powerplay enablement (v2)")
    edf600dac65e ("drm/amd: cleanup remaining spaces and tabs v2")
    ee1782c3f27f ("drm/amdgpu: keep the PTs validation list in the VM v2")
    f69f90a113f2 ("drm/amdgpu: fix amdgpu_cs_get_threshold_for_moves handling")

v3.18.136: Failed to apply! Possible dependencies:
    03ac0642f67a ("drm/i915: Wrap drm_gem_object_lookup in i915_gem_object_lookup")
    049fc527b464 ("drm/amdgpu: dispatch jobs in cs")
    1d263474c441 ("drm/amdgpu: unwind properly in amdgpu_cs_parser_init()")
    2a7d9bdabec2 ("drm/amdgpu: cleanup amdgpu_cs_parser_relocs")
    3cb485f34049 ("drm/amdgpu: fix context switch")
    46651cc5dbee ("drm/amdgpu fix amdgpu.dpm=0 (v2)")
    564ea7900cff ("drm/amdgpu: enable uvd dpm and powergating")
    5fc3aeeb9e55 ("drm/amdgpu: rename amdgpu_ip_funcs to amd_ip_funcs (v2)")
    636ce25c3001 ("drm/amdgpu: cleanup bo list bucket handling")
    72efa7ebdea0 ("drm/amdgpu: check context id for context switching (v2)")
    81629cba1f12 ("drm/amdgpu: add amdgpu uapi header (v4)")
    840d51445f15 ("drm/amdgpu: fix bug occurs when bo_list is NULL")
    8e9198d0698a ("drm/amdgpu: move some atombios definitions to common folder (v2)")
    97b2e202fba0 ("drm/amdgpu: add amdgpu.h (v2)")
    a2e73f56fa62 ("drm/amdgpu: Add support for CIK parts")
    a3348bb801ba ("drm/amdgpu: don't need to use bo_list_clone any more")
    a5b750583eb4 ("drm/amdgpu: validate duplicates in the CS as well")
    a8ad0bd84f98 ("drm: Remove unused drm_device from drm_gem_object_lookup()")
    a961ea7349d0 ("drm/amdgpu: fix userptr lockup")
    aa2bdb247620 ("drm/amdgpu: add CE preamble flag v3")
    aaa36a976bbb ("drm/amdgpu: Add initial VI support")
    b80d8475c1fd ("drm/amdgpu: add scheduler initialization")
    c1b69ed0c62f ("drm/amdgpu: add backend implementation of gpu scheduler (v2)")
    cc325d191347 ("drm/amdgpu: check userptrs mm earlier")
    d2edb07b10fc ("drm/amdgpu: cleanup HDP flush handling")
    d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)")
    d7006964d46d ("drm/amdgpu: fix issue with overlapping userptrs")
    d919ad49ac04 ("drm/amdgpu: fix dereference before check")
    de807f818b95 ("drm/amdgpu: add flags for amdgpu_ib structure")


How should we proceed with this patch?

--
Thanks,
Sasha
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2019-03-25  0:38 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-18  9:51 [PATCH 01/22] drm/i915: Flush pages on acquisition Chris Wilson
2019-03-18  9:51 ` [PATCH 02/22] drm/i915: Move intel_engine_mask_t around for use by i915_request_types.h Chris Wilson
2019-03-18 10:21   ` Tvrtko Ursulin
2019-03-18 10:40     ` Chris Wilson
2019-03-18 10:48       ` Tvrtko Ursulin
2019-03-18 13:57         ` Chris Wilson
2019-03-18  9:51 ` [PATCH 03/22] drm/i915: Sanity check mmap length against object size Chris Wilson
2019-03-25  0:38   ` Sasha Levin
2019-03-18  9:51 ` [PATCH 04/22] drm/i915: Hold a ref to the ring while retiring Chris Wilson
2019-03-18 10:31   ` Tvrtko Ursulin
2019-03-18 10:37     ` Chris Wilson
2019-03-18 10:46       ` Tvrtko Ursulin
2019-03-18 10:56         ` Chris Wilson
2019-03-18 13:25           ` Tvrtko Ursulin
2019-03-18  9:51 ` [PATCH 05/22] drm/i915: Lock the gem_context->active_list while dropping the link Chris Wilson
2019-03-18 10:39   ` Tvrtko Ursulin
2019-03-18 10:45     ` Chris Wilson
2019-03-18 10:50       ` Tvrtko Ursulin
2019-03-18 10:54   ` Chris Wilson
2019-03-18  9:51 ` [PATCH 06/22] drm/i915: Hold a reference to the active HW context Chris Wilson
2019-03-18 12:54   ` Tvrtko Ursulin
2019-03-18 12:56     ` Chris Wilson
2019-03-18 12:57       ` Chris Wilson
2019-03-18 13:29         ` Tvrtko Ursulin
2019-03-18  9:51 ` [PATCH 07/22] drm/i915: Stop needlessly acquiring wakeref for debugfs/drop_caches_set Chris Wilson
2019-03-18 13:08   ` Tvrtko Ursulin
2019-03-18  9:51 ` [PATCH 08/22] drm/i915/selftests: Provide stub reset functions Chris Wilson
2019-03-18  9:51 ` [PATCH 09/22] drm/i915: Switch to use HWS indices rather than addresses Chris Wilson
2019-03-18 13:21   ` Tvrtko Ursulin
2019-03-18  9:51 ` [PATCH 10/22] drm/i915: Separate GEM context construction and registration to userspace Chris Wilson
2019-03-18 16:22   ` Tvrtko Ursulin
2019-03-18 16:30     ` Chris Wilson
2019-03-18 16:32       ` Chris Wilson
2019-03-18 16:46         ` Tvrtko Ursulin
2019-03-18  9:51 ` [PATCH 11/22] drm/i915: Introduce a mutex for file_priv->context_idr Chris Wilson
2019-03-18 16:28   ` Tvrtko Ursulin
2019-03-18 16:35     ` Chris Wilson
2019-03-18 16:45       ` Tvrtko Ursulin
2019-03-18 21:10         ` Chris Wilson
2019-03-18  9:51 ` [PATCH 12/22] drm/i915: Introduce the i915_user_extension_method Chris Wilson
2019-03-18  9:51 ` [PATCH 13/22] drm/i915: Create/destroy VM (ppGTT) for use with contexts Chris Wilson
2019-03-18  9:51 ` [PATCH 14/22] drm/i915: Extend CONTEXT_CREATE to set parameters upon construction Chris Wilson
2019-03-18  9:51 ` [PATCH 15/22] drm/i915: Allow contexts to share a single timeline across all engines Chris Wilson
2019-03-18  9:51 ` [PATCH 16/22] drm/i915: Allow userspace to clone contexts on creation Chris Wilson
2019-03-18  9:51 ` [PATCH 17/22] drm/i915: Allow a context to define its set of engines Chris Wilson
2019-03-18  9:52 ` [PATCH 18/22] drm/i915: Extend I915_CONTEXT_PARAM_SSEU to support local ctx->engine[] Chris Wilson
2019-03-18  9:52 ` [PATCH 19/22] drm/i915: Load balancing across a virtual engine Chris Wilson
2019-03-18  9:52 ` [PATCH 20/22] drm/i915: Extend execution fence to support a callback Chris Wilson
2019-03-18  9:52 ` [PATCH 21/22] drm/i915/execlists: Virtual engine bonding Chris Wilson
2019-03-18  9:52 ` [PATCH 22/22] drm/i915: Allow specification of parallel execbuf Chris Wilson
2019-03-18 17:10 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [01/22] drm/i915: Flush pages on acquisition Patchwork
2019-03-18 17:20 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-03-18 17:36 ` ✗ Fi.CI.BAT: failure " Patchwork
