* [RFC 00/10] Virtual queue/engine uAPI prototype
@ 2018-01-25 13:33 Tvrtko Ursulin
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

The latest idea on how to load-balance VCS submissions in i915.

This time round we have a concept of submission queues implemented as contexts
which share PPGTT.

Userspace is supposed to create a queue with the virtual flag set to one.
Subsequent submissions to this queue will be load-balanced between available VCS
engines by the driver.

Only the new class/instance-based execbuf uAPI is supported. Explicit VCS engine
selection is rejected, meaning the engine instance must be set to zero.

Engine capabilities must be used if a submission depends on a non-uniform
engine feature such as HEVC, which is only available on VCS0.
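
The selection rules described above can be sketched as a small standalone
model. This is not code from the series: the toy_ names, the capability bit
and the busyness metric are all assumptions made for illustration, and the
real uAPI definitions live in the patches themselves.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit; the real uAPI names come from the series. */
#define TOY_CAP_HEVC (1u << 0)

struct toy_engine {
	unsigned int instance;	/* VCS0, VCS1, ... */
	uint32_t caps;		/* capabilities this engine offers */
	unsigned int busyness;	/* arbitrary load metric, lower is idler */
};

/*
 * Pick an engine for a submission to a virtual queue.
 *
 * Returns the index into engines[], -EINVAL if userspace named an explicit
 * instance, or -ENODEV if no engine offers the required capabilities.
 */
static int toy_virtual_select(const struct toy_engine *engines, size_t count,
			      unsigned int requested_instance,
			      uint32_t required_caps)
{
	int best = -ENODEV;
	size_t i;

	/* Explicit engine selection is rejected on a virtual queue. */
	if (requested_instance != 0)
		return -EINVAL;

	for (i = 0; i < count; i++) {
		/* Skip engines lacking a required non-uniform feature. */
		if ((engines[i].caps & required_caps) != required_caps)
			continue;

		/* Among capable engines, prefer the least busy one. */
		if (best < 0 || engines[i].busyness < engines[best].busyness)
			best = (int)i;
	}

	return best;
}
```

In this model a submission that needs HEVC always lands on the one engine
advertising it, however busy that engine is, while a submission with no
capability requirement goes to whichever engine is idler.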

This series is just a prototype which illustrates the uAPI proposal. It does not
claim to balance well. In fact, it is known to balance poorly - in some cases
significantly worse than the same approach implemented in userspace. This
potentially stems from the fact that it creates a forced dependency when moving
between engines (even when there are no data dependencies between batches) in
order to simulate a single stream of execution per virtual engine.
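
To illustrate why such a forced dependency can hurt, here is a toy timing
model. It is only a sketch under simplifying assumptions: two engines,
round-robin placement instead of busyness-based balancing, and a dependency
on every batch rather than only on engine switches. None of the names below
exist in the driver.

```c
#include <stddef.h>

#define TOY_NUM_ENGINES 2

/*
 * Return the time at which the last batch finishes when batches are placed
 * round-robin on TOY_NUM_ENGINES engines. With chain != 0 every batch also
 * waits for the previous batch of the queue, wherever that one ran.
 */
static unsigned long toy_makespan(const unsigned long *duration, size_t count,
				  int chain)
{
	unsigned long engine_free[TOY_NUM_ENGINES] = { 0 };
	unsigned long prev_done = 0, last = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		size_t e = i % TOY_NUM_ENGINES;
		unsigned long start = engine_free[e];

		/* Forced ordering: wait for the previous batch to complete. */
		if (chain && prev_done > start)
			start = prev_done;

		prev_done = start + duration[i];
		engine_free[e] = prev_done;
		if (prev_done > last)
			last = prev_done;
	}

	return last;
}
```

With four independent batches of equal length, the chained variant takes as
long as running everything on one engine, while the unchained variant
finishes in half the time.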

The series also does not fully implement the virtual engine, in that context
state does not move around with the submissions, i.e. it is not preserved.

I will also post gem_wsim and media-bench.pl IGT patches which allow playing
with this easily.

Chris Wilson (2):
  move-timeline-to-ctx
  drm/i915: Extend CREATE_CONTEXT to allow inheritance ala clone()

Tvrtko Ursulin (8):
  drm/i915: Select engines via class and instance in execbuffer2
  drm/i915: Engine capabilities uAPI
  drm/i915: Re-arrange execbuf so context is known before engine
  drm/i915: Refactor eb_select_engine to take eb
  drm/i915: Track latest request per engine class
  drm/i915: Allow creating virtual contexts
  drm/i915: Trivial virtual engine implementation
  drm/i915: Naive engine busyness based load balancing

 drivers/gpu/drm/i915/i915_drv.h                   |  11 +-
 drivers/gpu/drm/i915/i915_gem.c                   |   9 +-
 drivers/gpu/drm/i915/i915_gem_context.c           |  91 +++++++--
 drivers/gpu/drm/i915/i915_gem_context.h           |  23 +++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c        | 223 ++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c               |   7 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h               |   2 -
 drivers/gpu/drm/i915/i915_gem_request.c           |  15 +-
 drivers/gpu/drm/i915/i915_gem_request.h           |   2 +
 drivers/gpu/drm/i915/i915_gem_timeline.c          |  66 +++++-
 drivers/gpu/drm/i915/i915_gem_timeline.h          |   6 +
 drivers/gpu/drm/i915/intel_engine_cs.c            |  41 +++-
 drivers/gpu/drm/i915/intel_lrc.c                  |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c           |   9 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h           |   9 +-
 drivers/gpu/drm/i915/selftests/huge_pages.c       |   1 -
 drivers/gpu/drm/i915/selftests/i915_gem_context.c | 236 +++++++++++++++++-----
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c     |   1 -
 drivers/gpu/drm/i915/selftests/mock_context.c     |   3 +-
 drivers/gpu/drm/i915/selftests/mock_engine.c      |   3 +-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c  |   4 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c         |   1 -
 include/uapi/drm/i915_drm.h                       |  38 +++-
 23 files changed, 660 insertions(+), 143 deletions(-)

-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [RFC 01/10] move-timeline-to-ctx
@ 2018-01-25 13:33 Tvrtko Ursulin
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC
  To: Intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

---
 drivers/gpu/drm/i915/i915_drv.h                  | 13 +-----
 drivers/gpu/drm/i915/i915_gem.c                  |  9 ++--
 drivers/gpu/drm/i915/i915_gem_context.c          | 15 ++++++-
 drivers/gpu/drm/i915/i915_gem_context.h          |  2 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c       | 24 +++++++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c              |  3 --
 drivers/gpu/drm/i915/i915_gem_gtt.h              |  1 -
 drivers/gpu/drm/i915/i915_gem_request.c          |  2 +-
 drivers/gpu/drm/i915/i915_gem_timeline.c         | 54 +++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_timeline.h         |  4 ++
 drivers/gpu/drm/i915/intel_engine_cs.c           |  3 +-
 drivers/gpu/drm/i915/intel_lrc.c                 |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c          |  9 +++-
 drivers/gpu/drm/i915/intel_ringbuffer.h          |  5 ++-
 drivers/gpu/drm/i915/selftests/mock_engine.c     |  3 +-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |  4 +-
 drivers/gpu/drm/i915/selftests/mock_gtt.c        |  1 -
 17 files changed, 119 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2a5845a896b6..0c348f6ab386 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2305,7 +2305,8 @@ struct drm_i915_private {
 		void (*cleanup_engine)(struct intel_engine_cs *engine);
 
 		struct list_head timelines;
-		struct i915_gem_timeline global_timeline;
+		struct i915_gem_timeline execution_timeline;
+		struct i915_gem_timeline legacy_timeline;
 		u32 active_requests;
 
 		/**
@@ -3448,16 +3449,6 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
-static inline struct intel_timeline *
-i915_gem_context_lookup_timeline(struct i915_gem_context *ctx,
-				 struct intel_engine_cs *engine)
-{
-	struct i915_address_space *vm;
-
-	vm = ctx->ppgtt ? &ctx->ppgtt->base : &ctx->i915->ggtt.base;
-	return &vm->timeline.engine[engine->id];
-}
-
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file);
 int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 062b21408698..02f71eb5c9d9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3010,10 +3010,10 @@ static void engine_skip_context(struct drm_i915_gem_request *request)
 {
 	struct intel_engine_cs *engine = request->engine;
 	struct i915_gem_context *hung_ctx = request->ctx;
-	struct intel_timeline *timeline;
+	struct intel_timeline *timeline = request->timeline;
 	unsigned long flags;
 
-	timeline = i915_gem_context_lookup_timeline(hung_ctx, engine);
+	GEM_BUG_ON(timeline == engine->timeline);
 
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 	spin_lock(&timeline->lock);
@@ -3677,7 +3677,7 @@ int i915_gem_wait_for_idle(struct drm_i915_private *i915, unsigned int flags)
 
 		ret = wait_for_engines(i915);
 	} else {
-		ret = wait_for_timeline(&i915->gt.global_timeline, flags);
+		ret = wait_for_timeline(&i915->gt.execution_timeline, flags);
 	}
 
 	return ret;
@@ -5536,7 +5536,8 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv)
 	WARN_ON(dev_priv->mm.object_count);
 
 	mutex_lock(&dev_priv->drm.struct_mutex);
-	i915_gem_timeline_fini(&dev_priv->gt.global_timeline);
+	i915_gem_timeline_fini(&dev_priv->gt.legacy_timeline);
+	i915_gem_timeline_fini(&dev_priv->gt.execution_timeline);
 	WARN_ON(!list_empty(&dev_priv->gt.timelines));
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 648e7536ff51..f72046adaace 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -121,6 +121,7 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 
+	i915_gem_timeline_free(ctx->timeline);
 	i915_ppgtt_put(ctx->ppgtt);
 
 	for (i = 0; i < I915_NUM_ENGINES; i++) {
@@ -373,6 +374,18 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 		ctx->desc_template = default_desc_template(dev_priv, ppgtt);
 	}
 
+	if (HAS_EXECLISTS(dev_priv)) {
+		struct i915_gem_timeline *timeline;
+
+		timeline = i915_gem_timeline_create(dev_priv, ctx->name);
+		if (IS_ERR(timeline)) {
+			__destroy_hw_context(ctx, file_priv);
+			return ERR_CAST(timeline);
+		}
+
+		ctx->timeline = timeline;
+	}
+
 	trace_i915_context_create(ctx);
 
 	return ctx;
@@ -574,7 +587,7 @@ static bool engine_has_idle_kernel_context(struct intel_engine_cs *engine)
 	list_for_each_entry(timeline, &engine->i915->gt.timelines, link) {
 		struct intel_timeline *tl;
 
-		if (timeline == &engine->i915->gt.global_timeline)
+		if (timeline == &engine->i915->gt.execution_timeline)
 			continue;
 
 		tl = &timeline->engine[engine->id];
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 4bfb72f8e1cb..cfa69b12a6b2 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -55,6 +55,8 @@ struct i915_gem_context {
 	/** file_priv: owning file descriptor */
 	struct drm_i915_file_private *file_priv;
 
+	struct i915_gem_timeline *timeline;
+
 	/**
 	 * @ppgtt: unique address space (GTT)
 	 *
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 4401068ff468..cd482b981fdd 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1753,6 +1753,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 		unsigned int flags = eb->flags[i];
 		struct i915_vma *vma = eb->vma[i];
 		struct drm_i915_gem_object *obj = vma->obj;
+		struct drm_i915_gem_request *order;
 
 		if (flags & EXEC_OBJECT_CAPTURE) {
 			struct i915_gem_capture_list *capture;
@@ -1783,6 +1784,29 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 				flags &= ~EXEC_OBJECT_ASYNC;
 		}
 
+		/*
+		 * XXX As we allow multiple queues to share the vma, but
+		 * with different timelines, yet we rely on a single
+		 * timeline through the vm (for activity tracking
+		 * see i915_vma_move_to_active()/i915_vma_retire()) we impose
+		 * that ordering constraint on the different timelines here.
+		 *
+		 * Note that this ordering constraint is undesirable as we
+		 * want to keep our weakly ordered reads through the GEM
+		 * interface. That will require us to be able to track
+		 * multiple timelines (lifting the current limit of one
+		 * per engine), like struct reservation_object but coupled
+		 * into our activity tracking.
+		 */
+		order = i915_gem_active_peek(&vma->last_read[eb->engine->id],
+					     &eb->i915->drm.struct_mutex);
+		if (order) {
+			err = i915_gem_request_await_dma_fence(eb->request,
+							       &order->fence);
+			if (err)
+				return err;
+		}
+
 		if (flags & EXEC_OBJECT_ASYNC)
 			continue;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index be227512430a..b355ba1eee22 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2116,8 +2116,6 @@ static void i915_address_space_init(struct i915_address_space *vm,
 				    struct drm_i915_private *dev_priv,
 				    const char *name)
 {
-	i915_gem_timeline_init(dev_priv, &vm->timeline, name);
-
 	drm_mm_init(&vm->mm, 0, vm->total);
 	vm->mm.head_node.color = I915_COLOR_UNEVICTABLE;
 
@@ -2134,7 +2132,6 @@ static void i915_address_space_fini(struct i915_address_space *vm)
 	if (pagevec_count(&vm->free_pages))
 		vm_free_pages_release(vm, true);
 
-	i915_gem_timeline_fini(&vm->timeline);
 	drm_mm_takedown(&vm->mm);
 	list_del(&vm->global_link);
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index a42890d9af38..0028a0ccc9a0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -256,7 +256,6 @@ struct i915_pml4 {
 
 struct i915_address_space {
 	struct drm_mm mm;
-	struct i915_gem_timeline timeline;
 	struct drm_i915_private *i915;
 	struct device *dma;
 	/* Every address space belongs to a struct file - except for the global
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index ac2db7f716a1..160d81bf6d85 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -724,7 +724,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 		}
 	}
 
-	req->timeline = i915_gem_context_lookup_timeline(ctx, engine);
+	req->timeline = ring->timeline;
 	GEM_BUG_ON(req->timeline == engine->timeline);
 
 	spin_lock_init(&req->lock);
diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.c b/drivers/gpu/drm/i915/i915_gem_timeline.c
index e9fd87604067..1bf48bdb78c4 100644
--- a/drivers/gpu/drm/i915/i915_gem_timeline.c
+++ b/drivers/gpu/drm/i915/i915_gem_timeline.c
@@ -95,12 +95,28 @@ int i915_gem_timeline_init(struct drm_i915_private *i915,
 
 int i915_gem_timeline_init__global(struct drm_i915_private *i915)
 {
-	static struct lock_class_key class;
+	static struct lock_class_key class1, class2;
+	int err;
+
+	err = __i915_gem_timeline_init(i915,
+				       &i915->gt.execution_timeline,
+				       "[execution]", &class1,
+				       "i915_execution_timeline");
+	if (err)
+		return err;
+
+	err = __i915_gem_timeline_init(i915,
+				     &i915->gt.legacy_timeline,
+				     "[global]", &class2,
+				     "i915_global_timeline");
+	if (err)
+		goto err_exec_timeline;
+
+	return 0;
 
-	return __i915_gem_timeline_init(i915,
-					&i915->gt.global_timeline,
-					"[execution]",
-					&class, "&global_timeline->lock");
+err_exec_timeline:
+	i915_gem_timeline_fini(&i915->gt.execution_timeline);
+	return err;
 }
 
 /**
@@ -148,6 +164,34 @@ void i915_gem_timeline_fini(struct i915_gem_timeline *timeline)
 	kfree(timeline->name);
 }
 
+struct i915_gem_timeline *
+i915_gem_timeline_create(struct drm_i915_private *i915, const char *name)
+{
+	struct i915_gem_timeline *timeline;
+	int err;
+
+	timeline = kzalloc(sizeof(*timeline), GFP_KERNEL);
+	if (!timeline)
+		return ERR_PTR(-ENOMEM);
+
+	err = i915_gem_timeline_init(i915, timeline, name);
+	if (err) {
+		kfree(timeline);
+		return ERR_PTR(err);
+	}
+
+	return timeline;
+}
+
+void i915_gem_timeline_free(struct i915_gem_timeline *timeline)
+{
+	if (!timeline)
+		return;
+
+	i915_gem_timeline_fini(timeline);
+	kfree(timeline);
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_timeline.c"
 #include "selftests/i915_gem_timeline.c"
diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.h b/drivers/gpu/drm/i915/i915_gem_timeline.h
index b5a22400a01f..7ecf0a253d78 100644
--- a/drivers/gpu/drm/i915/i915_gem_timeline.h
+++ b/drivers/gpu/drm/i915/i915_gem_timeline.h
@@ -96,6 +96,10 @@ int i915_gem_timeline_init__global(struct drm_i915_private *i915);
 void i915_gem_timelines_park(struct drm_i915_private *i915);
 void i915_gem_timeline_fini(struct i915_gem_timeline *tl);
 
+struct i915_gem_timeline *
+i915_gem_timeline_create(struct drm_i915_private *i915, const char *name);
+void i915_gem_timeline_free(struct i915_gem_timeline *timeline);
+
 static inline int __intel_timeline_sync_set(struct intel_timeline *tl,
 					    u64 context, u32 seqno)
 {
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 46b2a92cb7a2..5d49f319220b 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -393,7 +393,8 @@ void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32 seqno)
 
 static void intel_engine_init_timeline(struct intel_engine_cs *engine)
 {
-	engine->timeline = &engine->i915->gt.global_timeline.engine[engine->id];
+	engine->timeline =
+		&engine->i915->gt.execution_timeline.engine[engine->id];
 }
 
 static bool csb_force_mmio(struct drm_i915_private *i915)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 6896ad1756c8..684303923ff7 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2281,7 +2281,7 @@ static int execlists_context_deferred_alloc(struct i915_gem_context *ctx,
 		goto error_deref_obj;
 	}
 
-	ring = intel_engine_create_ring(engine, ctx->ring_size);
+	ring = intel_engine_create_ring(engine, ctx->timeline, ctx->ring_size);
 	if (IS_ERR(ring)) {
 		ret = PTR_ERR(ring);
 		goto error_deref_obj;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e2085820b586..66e87144f799 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1124,7 +1124,9 @@ intel_ring_create_vma(struct drm_i915_private *dev_priv, int size)
 }
 
 struct intel_ring *
-intel_engine_create_ring(struct intel_engine_cs *engine, int size)
+intel_engine_create_ring(struct intel_engine_cs *engine,
+			 struct i915_gem_timeline *timeline,
+			 int size)
 {
 	struct intel_ring *ring;
 	struct i915_vma *vma;
@@ -1137,6 +1139,7 @@ intel_engine_create_ring(struct intel_engine_cs *engine, int size)
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&ring->request_list);
+	ring->timeline = &timeline->engine[engine->id];
 
 	ring->size = size;
 	/* Workaround an erratum on the i830 which causes a hang if
@@ -1333,7 +1336,9 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 	if (err)
 		goto err;
 
-	ring = intel_engine_create_ring(engine, 32 * PAGE_SIZE);
+	ring = intel_engine_create_ring(engine,
+					&engine->i915->gt.legacy_timeline,
+					32 * PAGE_SIZE);
 	if (IS_ERR(ring)) {
 		err = PTR_ERR(ring);
 		goto err;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 0aefbf6849d1..aab7bd61ae10 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -123,6 +123,7 @@ struct intel_ring {
 	struct i915_vma *vma;
 	void *vaddr;
 
+	struct intel_timeline *timeline;
 	struct list_head request_list;
 
 	u32 head;
@@ -738,7 +739,9 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
 #define CNL_HWS_CSB_WRITE_INDEX		0x2f
 
 struct intel_ring *
-intel_engine_create_ring(struct intel_engine_cs *engine, int size);
+intel_engine_create_ring(struct intel_engine_cs *engine,
+			 struct i915_gem_timeline *timeline,
+			 int size);
 int intel_ring_pin(struct intel_ring *ring,
 		   struct drm_i915_private *i915,
 		   unsigned int offset_bias);
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
index 55c0e2c15782..19c0d662f351 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.c
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
@@ -174,8 +174,7 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	engine->base.emit_breadcrumb = mock_emit_breadcrumb;
 	engine->base.submit_request = mock_submit_request;
 
-	engine->base.timeline =
-		&i915->gt.global_timeline.engine[engine->base.id];
+	intel_engine_init_timeline(&engine->base);
 
 	intel_engine_init_breadcrumbs(&engine->base);
 	engine->base.breadcrumbs.mock = true; /* prevent touching HW for irqs */
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 1bc61f3f76fc..af598e671a8a 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -72,7 +72,9 @@ static void mock_device_release(struct drm_device *dev)
 
 	mutex_lock(&i915->drm.struct_mutex);
 	mock_fini_ggtt(i915);
-	i915_gem_timeline_fini(&i915->gt.global_timeline);
+	i915_gem_timeline_fini(&i915->gt.legacy_timeline);
+	i915_gem_timeline_fini(&i915->gt.execution_timeline);
+	WARN_ON(!list_empty(&i915->gt.timelines));
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	destroy_workqueue(i915->wq);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index e96873f96116..36c112088940 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -76,7 +76,6 @@ mock_ppgtt(struct drm_i915_private *i915,
 
 	INIT_LIST_HEAD(&ppgtt->base.global_link);
 	drm_mm_init(&ppgtt->base.mm, 0, ppgtt->base.total);
-	i915_gem_timeline_init(i915, &ppgtt->base.timeline, name);
 
 	ppgtt->base.clear_range = nop_clear_range;
 	ppgtt->base.insert_page = mock_insert_page;
-- 
2.14.1


* [RFC 02/10] drm/i915: Extend CREATE_CONTEXT to allow inheritance ala clone()
@ 2018-01-25 13:33 Tvrtko Ursulin
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC
  To: Intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

A context encompasses the driver's view of process related state, and
encapsulates the logical GPU state where available. Each context is
currently equivalent to a process in CPU terms. Like with processes,
sometimes the user wants a lighter encapsulation that shares some state
with the parent process, for example two threads have unique register
state but share the virtual memory mappings. We can support exactly the
same principle using contexts where we may share the GTT but keep the
logical GPU state distinct. This allows quicker switching between those
contexts, and for userspace to allocate a single offset in the GTT and
use it across multiple contexts. Like with clone(), in the future we may
wish to allow userspace to select more features to copy across from the
parent, but for now we only allow sharing of the GTT.

Note that if full per-process GTT is not supported on the hardware, the
GTT is already implicitly shared between contexts, and this request
to create contexts with a shared GTT fails. With full ppGTT, every fd
(i.e. every process) is allocated a unique GTT, so this request cannot be
used to share a GTT between processes/fds; it can only share a GTT belonging
to this fd.
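
The behaviour described above can be modelled outside the kernel as follows.
This is a rough sketch and not the driver code: the toy_ types and the
TOY_CONTEXT_SHARE_GTT flag are stand-ins invented for illustration, with the
error codes chosen to mirror the ioctl changes in the diff below.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the I915_GEM_CONTEXT_SHARE_GTT flag used by the patch. */
#define TOY_CONTEXT_SHARE_GTT (1u << 0)

struct toy_ppgtt {
	unsigned int refcount;
};

struct toy_context {
	struct toy_ppgtt *ppgtt;	/* NULL when full ppGTT is not in use */
};

/*
 * Initialise *ctx. On success returns 0 and ctx->ppgtt is either own_vm
 * (a fresh address space supplied by the caller) or the parent's address
 * space, with its reference count raised in both cases.
 */
static int toy_context_create(struct toy_context *ctx, uint32_t flags,
			      const struct toy_context *share,
			      struct toy_ppgtt *own_vm)
{
	/* Unknown flags are rejected. */
	if (flags & ~TOY_CONTEXT_SHARE_GTT)
		return -EINVAL;

	if (flags & TOY_CONTEXT_SHARE_GTT) {
		if (!share)
			return -ENOENT;	/* parent lookup failed */
		if (!share->ppgtt)
			return -ENODEV;	/* GTT is already implicitly shared */

		/* Share the parent's address space; do not create a new one. */
		share->ppgtt->refcount++;
		ctx->ppgtt = share->ppgtt;
		return 0;
	}

	if (own_vm)
		own_vm->refcount++;
	ctx->ppgtt = own_vm;
	return 0;
}
```

A child created with the share flag ends up pointing at the same address
space as its parent, so an object bound at one GTT offset is visible at that
offset from both contexts.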

Testcase: igt/gem_ctx_shared
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c           |  66 ++++--
 drivers/gpu/drm/i915/i915_gem_gtt.c               |   4 +-
 drivers/gpu/drm/i915/i915_gem_gtt.h               |   1 -
 drivers/gpu/drm/i915/selftests/huge_pages.c       |   1 -
 drivers/gpu/drm/i915/selftests/i915_gem_context.c | 236 +++++++++++++++++-----
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c     |   1 -
 drivers/gpu/drm/i915/selftests/mock_context.c     |   3 +-
 include/uapi/drm/i915_drm.h                       |  11 +-
 8 files changed, 247 insertions(+), 76 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f72046adaace..22b0fa170fcf 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -109,6 +109,8 @@ static void lut_close(struct i915_gem_context *ctx)
 		struct i915_vma *vma = rcu_dereference_raw(*slot);
 
 		radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);
+
+		vma->open_count--;
 		__i915_gem_object_release_unless_active(vma->obj);
 	}
 	rcu_read_unlock();
@@ -202,8 +204,6 @@ static void context_close(struct i915_gem_context *ctx)
 	 * the ppgtt).
 	 */
 	lut_close(ctx);
-	if (ctx->ppgtt)
-		i915_ppgtt_close(&ctx->ppgtt->base);
 
 	ctx->file_priv = ERR_PTR(-EBADF);
 	i915_gem_context_put(ctx);
@@ -339,6 +339,9 @@ static void __destroy_hw_context(struct i915_gem_context *ctx,
 	context_close(ctx);
 }
 
+#define CREATE_TIMELINE BIT(0)
+#define CREATE_VM BIT(1)
+
 /**
  * The default context needs to exist per ring that uses contexts. It stores the
  * context state of the GPU for applications that don't utilize HW contexts, as
@@ -346,7 +349,8 @@ static void __destroy_hw_context(struct i915_gem_context *ctx,
  */
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *dev_priv,
-			struct drm_i915_file_private *file_priv)
+			struct drm_i915_file_private *file_priv,
+			unsigned int flags)
 {
 	struct i915_gem_context *ctx;
 
@@ -359,7 +363,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 	if (IS_ERR(ctx))
 		return ctx;
 
-	if (USES_FULL_PPGTT(dev_priv)) {
+	if (flags & CREATE_VM && USES_FULL_PPGTT(dev_priv)) {
 		struct i915_hw_ppgtt *ppgtt;
 
 		ppgtt = i915_ppgtt_create(dev_priv, file_priv, ctx->name);
@@ -374,7 +378,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 		ctx->desc_template = default_desc_template(dev_priv, ppgtt);
 	}
 
-	if (HAS_EXECLISTS(dev_priv)) {
+	if (flags & CREATE_TIMELINE && HAS_EXECLISTS(dev_priv)) {
 		struct i915_gem_timeline *timeline;
 
 		timeline = i915_gem_timeline_create(dev_priv, ctx->name);
@@ -436,7 +440,7 @@ i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio)
 {
 	struct i915_gem_context *ctx;
 
-	ctx = i915_gem_create_context(i915, NULL);
+	ctx = i915_gem_create_context(i915, NULL, CREATE_VM | CREATE_TIMELINE);
 	if (IS_ERR(ctx))
 		return ctx;
 
@@ -558,7 +562,8 @@ int i915_gem_context_open(struct drm_i915_private *i915,
 	idr_init(&file_priv->context_idr);
 
 	mutex_lock(&i915->drm.struct_mutex);
-	ctx = i915_gem_create_context(i915, file_priv);
+	ctx = i915_gem_create_context(i915, file_priv,
+				      CREATE_VM | CREATE_TIMELINE);
 	mutex_unlock(&i915->drm.struct_mutex);
 	if (IS_ERR(ctx)) {
 		idr_destroy(&file_priv->context_idr);
@@ -655,10 +660,12 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct drm_i915_gem_context_create *args = data;
+	struct drm_i915_gem_context_create_v2 *args = data;
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_gem_context *share = NULL;
 	struct i915_gem_context *ctx;
-	int ret;
+	unsigned int flags = CREATE_VM | CREATE_TIMELINE;
+	int err;
 
 	if (!dev_priv->engine[RCS]->context_size)
 		return -ENODEV;
@@ -666,6 +673,9 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	if (args->pad != 0)
 		return -EINVAL;
 
+	if (args->flags & ~I915_GEM_CONTEXT_SHARE_GTT)
+		return -EINVAL;
+
 	if (client_is_banned(file_priv)) {
 		DRM_DEBUG("client %s[%d] banned from creating ctx\n",
 			  current->comm,
@@ -674,21 +684,45 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 		return -EIO;
 	}
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
+	if (args->flags & I915_GEM_CONTEXT_SHARE_GTT) {
+		share = i915_gem_context_lookup(file_priv, args->share_ctx);
+		if (!share)
+			return -ENOENT;
+
+		if (!share->ppgtt) {
+			err = -ENODEV;
+			goto out;
+		}
 
-	ctx = i915_gem_create_context(dev_priv, file_priv);
+		flags &= ~CREATE_VM;
+	}
+
+	err = i915_mutex_lock_interruptible(dev);
+	if (err)
+		goto out;
+
+	ctx = i915_gem_create_context(dev_priv, file_priv, flags);
 	mutex_unlock(&dev->struct_mutex);
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
+	if (IS_ERR(ctx)) {
+		err = PTR_ERR(ctx);
+		goto out;
+	}
+
+	if (!(flags & CREATE_VM)) {
+		i915_ppgtt_get(share->ppgtt);
+		ctx->ppgtt = share->ppgtt;
+		ctx->desc_template = share->desc_template;
+	}
 
 	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
 
 	args->ctx_id = ctx->user_handle;
 	DRM_DEBUG("HW context %d created\n", args->ctx_id);
 
-	return 0;
+out:
+	if (share)
+		i915_gem_context_put(share);
+	return err;
 }
 
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b355ba1eee22..3b09278136ce 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -2222,7 +2222,7 @@ i915_ppgtt_create(struct drm_i915_private *dev_priv,
 	return ppgtt;
 }
 
-void i915_ppgtt_close(struct i915_address_space *vm)
+static void ppgtt_close(struct i915_address_space *vm)
 {
 	struct list_head *phases[] = {
 		&vm->active_list,
@@ -2250,6 +2250,8 @@ void i915_ppgtt_release(struct kref *kref)
 
 	trace_i915_ppgtt_release(&ppgtt->base);
 
+	ppgtt_close(&ppgtt->base);
+
 	/* vmas should already be unbound and destroyed */
 	WARN_ON(!list_empty(&ppgtt->base.active_list));
 	WARN_ON(!list_empty(&ppgtt->base.inactive_list));
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 0028a0ccc9a0..e39d8e96e928 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -592,7 +592,6 @@ void i915_ppgtt_release(struct kref *kref);
 struct i915_hw_ppgtt *i915_ppgtt_create(struct drm_i915_private *dev_priv,
 					struct drm_i915_file_private *fpriv,
 					const char *name);
-void i915_ppgtt_close(struct i915_address_space *vm);
 static inline void i915_ppgtt_get(struct i915_hw_ppgtt *ppgtt)
 {
 	if (ppgtt)
diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index 2ea69394f428..a4e65dd1a73c 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -1712,7 +1712,6 @@ int i915_gem_huge_page_mock_selftests(void)
 	err = i915_subtests(tests, ppgtt);
 
 out_close:
-	i915_ppgtt_close(&ppgtt->base);
 	i915_ppgtt_put(ppgtt);
 
 out_unlock:
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 56a803d11916..0ed6653a2a3d 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -127,10 +127,6 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
 	if (IS_ERR(vma))
 		return PTR_ERR(vma);
 
-	err = i915_gem_object_set_to_gtt_domain(obj, false);
-	if (err)
-		return err;
-
 	err = i915_vma_pin(vma, 0, 0, PIN_HIGH | PIN_USER);
 	if (err)
 		return err;
@@ -173,7 +169,7 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
 	i915_vma_unpin(batch);
 	i915_vma_close(batch);
 
-	i915_vma_move_to_active(vma, rq, 0);
+	i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
 	i915_vma_unpin(vma);
 
 	reservation_object_lock(obj->resv, NULL);
@@ -220,7 +216,8 @@ static int cpu_fill(struct drm_i915_gem_object *obj, u32 value)
 	return 0;
 }
 
-static int cpu_check(struct drm_i915_gem_object *obj, unsigned int max)
+static noinline int cpu_check(struct drm_i915_gem_object *obj,
+			      unsigned int idx, unsigned int max)
 {
 	unsigned int n, m, needs_flush;
 	int err;
@@ -238,8 +235,8 @@ static int cpu_check(struct drm_i915_gem_object *obj, unsigned int max)
 
 		for (m = 0; m < max; m++) {
 			if (map[m] != m) {
-				pr_err("Invalid value at page %d, offset %d: found %x expected %x\n",
-				       n, m, map[m], m);
+				pr_err("%pS: Invalid value at object %d page %d, offset %d: found %x expected %x\n",
+				       __builtin_return_address(0), idx, n, m, map[m], m);
 				err = -EINVAL;
 				goto out_unmap;
 			}
@@ -247,8 +244,8 @@ static int cpu_check(struct drm_i915_gem_object *obj, unsigned int max)
 
 		for (; m < DW_PER_PAGE; m++) {
 			if (map[m] != 0xdeadbeef) {
-				pr_err("Invalid value at page %d, offset %d: found %x expected %x\n",
-				       n, m, map[m], 0xdeadbeef);
+				pr_err("%pS: Invalid value at object %d page %d, offset %d: found %x expected %x\n",
+				       __builtin_return_address(0), idx, n, m, map[m], 0xdeadbeef);
 				err = -EINVAL;
 				goto out_unmap;
 			}
@@ -311,6 +308,10 @@ create_test_object(struct i915_gem_context *ctx,
 		return ERR_PTR(err);
 	}
 
+	err = i915_gem_object_set_to_gtt_domain(obj, false);
+	if (err)
+		return ERR_PTR(err);
+
 	list_add_tail(&obj->st_link, objects);
 	return obj;
 }
@@ -326,12 +327,8 @@ static unsigned long max_dwords(struct drm_i915_gem_object *obj)
 static int igt_ctx_exec(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
-	struct drm_i915_gem_object *obj = NULL;
-	struct drm_file *file;
-	IGT_TIMEOUT(end_time);
-	LIST_HEAD(objects);
-	unsigned long ncontexts, ndwords, dw;
-	bool first_shared_gtt = true;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
 	int err = -ENODEV;
 
 	/* Create a few different contexts (with different mm) and write
@@ -339,37 +336,162 @@ static int igt_ctx_exec(void *arg)
 	 * up in the expected pages of our obj.
 	 */
 
-	file = mock_file(i915);
-	if (IS_ERR(file))
-		return PTR_ERR(file);
+	for_each_engine(engine, i915, id) {
+		struct drm_i915_gem_object *obj = NULL;
+		struct drm_file *file;
+		IGT_TIMEOUT(end_time);
+		LIST_HEAD(objects);
+		unsigned long ncontexts, ndwords, dw;
+		bool first_shared_gtt = true;
+
+		if (!intel_engine_can_store_dword(engine))
+			continue;
 
-	mutex_lock(&i915->drm.struct_mutex);
+		file = mock_file(i915);
+		if (IS_ERR(file))
+			return PTR_ERR(file);
 
-	ncontexts = 0;
-	ndwords = 0;
-	dw = 0;
-	while (!time_after(jiffies, end_time)) {
-		struct intel_engine_cs *engine;
-		struct i915_gem_context *ctx;
-		unsigned int id;
+		mutex_lock(&i915->drm.struct_mutex);
 
-		if (first_shared_gtt) {
-			ctx = __create_hw_context(i915, file->driver_priv);
-			first_shared_gtt = false;
-		} else {
-			ctx = i915_gem_create_context(i915, file->driver_priv);
+
+		ncontexts = 0;
+		ndwords = 0;
+		dw = 0;
+		while (!time_after(jiffies, end_time)) {
+			struct i915_gem_context *ctx;
+
+			if (first_shared_gtt) {
+				ctx = __create_hw_context(i915, file->driver_priv);
+				first_shared_gtt = false;
+			} else {
+				ctx = i915_gem_create_context(i915,
+							      file->driver_priv,
+							      CREATE_VM |
+							      CREATE_TIMELINE);
+			}
+			if (IS_ERR(ctx)) {
+				err = PTR_ERR(ctx);
+				goto out_unlock;
+			}
+
+			if (!obj) {
+				obj = create_test_object(ctx, file, &objects);
+				if (IS_ERR(obj)) {
+					err = PTR_ERR(obj);
+					goto out_unlock;
+				}
+			}
+
+			intel_runtime_pm_get(i915);
+			err = gpu_fill(obj, ctx, engine, dw);
+			intel_runtime_pm_put(i915);
+			if (err) {
+				pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) in ctx %u [full-ppgtt? %s], err=%d\n",
+				       ndwords, dw, max_dwords(obj),
+				       engine->name, ctx->hw_id,
+				       yesno(!!ctx->ppgtt), err);
+				goto out_unlock;
+			}
+
+			if (++dw == max_dwords(obj)) {
+				obj = NULL;
+				dw = 0;
+			}
+
+			ndwords++;
+			ncontexts++;
+		}
+
+		pr_info("Submitted %lu contexts to %s, filling %lu dwords\n",
+			ncontexts, engine->name, ndwords);
+
+		ncontexts = dw = 0;
+		list_for_each_entry(obj, &objects, st_link) {
+			unsigned int rem =
+				min_t(unsigned int, ndwords - dw, max_dwords(obj));
+
+			err = cpu_check(obj, ncontexts++, rem);
+			if (err)
+				break;
+
+			dw += rem;
 		}
-		if (IS_ERR(ctx)) {
-			err = PTR_ERR(ctx);
+
+out_unlock:
+		i915_gem_wait_for_idle(i915, I915_WAIT_LOCKED);
+		mutex_unlock(&i915->drm.struct_mutex);
+
+		mock_file_free(i915, file);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int igt_shared_ctx_exec(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	int err = -ENODEV;
+
+	/*
+	 * Create a few different contexts with the same mm and write
+	 * through each ctx using the GPU making sure those writes end
+	 * up in the expected pages of our obj.
+	 */
+
+	for_each_engine(engine, i915, id) {
+		IGT_TIMEOUT(end_time);
+		LIST_HEAD(objects);
+		unsigned long ncontexts, ndwords, dw;
+		struct drm_i915_gem_object *obj = NULL;
+		struct drm_file *file;
+		struct i915_gem_context *parent;
+
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
+		file = mock_file(i915);
+		if (IS_ERR(file))
+			return PTR_ERR(file);
+
+		mutex_lock(&i915->drm.struct_mutex);
+
+		parent = i915_gem_create_context(i915, file->driver_priv,
+						 CREATE_VM | CREATE_TIMELINE);
+		if (IS_ERR(parent)) {
+			err = PTR_ERR(parent);
+			goto out_unlock;
+		}
+
+		if (!parent->ppgtt) {
+			err = 0;
 			goto out_unlock;
 		}
 
-		for_each_engine(engine, i915, id) {
-			if (!intel_engine_can_store_dword(engine))
-				continue;
+
+		ncontexts = 0;
+		ndwords = 0;
+		dw = 0;
+		while (!time_after(jiffies, end_time)) {
+			struct i915_gem_context *ctx;
+
+			ctx = i915_gem_create_context(i915,
+						      file->driver_priv,
+						      0);
+			if (IS_ERR(ctx)) {
+				err = PTR_ERR(ctx);
+				goto out_unlock;
+			}
+
+			i915_ppgtt_get(parent->ppgtt);
+			ctx->ppgtt = parent->ppgtt;
+			ctx->desc_template = parent->desc_template;
 
 			if (!obj) {
-				obj = create_test_object(ctx, file, &objects);
+				obj = create_test_object(parent, file, &objects);
 				if (IS_ERR(obj)) {
 					err = PTR_ERR(obj);
 					goto out_unlock;
@@ -391,30 +513,35 @@ static int igt_ctx_exec(void *arg)
 				obj = NULL;
 				dw = 0;
 			}
+
 			ndwords++;
+			ncontexts++;
 		}
-		ncontexts++;
-	}
-	pr_info("Submitted %lu contexts (across %u engines), filling %lu dwords\n",
-		ncontexts, INTEL_INFO(i915)->num_rings, ndwords);
+		pr_info("Submitted %lu contexts to %s, filling %lu dwords\n",
+			ncontexts, engine->name, ndwords);
 
-	dw = 0;
-	list_for_each_entry(obj, &objects, st_link) {
-		unsigned int rem =
-			min_t(unsigned int, ndwords - dw, max_dwords(obj));
+		ncontexts = dw = 0;
+		list_for_each_entry(obj, &objects, st_link) {
+			unsigned int rem =
+				min_t(unsigned int, ndwords - dw, max_dwords(obj));
 
-		err = cpu_check(obj, rem);
-		if (err)
-			break;
+			err = cpu_check(obj, ncontexts++, rem);
+			if (err)
+				break;
 
-		dw += rem;
-	}
+			dw += rem;
+		}
 
 out_unlock:
-	mutex_unlock(&i915->drm.struct_mutex);
+		i915_gem_wait_for_idle(i915, I915_WAIT_LOCKED);
+		mutex_unlock(&i915->drm.struct_mutex);
 
-	mock_file_free(i915, file);
-	return err;
+		mock_file_free(i915, file);
+		if (err)
+			return err;
+	}
+
+	return 0;
 }
 
 static int fake_aliasing_ppgtt_enable(struct drm_i915_private *i915)
@@ -448,6 +575,7 @@ int i915_gem_context_live_selftests(struct drm_i915_private *dev_priv)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_ctx_exec),
+		SUBTEST(igt_shared_ctx_exec),
 	};
 	bool fake_alias = false;
 	int err;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index bb7cf998fc65..d6c21a0b66ad 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -914,7 +914,6 @@ static int exercise_ppgtt(struct drm_i915_private *dev_priv,
 
 	err = func(dev_priv, &ppgtt->base, 0, ppgtt->base.total, end_time);
 
-	i915_ppgtt_close(&ppgtt->base);
 	i915_ppgtt_put(ppgtt);
 out_unlock:
 	mutex_unlock(&dev_priv->drm.struct_mutex);
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index bbf80d42e793..37df29ae9668 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -90,5 +90,6 @@ live_context(struct drm_i915_private *i915, struct drm_file *file)
 {
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
-	return i915_gem_create_context(i915, file->driver_priv);
+	return i915_gem_create_context(i915, file->driver_priv,
+				       CREATE_VM | CREATE_TIMELINE);
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index c7a90ab35db8..bc3e25b09f75 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -382,7 +382,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_SET_SPRITE_COLORKEY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_SET_SPRITE_COLORKEY, struct drm_intel_sprite_colorkey)
 #define DRM_IOCTL_I915_GET_SPRITE_COLORKEY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GET_SPRITE_COLORKEY, struct drm_intel_sprite_colorkey)
 #define DRM_IOCTL_I915_GEM_WAIT		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_WAIT, struct drm_i915_gem_wait)
-#define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
+#define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create_v2)
 #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
 #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
 #define DRM_IOCTL_I915_GET_RESET_STATS		DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats)
@@ -1400,6 +1400,15 @@ struct drm_i915_gem_context_create {
 	__u32 pad;
 };
 
+struct drm_i915_gem_context_create_v2 {
+	/*  output: id of new context*/
+	__u32 ctx_id;
+	__u32 flags;
+#define I915_GEM_CONTEXT_SHARE_GTT 0x1
+	__u32 share_ctx;
+	__u32 pad;
+};
+
 struct drm_i915_gem_context_destroy {
 	__u32 ctx_id;
 	__u32 pad;
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [RFC 03/10] drm/i915: Select engines via class and instance in execbuffer2
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
  2018-01-25 13:33 ` [RFC 01/10] move-timeline-to-ctx Tvrtko Ursulin
  2018-01-25 13:33 ` [RFC 02/10] drm/i915: Extend CREATE_CONTEXT to allow inheritance ala clone() Tvrtko Ursulin
@ 2018-01-25 13:33 ` Tvrtko Ursulin
  2018-01-25 13:33 ` [RFC 04/10] drm/i915: Engine capabilities uAPI Tvrtko Ursulin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Introduce a new way of selecting engines using the engine class
and instance concept.

This is primarily interesting for VCS engine selection, which
a) is currently done via a disjoint set of flags, and b) relies
on the I915_EXEC_BSD flag, whose semantics differ depending on
the underlying hardware, which is bad.

The proposed idea is to reserve 8 bits of flags to pass in the
engine instance, re-use the existing engine selection bits for
the class selection, and add a new flag named
I915_EXEC_CLASS_INSTANCE to tell the kernel this new engine
selection API is in use.

The new uAPI also removes access to the weak VCS engine
balancing that currently exists in the driver.

Example usage to send a command to VCS0:

  eb.flags = i915_execbuffer2_engine(I915_ENGINE_CLASS_VIDEO_DECODE, 0);

Or to send a command to VCS1:

  eb.flags = i915_execbuffer2_engine(I915_ENGINE_CLASS_VIDEO_DECODE, 1);
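
For illustration only, the flag packing described above can be sketched
in standalone C roughly as follows. All SKETCH_* names and sketch_*
helpers are hypothetical. The bit positions mirror the defines added by
this patch, and the 0x3f ring mask is assumed to match
I915_EXEC_RING_MASK in the existing uAPI header.

```c
#include <stdint.h>

/* Hypothetical stand-ins; values assumed to mirror the uAPI defines. */
#define SKETCH_EXEC_RING_MASK		(0x3f)
#define SKETCH_EXEC_CLASS_INSTANCE	(1ULL << 20)
#define SKETCH_EXEC_INSTANCE_SHIFT	(21)
#define SKETCH_EXEC_INSTANCE_MASK	(0xffULL << SKETCH_EXEC_INSTANCE_SHIFT)

/* Pack class and instance into execbuf flags and mark the new API in use. */
static uint64_t sketch_execbuffer2_engine(uint8_t engine_class,
					  uint8_t instance)
{
	return SKETCH_EXEC_CLASS_INSTANCE |
	       engine_class |
	       ((uint64_t)instance << SKETCH_EXEC_INSTANCE_SHIFT);
}

/* The class lives in the low bits previously used for ring selection. */
static uint8_t sketch_flags_class(uint64_t flags)
{
	return flags & SKETCH_EXEC_RING_MASK;
}

/* The instance occupies the eight reserved bits starting at bit 21. */
static uint8_t sketch_flags_instance(uint64_t flags)
{
	return (flags & SKETCH_EXEC_INSTANCE_MASK) >>
	       SKETCH_EXEC_INSTANCE_SHIFT;
}
```

Decoding the value returned by sketch_execbuffer2_engine(c, i) with the
two helpers is expected to give back c and i, which is the property the
kernel side relies on when it calls the lookup.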

v2:
 * Fix unknown flags mask.
 * Use I915_EXEC_RING_MASK for class. (Chris Wilson)

v3:
 * Add a map for fast class-instance engine lookup. (Chris Wilson)

v4:
 * Update commit to reflect v3.
 * Export intel_engine_lookup for other users. (Chris Wilson)
 * Split out some warns. (Chris Wilson)

v5:
 * Fixed shift and mask logic.
 * Rebased to be standalone.

v6:
 * Rebased back to follow engine info ioctl.
 * Rename helper to intel_engine_lookup_user. (Chris Wilson)

v7:
 * Rebased.

v8:
 * Rebased after engine class uAPI got in via separate route.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 +++++++++++++
 include/uapi/drm/i915_drm.h                | 12 +++++++++++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index cd482b981fdd..29e346ca0898 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2015,6 +2015,16 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
 	return file_priv->bsd_engine;
 }
 
+static struct intel_engine_cs *
+eb_select_engine_class_instance(struct drm_i915_private *i915, u64 eb_flags)
+{
+	u8 class = eb_flags & I915_EXEC_RING_MASK;
+	u8 instance = (eb_flags & I915_EXEC_INSTANCE_MASK) >>
+		      I915_EXEC_INSTANCE_SHIFT;
+
+	return intel_engine_lookup_user(i915, class, instance);
+}
+
 #define I915_USER_RINGS (4)
 
 static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
@@ -2033,6 +2043,9 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 	unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
 	struct intel_engine_cs *engine;
 
+	if (args->flags & I915_EXEC_CLASS_INSTANCE)
+		return eb_select_engine_class_instance(dev_priv, args->flags);
+
 	if (user_ring_id > I915_USER_RINGS) {
 		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
 		return NULL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index bc3e25b09f75..28ae31a2accf 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1063,7 +1063,12 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+#define I915_EXEC_CLASS_INSTANCE	(1<<20)
+
+#define I915_EXEC_INSTANCE_SHIFT	(21)
+#define I915_EXEC_INSTANCE_MASK		(0xff << I915_EXEC_INSTANCE_SHIFT)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-((1 << 29) << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1071,6 +1076,11 @@ struct drm_i915_gem_execbuffer2 {
 #define i915_execbuffer2_get_context_id(eb2) \
 	((eb2).rsvd1 & I915_EXEC_CONTEXT_ID_MASK)
 
+#define i915_execbuffer2_engine(class, instance) \
+	(I915_EXEC_CLASS_INSTANCE | \
+	(class) | \
+	((instance) << I915_EXEC_INSTANCE_SHIFT))
+
 struct drm_i915_gem_pin {
 	/** Handle of the buffer to be pinned. */
 	__u32 handle;
-- 
2.14.1

* [RFC 04/10] drm/i915: Engine capabilities uAPI
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2018-01-25 13:33 ` [RFC 03/10] drm/i915: Select engines via class and instance in execbuffer2 Tvrtko Ursulin
@ 2018-01-25 13:33 ` Tvrtko Ursulin
  2018-01-25 13:33 ` [RFC 05/10] drm/i915: Re-arrange execbuf so context is known before engine Tvrtko Ursulin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

To express that the VCS1 engine does not support HEVC, we introduce
the concept of engine capabilities. These are flags defined in a
per-engine-class space which can be passed in at execbuf time. The
driver is then able to fail the execbuf in case of a mismatch between
the requested capabilities and the selected target engine.
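
As a rough standalone sketch of the check, under the assumption that
capabilities sit in four flag bits starting at bit 29 as in this patch:
a request is acceptable only if every requested capability bit is also
set on the engine. The SKETCH_* names and sketch_* helpers are
hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the capability defines in this patch. */
#define SKETCH_EXEC_ENGINE_CAP_SHIFT	(29)
#define SKETCH_EXEC_ENGINE_CAP_MASK	(0xfULL << SKETCH_EXEC_ENGINE_CAP_SHIFT)
#define SKETCH_BSD_CAP_HEVC		(1u << 0)

/* Extract the requested capability bits from the execbuf flags. */
static uint8_t sketch_requested_caps(uint64_t eb_flags)
{
	return (eb_flags & SKETCH_EXEC_ENGINE_CAP_MASK) >>
	       SKETCH_EXEC_ENGINE_CAP_SHIFT;
}

/* True when the requested capabilities are a subset of the engine's. */
static bool sketch_caps_satisfied(uint8_t engine_caps, uint64_t eb_flags)
{
	uint8_t caps = sketch_requested_caps(eb_flags);

	return (caps & engine_caps) == caps;
}
```

With these definitions a batch requesting SKETCH_BSD_CAP_HEVC would pass
against an engine advertising that bit and fail against one advertising
none, while a batch requesting nothing passes on any engine.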

v2: Use BIT_ULL for flags. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 19 +++++++++++++++----
 drivers/gpu/drm/i915/intel_engine_cs.c     |  3 +++
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  2 ++
 include/uapi/drm/i915_drm.h                | 17 ++++++++++++++++-
 4 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 29e346ca0898..3abe8a69e313 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -56,9 +56,10 @@ enum {
 #define __EXEC_OBJECT_INTERNAL_FLAGS	(~0u << 27) /* all of the above */
 #define __EXEC_OBJECT_RESERVED (__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_FENCE)
 
-#define __EXEC_HAS_RELOC	BIT(31)
-#define __EXEC_VALIDATED	BIT(30)
-#define __EXEC_INTERNAL_FLAGS	(~0u << 30)
+#define __EXEC_HAS_RELOC	BIT_ULL(63)
+#define __EXEC_VALIDATED	BIT_ULL(62)
+#define __EXEC_INTERNAL_FLAGS	(~0ULL << 62)
+
 #define UPDATE			PIN_OFFSET_FIXED
 
 #define BATCH_OFFSET_BIAS (256*1024)
@@ -2021,8 +2022,16 @@ eb_select_engine_class_instance(struct drm_i915_private *i915, u64 eb_flags)
 	u8 class = eb_flags & I915_EXEC_RING_MASK;
 	u8 instance = (eb_flags & I915_EXEC_INSTANCE_MASK) >>
 		      I915_EXEC_INSTANCE_SHIFT;
+	u8 caps = (eb_flags & I915_EXEC_ENGINE_CAP_MASK) >>
+		  I915_EXEC_ENGINE_CAP_SHIFT;
+	struct intel_engine_cs *engine;
 
-	return intel_engine_lookup_user(i915, class, instance);
+	engine = intel_engine_lookup_user(i915, class, instance);
+
+	if (engine && ((caps & engine->caps) != caps))
+		return NULL;
+
+	return engine;
 }
 
 #define I915_USER_RINGS (4)
@@ -2232,6 +2241,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	BUILD_BUG_ON(__EXEC_OBJECT_INTERNAL_FLAGS &
 		     ~__EXEC_OBJECT_UNKNOWN_FLAGS);
 
+	BUILD_BUG_ON(__EXEC_INTERNAL_FLAGS & ~__I915_EXEC_UNKNOWN_FLAGS);
+
 	eb.i915 = to_i915(dev);
 	eb.file = file;
 	eb.args = args;
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 5d49f319220b..e02627618bc5 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -234,6 +234,9 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
 	engine->irq_shift = info->irq_shift;
 	engine->class = info->class;
 	engine->instance = info->instance;
+	if (INTEL_GEN(dev_priv) >= 8 && engine->class == VIDEO_DECODE_CLASS &&
+	    engine->instance == 0)
+		engine->caps = I915_BSD_CAP_HEVC;
 
 	engine->uabi_id = info->uabi_id;
 	engine->uabi_class = class_info->uabi_class;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index aab7bd61ae10..76f7fdc926ae 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -297,6 +297,8 @@ struct intel_engine_cs {
 
 	u8 class;
 	u8 instance;
+	u8 caps;
+
 	u32 context_size;
 	u32 mmio_base;
 	unsigned int irq_shift;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 28ae31a2accf..8b8b70c5a50b 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1068,7 +1068,18 @@ struct drm_i915_gem_execbuffer2 {
 #define I915_EXEC_INSTANCE_SHIFT	(21)
 #define I915_EXEC_INSTANCE_MASK		(0xff << I915_EXEC_INSTANCE_SHIFT)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-((1 << 29) << 1))
+/*
+ * Inform the kernel of what engine capabilities this batch buffer
+ * requires. For example only the first VCS engine has the HEVC block.
+ *
+ * We reserve four bits for the capabilities where each can be shared
+ * between different engines. Eg. first bit can mean one feature for
+ * one engine and something else for the other.
+ */
+#define I915_EXEC_ENGINE_CAP_SHIFT	(29)
+#define I915_EXEC_ENGINE_CAP_MASK	(0xfULL << I915_EXEC_ENGINE_CAP_SHIFT)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-((1ULL << 33) << 1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1081,6 +1092,10 @@ struct drm_i915_gem_execbuffer2 {
 	(class) | \
 	((instance) << I915_EXEC_INSTANCE_SHIFT))
 
+#define I915_BSD_CAP_HEVC	(1 << 0)
+
+#define I915_ENGINE_CAP_FLAG(v)	((__u64)(v) << I915_EXEC_ENGINE_CAP_SHIFT)
+
 struct drm_i915_gem_pin {
 	/** Handle of the buffer to be pinned. */
 	__u32 handle;
-- 
2.14.1

* [RFC 05/10] drm/i915: Re-arrange execbuf so context is known before engine
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2018-01-25 13:33 ` [RFC 04/10] drm/i915: Engine capabilities uAPI Tvrtko Ursulin
@ 2018-01-25 13:33 ` Tvrtko Ursulin
  2018-01-25 13:33 ` [RFC 06/10] drm/i915: Refactor eb_select_engine to take eb Tvrtko Ursulin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

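Move engine selection after the context lookup so that the context is
known by the time the engine is selected. Failures in engine selection
now unwind through a new err_engine label which drops the context
reference.

Needed for a following patch.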

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 38 ++++++++++++++++--------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3abe8a69e313..384b6bd9a4a7 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2273,24 +2273,6 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (args->flags & I915_EXEC_IS_PINNED)
 		eb.batch_flags |= I915_DISPATCH_PINNED;
 
-	eb.engine = eb_select_engine(eb.i915, file, args);
-	if (!eb.engine)
-		return -EINVAL;
-
-	if (args->flags & I915_EXEC_RESOURCE_STREAMER) {
-		if (!HAS_RESOURCE_STREAMER(eb.i915)) {
-			DRM_DEBUG("RS is only allowed for Haswell, Gen8 and above\n");
-			return -EINVAL;
-		}
-		if (eb.engine->id != RCS) {
-			DRM_DEBUG("RS is not available on %s\n",
-				 eb.engine->name);
-			return -EINVAL;
-		}
-
-		eb.batch_flags |= I915_DISPATCH_RS;
-	}
-
 	if (args->flags & I915_EXEC_FENCE_IN) {
 		in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
 		if (!in_fence)
@@ -2315,6 +2297,25 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (unlikely(err))
 		goto err_destroy;
 
+	err = -EINVAL;
+	eb.engine = eb_select_engine(eb.i915, file, args);
+	if (!eb.engine)
+		goto err_engine;
+
+	if (args->flags & I915_EXEC_RESOURCE_STREAMER) {
+		if (!HAS_RESOURCE_STREAMER(eb.i915)) {
+			DRM_DEBUG("RS is only allowed for Haswell, Gen8 and above\n");
+			goto err_engine;
+		}
+		if (eb.engine->id != RCS) {
+			DRM_DEBUG("RS is not available on %s\n",
+				  eb.engine->name);
+			goto err_engine;
+		}
+
+		eb.batch_flags |= I915_DISPATCH_RS;
+	}
+
 	/*
 	 * Take a local wakeref for preparing to dispatch the execbuf as
 	 * we expect to access the hardware fairly frequently in the
@@ -2475,6 +2476,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	mutex_unlock(&dev->struct_mutex);
 err_rpm:
 	intel_runtime_pm_put(eb.i915);
+err_engine:
 	i915_gem_context_put(eb.ctx);
 err_destroy:
 	eb_destroy(&eb);
-- 
2.14.1

* [RFC 06/10] drm/i915: Refactor eb_select_engine to take eb
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2018-01-25 13:33 ` [RFC 05/10] drm/i915: Re-arrange execbuf so context is known before engine Tvrtko Ursulin
@ 2018-01-25 13:33 ` Tvrtko Ursulin
  2018-01-25 13:33 ` [RFC 07/10] drm/i915: Track latest request per engine class Tvrtko Ursulin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

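Make eb_select_engine and eb_select_engine_class_instance take the
execbuffer struct, store the selected engine in eb->engine and return
an error code instead of an engine pointer.

Refactoring to enable future patches.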
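
A minimal standalone sketch of the resulting calling convention is
below. The sketch_* types and the two-entry engine table are
hypothetical. Returning -EINVAL when the lookup finds no engine is an
assumption of this sketch about the intended behaviour, matching what
the NULL return conveyed before the refactor.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the engine and execbuffer structures. */
struct sketch_engine {
	int id;
};

struct sketch_execbuffer {
	struct sketch_engine *engine;
};

static struct sketch_engine sketch_engines[2] = { { 0 }, { 1 } };

/* Pointer-returning lookup: NULL means no such engine. */
static struct sketch_engine *sketch_lookup(unsigned int idx)
{
	return idx < 2 ? &sketch_engines[idx] : NULL;
}

/*
 * Error-code convention: the result is stored in eb->engine and
 * the return value is zero on success or a negative errno.
 */
static int sketch_select_engine(struct sketch_execbuffer *eb,
				unsigned int idx)
{
	struct sketch_engine *engine = sketch_lookup(idx);

	if (!engine)
		return -EINVAL;

	eb->engine = engine;

	return 0;
}
```

The caller then only needs to test the return value and can pass it
straight to its unwind path, rather than choosing an error code itself.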

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 49 ++++++++++++++++--------------
 1 file changed, 26 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 384b6bd9a4a7..78340ceb1105 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2016,9 +2016,9 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
 	return file_priv->bsd_engine;
 }
 
-static struct intel_engine_cs *
-eb_select_engine_class_instance(struct drm_i915_private *i915, u64 eb_flags)
+static int eb_select_engine_class_instance(struct i915_execbuffer *eb)
 {
+	u64 eb_flags = eb->args->flags;
 	u8 class = eb_flags & I915_EXEC_RING_MASK;
 	u8 instance = (eb_flags & I915_EXEC_INSTANCE_MASK) >>
 		      I915_EXEC_INSTANCE_SHIFT;
@@ -2026,12 +2026,14 @@ eb_select_engine_class_instance(struct drm_i915_private *i915, u64 eb_flags)
 		  I915_EXEC_ENGINE_CAP_SHIFT;
 	struct intel_engine_cs *engine;
 
-	engine = intel_engine_lookup_user(i915, class, instance);
+	engine = intel_engine_lookup_user(eb->i915, class, instance);
 
 	if (engine && ((caps & engine->caps) != caps))
-		return NULL;
+		return -EINVAL;
 
-	return engine;
+	eb->engine = engine;
+
+	return 0;
 }
 
 #define I915_USER_RINGS (4)
@@ -2044,31 +2046,30 @@ static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
 	[I915_EXEC_VEBOX]	= VECS
 };
 
-static struct intel_engine_cs *
-eb_select_engine(struct drm_i915_private *dev_priv,
-		 struct drm_file *file,
-		 struct drm_i915_gem_execbuffer2 *args)
+static int eb_select_engine(struct i915_execbuffer *eb, struct drm_file *file)
 {
-	unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
+	struct drm_i915_private *dev_priv = eb->i915;
+	u64 flags = eb->args->flags;
+	unsigned int user_ring_id = flags & I915_EXEC_RING_MASK;
 	struct intel_engine_cs *engine;
 
-	if (args->flags & I915_EXEC_CLASS_INSTANCE)
-		return eb_select_engine_class_instance(dev_priv, args->flags);
+	if (flags & I915_EXEC_CLASS_INSTANCE)
+		return eb_select_engine_class_instance(eb);
 
 	if (user_ring_id > I915_USER_RINGS) {
 		DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
-		return NULL;
+		return -EINVAL;
 	}
 
 	if ((user_ring_id != I915_EXEC_BSD) &&
-	    ((args->flags & I915_EXEC_BSD_MASK) != 0)) {
+	    ((flags & I915_EXEC_BSD_MASK) != 0)) {
 		DRM_DEBUG("execbuf with non bsd ring but with invalid "
-			  "bsd dispatch flags: %d\n", (int)(args->flags));
-		return NULL;
+			  "bsd dispatch flags: %llx\n", flags);
+		return -EINVAL;
 	}
 
 	if (user_ring_id == I915_EXEC_BSD && HAS_BSD2(dev_priv)) {
-		unsigned int bsd_idx = args->flags & I915_EXEC_BSD_MASK;
+		unsigned int bsd_idx = flags & I915_EXEC_BSD_MASK;
 
 		if (bsd_idx == I915_EXEC_BSD_DEFAULT) {
 			bsd_idx = gen8_dispatch_bsd_engine(dev_priv, file);
@@ -2079,7 +2080,7 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 		} else {
 			DRM_DEBUG("execbuf with unknown bsd ring: %u\n",
 				  bsd_idx);
-			return NULL;
+			return -EINVAL;
 		}
 
 		engine = dev_priv->engine[_VCS(bsd_idx)];
@@ -2089,10 +2090,12 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 
 	if (!engine) {
 		DRM_DEBUG("execbuf with invalid ring: %u\n", user_ring_id);
-		return NULL;
+		return -EINVAL;
 	}
 
-	return engine;
+	eb->engine = engine;
+
+	return 0;
 }
 
 static void
@@ -2297,12 +2300,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (unlikely(err))
 		goto err_destroy;
 
-	err = -EINVAL;
-	eb.engine = eb_select_engine(eb.i915, file, args);
-	if (!eb.engine)
+	err = eb_select_engine(&eb, file);
+	if (err)
 		goto err_engine;
 
 	if (args->flags & I915_EXEC_RESOURCE_STREAMER) {
+		err = -EINVAL;
 		if (!HAS_RESOURCE_STREAMER(eb.i915)) {
 			DRM_DEBUG("RS is only allowed for Haswell, Gen8 and above\n");
 			goto err_engine;
-- 
2.14.1

* [RFC 07/10] drm/i915: Track latest request per engine class
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2018-01-25 13:33 ` [RFC 06/10] drm/i915: Refactor eb_select_engine to take eb Tvrtko Ursulin
@ 2018-01-25 13:33 ` Tvrtko Ursulin
  2018-01-25 13:33 ` [RFC 08/10] drm/i915: Allow creating virtual contexts Tvrtko Ursulin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We add a per-context, per-engine-class timeline so we are later able to
implement a virtual engine by creating implicit dependencies between
requests submitted to the same engine class.
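
To illustrate the idea only: the sketch below keeps one timeline per
engine class inside a context and chains each new request to the
previous request of the same class, whichever instance it ran on. All
sketch_* names are hypothetical, and tracking just the last request is
a simplification; the patch itself links requests into a per-class list
via ctx_link and does not create the dependencies yet.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NUM_CLASSES 4 /* hypothetical number of engine classes */

struct sketch_request {
	unsigned int instance;			/* engine instance used */
	struct sketch_request *prev_in_class;	/* implicit dependency */
};

struct sketch_timeline {
	struct sketch_request *last;
};

struct sketch_context {
	struct sketch_timeline class_tl[SKETCH_NUM_CLASSES];
};

static struct sketch_timeline *
sketch_lookup_timeline_class(struct sketch_context *ctx,
			     unsigned char engine_class)
{
	return &ctx->class_tl[engine_class];
}

/* Record rq on its class timeline, chaining it to the previous one. */
static void sketch_add_request(struct sketch_context *ctx,
			       unsigned char engine_class,
			       struct sketch_request *rq)
{
	struct sketch_timeline *tl =
		sketch_lookup_timeline_class(ctx, engine_class);

	rq->prev_in_class = tl->last;
	tl->last = rq;
}

/* Returns 1 if requests chain within a class and not across classes. */
static int sketch_demo(void)
{
	struct sketch_context ctx;
	struct sketch_request a = { 0, NULL }, b = { 1, NULL }, c = { 0, NULL };

	memset(&ctx, 0, sizeof(ctx));
	sketch_add_request(&ctx, 1, &a);	/* class 1, instance 0 */
	sketch_add_request(&ctx, 1, &b);	/* class 1, instance 1 */
	sketch_add_request(&ctx, 0, &c);	/* class 0, instance 0 */

	return b.prev_in_class == &a &&
	       a.prev_in_class == NULL &&
	       c.prev_in_class == NULL;
}
```

In sketch_demo() the second class 1 request is chained to the first even
though it targets a different instance, which is the kind of ordering a
virtual engine built on top of this tracking would need.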

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |  6 ++++++
 drivers/gpu/drm/i915/i915_gem_request.c  | 13 +++++++++++++
 drivers/gpu/drm/i915/i915_gem_request.h  |  2 ++
 drivers/gpu/drm/i915/i915_gem_timeline.c | 12 ++++++++++++
 drivers/gpu/drm/i915/i915_gem_timeline.h |  2 ++
 5 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0c348f6ab386..d20c6c542c2d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3449,6 +3449,12 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+static inline struct intel_timeline *
+i915_gem_context_lookup_timeline_class(struct i915_gem_context *ctx, u8 class)
+{
+	return &ctx->timeline->class[class];
+}
+
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file);
 int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 160d81bf6d85..85da7f7eee03 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -380,6 +380,7 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 {
 	struct intel_engine_cs *engine = request->engine;
 	struct i915_gem_active *active, *next;
+	struct intel_timeline *timeline;
 
 	lockdep_assert_held(&request->i915->drm.struct_mutex);
 	GEM_BUG_ON(!i915_sw_fence_signaled(&request->submit));
@@ -392,6 +393,12 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	list_del_init(&request->link);
 	spin_unlock_irq(&engine->timeline->lock);
 
+	timeline = i915_gem_context_lookup_timeline_class(request->ctx,
+							  engine->class);
+	spin_lock_irq(&timeline->lock);
+	list_del(&request->ctx_link);
+	spin_unlock_irq(&timeline->lock);
+
 	unreserve_engine(request->engine);
 	advance_ring(request);
 
@@ -1054,6 +1061,12 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 	GEM_BUG_ON(timeline->seqno != request->fence.seqno);
 	i915_gem_active_set(&timeline->last_request, request);
 
+	timeline = i915_gem_context_lookup_timeline_class(request->ctx,
+							  engine->class);
+	spin_lock_irq(&timeline->lock);
+	list_add(&request->ctx_link, &timeline->requests);
+	spin_unlock_irq(&timeline->lock);
+
 	list_add_tail(&request->ring_link, &ring->request_list);
 	request->emitted_jiffies = jiffies;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
index 2236e9188c5c..f8915ff3d62a 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -196,6 +196,8 @@ struct drm_i915_gem_request {
 	/** engine->request_list entry for this request */
 	struct list_head link;
 
+	struct list_head ctx_link;
+
 	/** ring->request_list entry for this request */
 	struct list_head ring_link;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.c b/drivers/gpu/drm/i915/i915_gem_timeline.c
index 1bf48bdb78c4..902eb92b8883 100644
--- a/drivers/gpu/drm/i915/i915_gem_timeline.c
+++ b/drivers/gpu/drm/i915/i915_gem_timeline.c
@@ -75,6 +75,10 @@ static int __i915_gem_timeline_init(struct drm_i915_private *i915,
 
 	/* Called during early_init before we know how many engines there are */
 	fences = dma_fence_context_alloc(ARRAY_SIZE(timeline->engine));
+	for (i = 0; i < ARRAY_SIZE(timeline->class); i++)
+		__intel_timeline_init(&timeline->class[i],
+				      timeline, fences++,
+				      lockclass, lockname);
 	for (i = 0; i < ARRAY_SIZE(timeline->engine); i++)
 		__intel_timeline_init(&timeline->engine[i],
 				      timeline, fences++,
@@ -137,6 +141,11 @@ void i915_gem_timelines_park(struct drm_i915_private *i915)
 	lockdep_assert_held(&i915->drm.struct_mutex);
 
 	list_for_each_entry(timeline, &i915->gt.timelines, link) {
+		for (i = 0; i < ARRAY_SIZE(timeline->class); i++) {
+			struct intel_timeline *tl = &timeline->class[i];
+
+			i915_syncmap_free(&tl->sync);
+		}
 		for (i = 0; i < ARRAY_SIZE(timeline->engine); i++) {
 			struct intel_timeline *tl = &timeline->engine[i];
 
@@ -157,6 +166,9 @@ void i915_gem_timeline_fini(struct i915_gem_timeline *timeline)
 
 	lockdep_assert_held(&timeline->i915->drm.struct_mutex);
 
+	for (i = 0; i < ARRAY_SIZE(timeline->class); i++)
+		__intel_timeline_fini(&timeline->class[i]);
+
 	for (i = 0; i < ARRAY_SIZE(timeline->engine); i++)
 		__intel_timeline_fini(&timeline->engine[i]);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_timeline.h b/drivers/gpu/drm/i915/i915_gem_timeline.h
index 7ecf0a253d78..f625ef2ad14e 100644
--- a/drivers/gpu/drm/i915/i915_gem_timeline.h
+++ b/drivers/gpu/drm/i915/i915_gem_timeline.h
@@ -30,6 +30,7 @@
 #include "i915_utils.h"
 #include "i915_gem_request.h"
 #include "i915_syncmap.h"
+#include "i915_reg.h"
 
 struct i915_gem_timeline;
 
@@ -86,6 +87,7 @@ struct i915_gem_timeline {
 	struct drm_i915_private *i915;
 	const char *name;
 
+	struct intel_timeline class[MAX_ENGINE_CLASS];
 	struct intel_timeline engine[I915_NUM_ENGINES];
 };
 
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC 08/10] drm/i915: Allow creating virtual contexts
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
                   ` (6 preceding siblings ...)
  2018-01-25 13:33 ` [RFC 07/10] drm/i915: Track latest request per engine class Tvrtko Ursulin
@ 2018-01-25 13:33 ` Tvrtko Ursulin
  2018-01-25 13:33 ` [RFC 09/10] drm/i915: Trivial virtual engine implementation Tvrtko Ursulin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Virtual contexts imply the target engine will be picked by the driver.

v2: Disallow legacy execbuf API.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
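As a reading aid (not part of the patch), the argument checks added to the
create ioctl can be modelled in plain C. This is a minimal sketch under
assumptions: every sketch_* name is hypothetical, the device is reduced to two
booleans standing in for the RCS context_size test and HAS_BSD2(), and the
"virtual" field is renamed virtual_queue here only to keep the snippet friendly
to C++ compilers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Value copied from the diff (I915_GEM_CONTEXT_SHARE_GTT). */
#define SKETCH_CONTEXT_SHARE_GTT 0x1u

/* Hypothetical stand-in for struct drm_i915_gem_context_create_v2. */
struct sketch_create_args {
	uint32_t ctx_id;
	uint32_t flags;
	uint32_t share_ctx;
	uint32_t virtual_queue;	/* the patch names this field "virtual" */
};

/* Hypothetical device description replacing the dev_priv lookups. */
struct sketch_device {
	bool has_logical_contexts;	/* models engine[RCS]->context_size != 0 */
	bool has_bsd2;			/* models HAS_BSD2() */
};

/* Same order of checks as the patched ioctl: bad arguments first, then HW. */
static int sketch_validate_create(const struct sketch_device *dev,
				  const struct sketch_create_args *args)
{
	assert(dev && args);

	if (args->flags & ~SKETCH_CONTEXT_SHARE_GTT)
		return -EINVAL;

	if (args->virtual_queue > 1)
		return -EINVAL;

	if (!dev->has_logical_contexts)
		return -ENODEV;

	/* A virtual context only makes sense with a second VCS engine. */
	if (args->virtual_queue && !dev->has_bsd2)
		return -ENODEV;

	return 0;
}
```

In this model a request with virtual_queue set to one succeeds only on a device
with two VCS engines, and any value above one is rejected before the hardware
is consulted.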
 drivers/gpu/drm/i915/i915_gem_context.c    | 15 ++++++++++-----
 drivers/gpu/drm/i915/i915_gem_context.h    |  2 ++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  6 ++++++
 include/uapi/drm/i915_drm.h                |  2 +-
 4 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 22b0fa170fcf..8f5f23b0dd34 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -667,15 +667,18 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 	unsigned int flags = CREATE_VM | CREATE_TIMELINE;
 	int err;
 
-	if (!dev_priv->engine[RCS]->context_size)
-		return -ENODEV;
-
-	if (args->pad != 0)
+	if (args->flags & ~I915_GEM_CONTEXT_SHARE_GTT)
 		return -EINVAL;
 
-	if (args->flags & ~I915_GEM_CONTEXT_SHARE_GTT)
+	if (args->virtual > 1)
 		return -EINVAL;
 
+	if (!dev_priv->engine[RCS]->context_size)
+		return -ENODEV;
+
+	if (args->virtual && !HAS_BSD2(dev_priv))
+		return -ENODEV;
+
 	if (client_is_banned(file_priv)) {
 		DRM_DEBUG("client %s[%d] banned from creating ctx\n",
 			  current->comm,
@@ -716,6 +719,8 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 
 	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
 
+	ctx->virtual = args->virtual;
+
 	args->ctx_id = ctx->user_handle;
 	DRM_DEBUG("HW context %d created\n", args->ctx_id);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index cfa69b12a6b2..5afe050718b8 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -149,6 +149,8 @@ struct i915_gem_context {
 	 */
 	int priority;
 
+	unsigned int virtual;
+
 	/** ggtt_offset_bias: placement restriction for context objects */
 	u32 ggtt_offset_bias;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 78340ceb1105..fa1806ed9be6 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2026,6 +2026,9 @@ static int eb_select_engine_class_instance(struct i915_execbuffer *eb)
 		  I915_EXEC_ENGINE_CAP_SHIFT;
 	struct intel_engine_cs *engine;
 
+	if (instance && eb->ctx->virtual)
+		return -EINVAL;
+
 	engine = intel_engine_lookup_user(eb->i915, class, instance);
 
 	if (engine && ((caps & engine->caps) != caps))
@@ -2068,6 +2071,9 @@ static int eb_select_engine(struct i915_execbuffer *eb, struct drm_file *file)
 		return -EINVAL;
 	}
 
+	if (user_ring_id == I915_EXEC_BSD && eb->ctx->virtual)
+		return -EINVAL;
+
 	if (user_ring_id == I915_EXEC_BSD && HAS_BSD2(dev_priv)) {
 		unsigned int bsd_idx = flags & I915_EXEC_BSD_MASK;
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 8b8b70c5a50b..42eb34b9c014 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1431,7 +1431,7 @@ struct drm_i915_gem_context_create_v2 {
 	__u32 flags;
 #define I915_GEM_CONTEXT_SHARE_GTT 0x1
 	__u32 share_ctx;
-	__u32 pad;
+	__u32 virtual;
 };
 
 struct drm_i915_gem_context_destroy {
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC 09/10] drm/i915: Trivial virtual engine implementation
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
                   ` (7 preceding siblings ...)
  2018-01-25 13:33 ` [RFC 08/10] drm/i915: Allow creating virtual contexts Tvrtko Ursulin
@ 2018-01-25 13:33 ` Tvrtko Ursulin
  2018-01-25 13:57   ` Chris Wilson
  2018-01-25 13:33 ` [RFC 10/10] drm/i915: Naive engine busyness based load balancing Tvrtko Ursulin
  2018-01-25 14:12 ` ✗ Fi.CI.BAT: failure for Virtual queue/engine uAPI prototype Patchwork
  10 siblings, 1 reply; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Contexts marked as virtual can be load balanced between available engine
instances. In this trivial implementation there are two important points to
keep in mind:

1. Best engine is chosen by round-robin on every submission.

Every time a context is transferred between engines an implicit
synchronization point is created, where the execution on the new engine
can only continue once the execution on the current engine has stopped
(for each context).

The round-robin on every submission is also far from ideal. If desired it
could later be improved with an engine busyness or queue-depth based
approach, both of which were demonstrated to work well when used for
userspace based balancing.

2. The engine is selected at the execbuf level which may be quite distant
in time from when the GPU actually becomes available to run things.

IMPORTANT CAVEAT:
This prototype implementation does not guarantee context state.

To provide context state in this prototype a much more "real" virtual
engine would have to be created which would include involved refactoring.

Userspace which uses specific engine features, not present on all engine
instances, needs to signal this fact via the engine capabilities uAPI.
i915 will then make sure only compatible engines are used for executing
the submission.

v2:
 * Fix GT2 configs and no VCS engine.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
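As a reading aid (not part of the patch), the selection described in point 1
can be modelled outside the driver. This is a minimal sketch under assumptions:
the sketch_* names are hypothetical, engines are reduced to a capability mask,
and C11 atomic_fetch_xor() stands in for the kernel helper of the same name
(note the swapped argument order). It toggles a device-wide index on every
submission and falls back to the other instance when the first candidate lacks
a requested capability.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_NUM_VCS 2

/* Hypothetical stand-in for struct intel_engine_cs; only caps are modelled. */
struct sketch_engine {
	uint8_t caps;
};

/* Models i915->mm.bsd_engine_dispatch_index: one toggle for the whole device. */
static atomic_uint sketch_dispatch_index;

/*
 * Pick a VCS instance round-robin. If the candidate does not provide all
 * requested capabilities, try the other instance. Returns the instance
 * number, or -EINVAL when no instance is compatible.
 */
static int sketch_pick_vcs(const struct sketch_engine engines[SKETCH_NUM_VCS],
			   uint8_t caps)
{
	/* Fetch the old value and flip bit 0 for the next submission. */
	unsigned int instance =
		atomic_fetch_xor(&sketch_dispatch_index, 1u) & 1u;
	unsigned int tries;

	for (tries = 0; tries < SKETCH_NUM_VCS; tries++) {
		assert(instance < SKETCH_NUM_VCS);
		if ((caps & engines[instance].caps) == caps)
			return (int)instance;
		instance ^= 1u;
	}

	return -EINVAL;
}
```

With one engine advertising a capability bit and the other not, submissions
without capability requirements alternate between the two, while submissions
that require the bit always land on the capable engine.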
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 46 +++++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index fa1806ed9be6..f89a7be68133 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -204,6 +204,8 @@ struct i915_execbuffer {
 	struct drm_i915_gem_request *request; /** our request to build */
 	struct i915_vma *batch; /** identity of the batch obj/vma */
 
+	struct drm_i915_gem_request *prev_request; /** request to depend on */
+
 	/** actual size of execobj[] as we may extend it for the cmdparser */
 	unsigned int buffer_count;
 
@@ -2018,23 +2020,51 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
 
 static int eb_select_engine_class_instance(struct i915_execbuffer *eb)
 {
+	struct drm_i915_private *i915 = eb->i915;
 	u64 eb_flags = eb->args->flags;
 	u8 class = eb_flags & I915_EXEC_RING_MASK;
 	u8 instance = (eb_flags & I915_EXEC_INSTANCE_MASK) >>
 		      I915_EXEC_INSTANCE_SHIFT;
 	u8 caps = (eb_flags & I915_EXEC_ENGINE_CAP_MASK) >>
 		  I915_EXEC_ENGINE_CAP_SHIFT;
+	struct drm_i915_gem_request *prev_req = NULL;
 	struct intel_engine_cs *engine;
 
-	if (instance && eb->ctx->virtual)
+	if (eb->ctx->virtual && instance) {
 		return -EINVAL;
+	} else if ((HAS_BSD(i915) && HAS_BSD2(i915)) && eb->ctx->virtual &&
+		   class == I915_ENGINE_CLASS_VIDEO) {
+		unsigned int vcs_instances = 2;
+		struct intel_timeline *timeline;
 
-	engine = intel_engine_lookup_user(eb->i915, class, instance);
+		instance = atomic_fetch_xor(1,
+					    &i915->mm.bsd_engine_dispatch_index);
 
-	if (engine && ((caps & engine->caps) != caps))
-		return -EINVAL;
+		do {
+			engine = i915->engine[_VCS(instance)];
+			instance ^= 1;
+			vcs_instances--;
+		} while ((caps & engine->caps) != caps && vcs_instances > 0);
+
+		if ((caps & engine->caps) != caps)
+			return -EINVAL;
+
+		timeline = i915_gem_context_lookup_timeline_class(eb->ctx,
+								  VIDEO_DECODE_CLASS);
+		spin_lock_irq(&timeline->lock);
+		prev_req = list_first_entry_or_null(&timeline->requests,
+						    struct drm_i915_gem_request,
+						    ctx_link);
+		spin_unlock_irq(&timeline->lock);
+	} else {
+		engine = intel_engine_lookup_user(i915, class, instance);
+
+		if (engine && ((caps & engine->caps) != caps))
+			return -EINVAL;
+	}
 
 	eb->engine = engine;
+	eb->prev_request = prev_req;
 
 	return 0;
 }
@@ -2100,6 +2130,7 @@ static int eb_select_engine(struct i915_execbuffer *eb, struct drm_file *file)
 	}
 
 	eb->engine = engine;
+	eb->prev_request = NULL;
 
 	return 0;
 }
@@ -2427,6 +2458,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		goto err_batch_unpin;
 	}
 
+	if (eb.prev_request) {
+		err = i915_gem_request_await_dma_fence(eb.request,
+						       &eb.prev_request->fence);
+		if (err)
+			goto err_request;
+	}
+
 	if (in_fence) {
 		err = i915_gem_request_await_dma_fence(eb.request, in_fence);
 		if (err < 0)
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [RFC 10/10] drm/i915: Naive engine busyness based load balancing
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
                   ` (8 preceding siblings ...)
  2018-01-25 13:33 ` [RFC 09/10] drm/i915: Trivial virtual engine implementation Tvrtko Ursulin
@ 2018-01-25 13:33 ` Tvrtko Ursulin
  2018-01-25 14:12 ` ✗ Fi.CI.BAT: failure for Virtual queue/engine uAPI prototype Patchwork
  10 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 13:33 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

At execbuf time, engine busyness since the last submission is used as the
basis for determining where to submit. If both engines are equally busy, the
request is submitted to the same engine as the previous one.

Virtual engine contexts enable engine busy stats on first submission and
disable them at context destruction.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
This is not a proposal on how to actually implement this. Since it
performs very poorly compared to similar userspace balancing strategies,
it should be considered just a prototype to illustrate the idea and the
issues with frontend balancing.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
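As a reading aid (not part of the patch), the balancing decision can be
modelled with plain integers. This is a minimal sketch under assumptions: the
sketch_* names are hypothetical and the busy times are passed in by the caller
instead of being sampled from the engines. It computes how much busier each
engine became since the context last submitted, picks the smaller delta, and
keeps the previous instance on a tie.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-context state, modelled on prev_busy[] and prev_instance. */
struct sketch_balance_state {
	uint64_t prev_busy[2];
	uint8_t prev_instance;
};

/*
 * busy_now[] holds the accumulated busy time of VCS0 and VCS1 sampled at the
 * same instant. Returns the instance that gained less busy time since the
 * previous call and records the samples for the next decision.
 */
static uint8_t sketch_best_vcs(struct sketch_balance_state *st,
			       const uint64_t busy_now[2])
{
	uint64_t delta[2];
	uint8_t instance;
	int i;

	assert(st && busy_now);

	for (i = 0; i < 2; i++) {
		delta[i] = busy_now[i] - st->prev_busy[i];
		st->prev_busy[i] = busy_now[i];
	}

	if (delta[0] < delta[1])
		instance = 0;
	else if (delta[1] < delta[0])
		instance = 1;
	else
		instance = st->prev_instance;	/* tie: stay where we were */

	st->prev_instance = instance;

	return instance;
}
```

Because the deltas are measured per context, a context that submits rarely
compares busyness over a long window while a busy context compares over a
short one.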
 drivers/gpu/drm/i915/i915_gem_context.c    |  3 ++
 drivers/gpu/drm/i915/i915_gem_context.h    | 19 +++++++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 62 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/intel_engine_cs.c     | 35 +++++++++++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.h    |  2 +
 5 files changed, 119 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 8f5f23b0dd34..a324e24f7a07 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -129,6 +129,9 @@ static void i915_gem_context_free(struct i915_gem_context *ctx)
 	for (i = 0; i < I915_NUM_ENGINES; i++) {
 		struct intel_context *ce = &ctx->engine[i];
 
+		if (ctx->stats_enabled[i])
+			intel_disable_engine_stats(ctx->i915->engine[i]);
+
 		if (!ce->state)
 			continue;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 5afe050718b8..0113c161f245 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -163,6 +163,25 @@ struct i915_gem_context {
 		int pin_count;
 	} engine[I915_NUM_ENGINES];
 
+	/**
+	 * @stats_enabled: Has this context enabled per-engine stats.
+	 *
+	 * Boolean tracked per-engine.
+	 */
+	bool stats_enabled[I915_NUM_ENGINES];
+
+	/**
+	 * @prev_busy: Previous engine busyness.
+	 *
+	 * For VCS engines.
+	 */
+	u64 prev_busy[2];
+
+	/**
+	 * @prev_instance: Previously submitted to VCS instance.
+	 */
+	u8 prev_instance;
+
 	/** ring_size: size for allocating the per-engine ring buffer */
 	u32 ring_size;
 	/** desc_template: invariant fields for the HW context descriptor */
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f89a7be68133..53afc4dc976e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2018,6 +2018,53 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
 	return file_priv->bsd_engine;
 }
 
+static void
+ctx_enable_stats(struct i915_gem_context *ctx, enum intel_engine_id id)
+{
+	int ret;
+
+	ret = intel_enable_engine_stats(ctx->i915->engine[id]);
+
+	ctx->stats_enabled[id] = ret == 0;
+}
+
+static u8 eb_rr_instance(struct drm_i915_private *i915)
+{
+	return atomic_fetch_xor(1, &i915->mm.bsd_engine_dispatch_index);
+}
+
+static u8 ctx_best_vcs_instance(struct i915_gem_context *ctx)
+{
+	ktime_t now = ktime_get();
+	u64 busy[2], prev_busy[2];
+	u8 instance;
+
+	busy[0] = intel_engine_get_busy_time_now(ctx->i915->engine[_VCS(0)],
+						 now);
+	busy[1] = intel_engine_get_busy_time_now(ctx->i915->engine[_VCS(1)],
+						 now);
+
+	prev_busy[0] = ctx->prev_busy[0];
+	prev_busy[1] = ctx->prev_busy[1];
+
+	ctx->prev_busy[0] = busy[0];
+	ctx->prev_busy[1] = busy[1];
+
+	busy[0] -= prev_busy[0];
+	busy[1] -= prev_busy[1];
+
+	if (busy[0] < busy[1])
+		instance = 0;
+	else if (busy[1] < busy[0])
+		instance = 1;
+	else
+		instance = ctx->prev_instance;
+
+	ctx->prev_instance = instance;
+
+	return instance;
+}
+
 static int eb_select_engine_class_instance(struct i915_execbuffer *eb)
 {
 	struct drm_i915_private *i915 = eb->i915;
@@ -2037,8 +2084,19 @@ static int eb_select_engine_class_instance(struct i915_execbuffer *eb)
 		unsigned int vcs_instances = 2;
 		struct intel_timeline *timeline;
 
-		instance = atomic_fetch_xor(1,
-					    &i915->mm.bsd_engine_dispatch_index);
+		if (intel_engine_supports_stats(i915->engine[VCS])) {
+			if (!eb->ctx->stats_enabled[_VCS(0)])
+				ctx_enable_stats(eb->ctx, _VCS(0));
+
+			if (!eb->ctx->stats_enabled[_VCS(1)])
+				ctx_enable_stats(eb->ctx, _VCS(1));
+		}
+
+		if (eb->ctx->stats_enabled[_VCS(0)] &&
+		    eb->ctx->stats_enabled[_VCS(1)])
+			instance = ctx_best_vcs_instance(eb->ctx);
+		else
+			instance = eb_rr_instance(i915);
 
 		do {
 			engine = i915->engine[_VCS(instance)];
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index e02627618bc5..d186a218809f 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -2008,6 +2008,22 @@ static ktime_t __intel_engine_get_busy_time(struct intel_engine_cs *engine)
 	return total;
 }
 
+static ktime_t
+___intel_engine_get_busy_time(struct intel_engine_cs *engine, ktime_t now)
+{
+	ktime_t total = engine->stats.total;
+
+	/*
+	 * If the engine is executing something at the moment
+	 * add it to the total.
+	 */
+	if (engine->stats.active)
+		total = ktime_add(total,
+				  ktime_sub(now, engine->stats.start));
+
+	return total;
+}
+
 /**
  * intel_engine_get_busy_time() - Return current accumulated engine busyness
  * @engine: engine to report on
@@ -2026,6 +2042,25 @@ ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine)
 	return total;
 }
 
+/**
+ * intel_engine_get_busy_time_now() - Return accumulated engine busyness at @now
+ * @engine: engine to report on
+ *
+ * Returns accumulated time @engine was busy since engine stats were enabled.
+ */
+ktime_t
+intel_engine_get_busy_time_now(struct intel_engine_cs *engine, ktime_t now)
+{
+	ktime_t total;
+	unsigned long flags;
+
+	spin_lock_irqsave(&engine->stats.lock, flags);
+	total = ___intel_engine_get_busy_time(engine, now);
+	spin_unlock_irqrestore(&engine->stats.lock, flags);
+
+	return total;
+}
+
 /**
  * intel_disable_engine_stats() - Disable engine busy tracking on engine
  * @engine: engine to disable stats collection
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 76f7fdc926ae..6f418b37ed58 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -1102,5 +1102,7 @@ int intel_enable_engine_stats(struct intel_engine_cs *engine);
 void intel_disable_engine_stats(struct intel_engine_cs *engine);
 
 ktime_t intel_engine_get_busy_time(struct intel_engine_cs *engine);
+ktime_t intel_engine_get_busy_time_now(struct intel_engine_cs *engine,
+				       ktime_t now);
 
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [RFC 09/10] drm/i915: Trivial virtual engine implementation
  2018-01-25 13:33 ` [RFC 09/10] drm/i915: Trivial virtual engine implementation Tvrtko Ursulin
@ 2018-01-25 13:57   ` Chris Wilson
  2018-01-25 14:26     ` Tvrtko Ursulin
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2018-01-25 13:57 UTC (permalink / raw)
  To: Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-01-25 13:33:32)
> -       if (engine && ((caps & engine->caps) != caps))
> -               return -EINVAL;
> +               do {
> +                       engine = i915->engine[_VCS(instance)];
> +                       instance ^= 1;
> +                       vcs_instances--;
> +               } while ((caps & engine->caps) != caps && vcs_instances > 0);
> +
> +               if ((caps & engine->caps) != caps)
> +                       return -EINVAL;
> +
> +               timeline = i915_gem_context_lookup_timeline_class(eb->ctx,
> +                                                                 VIDEO_DECODE_CLASS);
> +               spin_lock_irq(&timeline->lock);
> +               prev_req = list_first_entry_or_null(&timeline->requests,
> +                                                   struct drm_i915_gem_request,
> +                                                   ctx_link);
> +               spin_unlock_irq(&timeline->lock);

This isn't doing anything yet as we aren't using the timeline. The idea
is sound though, we need to rejig timelines to make them more flexible
so that we can combine them to use one per-queue. Ok.
-Chris

^ permalink raw reply	[flat|nested] 16+ messages in thread

* ✗ Fi.CI.BAT: failure for Virtual queue/engine uAPI prototype
  2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
                   ` (9 preceding siblings ...)
  2018-01-25 13:33 ` [RFC 10/10] drm/i915: Naive engine busyness based load balancing Tvrtko Ursulin
@ 2018-01-25 14:12 ` Patchwork
  10 siblings, 0 replies; 16+ messages in thread
From: Patchwork @ 2018-01-25 14:12 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Series Details ==

Series: Virtual queue/engine uAPI prototype
URL   : https://patchwork.freedesktop.org/series/37103/
State : failure

== Summary ==

Series 37103v1 Virtual queue/engine uAPI prototype
https://patchwork.freedesktop.org/api/1.0/series/37103/revisions/1/mbox/

Test core_auth:
        Subgroup basic-auth:
                pass       -> INCOMPLETE (fi-gdg-551)
                pass       -> INCOMPLETE (fi-blb-e6850)
                pass       -> INCOMPLETE (fi-pnv-d510)
                pass       -> INCOMPLETE (fi-bwr-2160)
                pass       -> INCOMPLETE (fi-elk-e7500)
                pass       -> INCOMPLETE (fi-ilk-650)
                pass       -> INCOMPLETE (fi-snb-2520m)
                pass       -> INCOMPLETE (fi-snb-2600)
                pass       -> INCOMPLETE (fi-ivb-3520m)
                pass       -> INCOMPLETE (fi-ivb-3770)
                pass       -> INCOMPLETE (fi-byt-j1900)
                pass       -> INCOMPLETE (fi-byt-n2820)
                pass       -> INCOMPLETE (fi-hsw-4770)
                pass       -> INCOMPLETE (fi-hsw-4770r)
Test gem_ringfill:
        Subgroup basic-default:
                pass       -> SKIP       (fi-bsw-n3050)

fi-bdw-5557u     total:288  pass:267  dwarn:0   dfail:0   fail:0   skip:21  time:420s
fi-bdw-gvtdvm    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:425s
fi-blb-e6850     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-bsw-n3050     total:288  pass:241  dwarn:0   dfail:0   fail:0   skip:47  time:487s
fi-bwr-2160      total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-bxt-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:487s
fi-bxt-j4205     total:288  pass:259  dwarn:0   dfail:0   fail:0   skip:29  time:485s
fi-byt-j1900     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-byt-n2820     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-elk-e7500     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-gdg-551       total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-glk-1         total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:516s
fi-hsw-4770      total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-hsw-4770r     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-ilk-650       total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-ivb-3520m     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-ivb-3770      total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-kbl-7500u     total:288  pass:263  dwarn:1   dfail:0   fail:0   skip:24  time:465s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:498s
fi-kbl-7567u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:451s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:503s
fi-pnv-d510      total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-skl-6260u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:426s
fi-skl-6600u     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:513s
fi-skl-6700hq    total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:528s
fi-skl-6700k2    total:288  pass:264  dwarn:0   dfail:0   fail:0   skip:24  time:487s
fi-skl-6770hq    total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:480s
fi-skl-guc       total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:413s
fi-skl-gvtdvm    total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:432s
fi-snb-2520m     total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
fi-snb-2600      total:1    pass:0    dwarn:0   dfail:0   fail:0   skip:0  
Blacklisted hosts:
fi-cfl-s2        total:288  pass:262  dwarn:0   dfail:0   fail:0   skip:26  time:574s
fi-glk-dsi       total:288  pass:258  dwarn:0   dfail:0   fail:0   skip:30  time:471s

9d8467fe5626095314bc34449457798dae066fbb drm-tip: 2018y-01m-24d-19h-59m-41s UTC integration manifest
93f04aacc45b drm/i915: Naive engine busyness based load balancing
7751cdc435c3 drm/i915: Trivial virtual engine implementation
de4094e3fb1d drm/i915: Allow creating virtual contexts
17f06c7b6709 drm/i915: Track latest request per engine class
66e92c932269 drm/i915: Refactor eb_select_engine to take eb
a3575d88bee6 drm/i915: Re-arrange execbuf so context is known before engine
b188df04c5a8 drm/i915: Engine capabilities uAPI
e28aaaf2df0d drm/i915: Select engines via class and instance in execbuffer2
b0efc2fdba0e drm/i915: Extend CREATE_CONTEXT to allow inheritance ala clone()
f492ee1d9acb move-timeline-to-ctx

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_7777/issues.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 09/10] drm/i915: Trivial virtual engine implementation
  2018-01-25 13:57   ` Chris Wilson
@ 2018-01-25 14:26     ` Tvrtko Ursulin
  2018-01-25 14:32       ` Chris Wilson
  0 siblings, 1 reply; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 14:26 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 25/01/2018 13:57, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-25 13:33:32)
>> -       if (engine && ((caps & engine->caps) != caps))
>> -               return -EINVAL;
>> +               do {
>> +                       engine = i915->engine[_VCS(instance)];
>> +                       instance ^= 1;
>> +                       vcs_instances--;
>> +               } while ((caps & engine->caps) != caps && vcs_instances > 0);
>> +
>> +               if ((caps & engine->caps) != caps)
>> +                       return -EINVAL;
>> +
>> +               timeline = i915_gem_context_lookup_timeline_class(eb->ctx,
>> +                                                                 VIDEO_DECODE_CLASS);
>> +               spin_lock_irq(&timeline->lock);
>> +               prev_req = list_first_entry_or_null(&timeline->requests,
>> +                                                   struct drm_i915_gem_request,
>> +                                                   ctx_link);
>> +               spin_unlock_irq(&timeline->lock);
> 
> This isn't doing anything yet as we aren't using the timeline. The idea
> is sound though, we need to rejig timelines to make them more flexible
> so that we can combine them to use one per-queue. Ok.

I think it works - as far as I looked at the trace.pl HTML output it 
seems to.

The purpose is to simulate single stream of execution, so this is a new 
timeline I added which is per ctx and per engine class. Submissions set 
up an await on a previous request on the virtual engine, so when the 
balancer decides to move between instances it ensures they do not run in 
parallel.

In a real implementation this either wouldn't be needed or would live at 
some other, more natural, level.

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 09/10] drm/i915: Trivial virtual engine implementation
  2018-01-25 14:26     ` Tvrtko Ursulin
@ 2018-01-25 14:32       ` Chris Wilson
  2018-01-25 14:36         ` Tvrtko Ursulin
  0 siblings, 1 reply; 16+ messages in thread
From: Chris Wilson @ 2018-01-25 14:32 UTC (permalink / raw)
  To: Tvrtko Ursulin, Tvrtko Ursulin, Intel-gfx

Quoting Tvrtko Ursulin (2018-01-25 14:26:53)
> 
> On 25/01/2018 13:57, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2018-01-25 13:33:32)
> >> -       if (engine && ((caps & engine->caps) != caps))
> >> -               return -EINVAL;
> >> +               do {
> >> +                       engine = i915->engine[_VCS(instance)];
> >> +                       instance ^= 1;
> >> +                       vcs_instances--;
> >> +               } while ((caps & engine->caps) != caps && vcs_instances > 0);
> >> +
> >> +               if ((caps & engine->caps) != caps)
> >> +                       return -EINVAL;
> >> +
> >> +               timeline = i915_gem_context_lookup_timeline_class(eb->ctx,
> >> +                                                                 VIDEO_DECODE_CLASS);
> >> +               spin_lock_irq(&timeline->lock);
> >> +               prev_req = list_first_entry_or_null(&timeline->requests,
> >> +                                                   struct drm_i915_gem_request,
> >> +                                                   ctx_link);
> >> +               spin_unlock_irq(&timeline->lock);
> > 
> > This isn't doing anything yet as we aren't using the timeline. The idea
> > is sound though, we need to rejig timelines to make them more flexible
> > so that we can combine them to use one per-queue. Ok.
> 
> I think it works - as far as I looked at the trace.pl HTML output it 
> seems to.

You are syncing against the oldest, not the previous (which would be
timeline->last_request).
-Chris

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [RFC 09/10] drm/i915: Trivial virtual engine implementation
  2018-01-25 14:32       ` Chris Wilson
@ 2018-01-25 14:36         ` Tvrtko Ursulin
  0 siblings, 0 replies; 16+ messages in thread
From: Tvrtko Ursulin @ 2018-01-25 14:36 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, Intel-gfx


On 25/01/2018 14:32, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-01-25 14:26:53)
>>
>> On 25/01/2018 13:57, Chris Wilson wrote:
>>> Quoting Tvrtko Ursulin (2018-01-25 13:33:32)
>>>> -       if (engine && ((caps & engine->caps) != caps))
>>>> -               return -EINVAL;
>>>> +               do {
>>>> +                       engine = i915->engine[_VCS(instance)];
>>>> +                       instance ^= 1;
>>>> +                       vcs_instances--;
>>>> +               } while ((caps & engine->caps) != caps && vcs_instances > 0);
>>>> +
>>>> +               if ((caps & engine->caps) != caps)
>>>> +                       return -EINVAL;
>>>> +
>>>> +               timeline = i915_gem_context_lookup_timeline_class(eb->ctx,
>>>> +                                                                 VIDEO_DECODE_CLASS);
>>>> +               spin_lock_irq(&timeline->lock);
>>>> +               prev_req = list_first_entry_or_null(&timeline->requests,
>>>> +                                                   struct drm_i915_gem_request,
>>>> +                                                   ctx_link);
>>>> +               spin_unlock_irq(&timeline->lock);
>>>
>>> This isn't doing anything yet as we aren't using the timeline. The idea
>>> is sound though, we need to rejig timelines to make them more flexible
>>> so that we can combine them to use one per-queue. Ok.
>>
>> I think it works - as far as I looked at the trace.pl HTML output it
>> seems to.
> 
> You are syncing against the oldest, not the previous (which would be
> timeline->last_request).

Hey, my timeline my rules - see patch 7! :)

Regards,

Tvrtko
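
The disagreement above comes down to which end of the list new requests are added to. The sketch below is a simplified userspace stand-in, not the kernel's list.h, and every name in it (`struct node`, `struct request`, `add_head()`, `add_tail()`, `first_or_null()`, `last_or_null()`) is hypothetical. Patch 7 is not quoted in this part of the thread, so which insertion order the series actually uses is left open here.

```c
#include <stddef.h>

/* Minimal circular doubly-linked list, illustrative only. */
struct node {
	struct node *prev, *next;
};

/* Hypothetical stand-in for a request tracked on a timeline. */
struct request {
	int seqno;
	struct node link;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void insert_between(struct node *n, struct node *prev,
			   struct node *next)
{
	n->prev = prev;
	n->next = next;
	prev->next = n;
	next->prev = n;
}

/* New entry becomes the first element. */
static void add_head(struct node *n, struct node *head)
{
	insert_between(n, head, head->next);
}

/* New entry becomes the last element. */
static void add_tail(struct node *n, struct node *head)
{
	insert_between(n, head->prev, head);
}

/* Recover the containing request from its embedded link. */
#define to_request(ptr) \
	((struct request *)((char *)(ptr) - offsetof(struct request, link)))

static struct request *first_or_null(struct node *head)
{
	return head->next == head ? NULL : to_request(head->next);
}

static struct request *last_or_null(struct node *head)
{
	return head->prev == head ? NULL : to_request(head->prev);
}
```

If requests are appended with `add_tail()`, `first_or_null()` returns the oldest request and `last_or_null()` the most recent one, which is the case Chris describes. If they are pushed with `add_head()`, the first entry is the most recent one instead.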


Thread overview: 16+ messages
2018-01-25 13:33 [RFC 00/10] Virtual queue/engine uAPI prototype Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 01/10] move-timeline-to-ctx Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 02/10] drm/i915: Extend CREATE_CONTEXT to allow inheritance ala clone() Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 03/10] drm/i915: Select engines via class and instance in execbuffer2 Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 04/10] drm/i915: Engine capabilities uAPI Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 05/10] drm/i915: Re-arrange execbuf so context is known before engine Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 06/10] drm/i915: Refactor eb_select_engine to take eb Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 07/10] drm/i915: Track latest request per engine class Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 08/10] drm/i915: Allow creating virtual contexts Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 09/10] drm/i915: Trivial virtual engine implementation Tvrtko Ursulin
2018-01-25 13:57   ` Chris Wilson
2018-01-25 14:26     ` Tvrtko Ursulin
2018-01-25 14:32       ` Chris Wilson
2018-01-25 14:36         ` Tvrtko Ursulin
2018-01-25 13:33 ` [RFC 10/10] drm/i915: Naive engine busyness based load balancing Tvrtko Ursulin
2018-01-25 14:12 ` ✗ Fi.CI.BAT: failure for Virtual queue/engine uAPI prototype Patchwork