All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/7] drm/i915: Vulkan performance query support
@ 2019-06-04 13:11 Lionel Landwerlin
  2019-06-04 13:11 ` [PATCH v3 1/7] drm/i915/perf: introduce a versioning of the i915-perf uapi Lionel Landwerlin
                   ` (10 more replies)
  0 siblings, 11 replies; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:11 UTC (permalink / raw)
  To: intel-gfx

Hi all,

This is a pretty big update following v2.

The first big change is to drop the HW arbitration usage in favor of a
software mechanism using a special priority in the scheduler.

The second is a rework of the uAPI. Since we have a couple of execbuf
uAPI changes for this series (VK_INTEL_performance_query) & timeline
semaphores, we can come up with a more generic mechanism that could be
extended in the future. Rather than reusing partially deprecated
fields.

Unfortunately (or not!) that requires timeline semaphores to be added
to the series because we need to use cliprects_ptr that is already
used by FENCE_ARRAY.

Cheers,

Lionel Landwerlin (7):
  drm/i915/perf: introduce a versioning of the i915-perf uapi
  drm/i915/perf: allow for CS OA configs to be created lazily
  drm/i915: introduce a mechanism to extend execbuf2
  drm/i915: add syncobj timeline support
  drm/i915: add a new perf configuration execbuf parameter
  drm/i915/perf: allow holding preemption on filtered ctx
  drm/i915: add support for perf configuration queries

 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 437 +++++++++++++++---
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   1 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   7 +
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |   4 +-
 drivers/gpu/drm/i915/i915_drv.c               |  11 +-
 drivers/gpu/drm/i915/i915_drv.h               |  37 +-
 drivers/gpu/drm/i915/i915_perf.c              | 197 ++++++--
 drivers/gpu/drm/i915/i915_priolist_types.h    |   7 +
 drivers/gpu/drm/i915/i915_query.c             | 279 +++++++++++
 drivers/gpu/drm/i915/i915_request.c           |   1 +
 drivers/gpu/drm/i915/i915_request.h           |   1 +
 drivers/gpu/drm/i915/intel_guc_submission.c   |   6 +
 include/uapi/drm/i915_drm.h                   | 211 ++++++++-
 14 files changed, 1109 insertions(+), 97 deletions(-)

--
2.21.0.392.gf8f6787159e
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 1/7] drm/i915/perf: introduce a versioning of the i915-perf uapi
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
@ 2019-06-04 13:11 ` Lionel Landwerlin
  2019-06-04 13:11 ` [PATCH v3 2/7] drm/i915/perf: allow for CS OA configs to be created lazily Lionel Landwerlin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:11 UTC (permalink / raw)
  To: intel-gfx

Reporting this version will help application figure out what level of
the support the running kernel provides.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |  3 +++
 include/uapi/drm/i915_drm.h     | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 1af6751e1b36..9ed4d0016ee1 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -474,6 +474,9 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_MMAP_GTT_COHERENT:
 		value = INTEL_INFO(dev_priv)->has_coherent_ggtt;
 		break;
+	case I915_PARAM_PERF_REVISION:
+		value = 1;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 328d05e77d9f..e27a8eda9121 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -610,6 +610,13 @@ typedef struct drm_i915_irq_wait {
  * See I915_EXEC_FENCE_OUT and I915_EXEC_FENCE_SUBMIT.
  */
 #define I915_PARAM_HAS_EXEC_SUBMIT_FENCE 53
+
+/*
+ * Revision of the i915-perf uAPI. The value returned helps determine what
+ * i915-perf features are available. See drm_i915_perf_property_id.
+ */
+#define I915_PARAM_PERF_REVISION	54
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1843,23 +1850,31 @@ enum drm_i915_perf_property_id {
 	 * Open the stream for a specific context handle (as used with
 	 * execbuffer2). A stream opened for a specific context this way
 	 * won't typically require root privileges.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_CTX_HANDLE = 1,
 
 	/**
 	 * A value of 1 requests the inclusion of raw OA unit reports as
 	 * part of stream samples.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_SAMPLE_OA,
 
 	/**
 	 * The value specifies which set of OA unit metrics should be
 	 * be configured, defining the contents of any OA unit reports.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_METRICS_SET,
 
 	/**
 	 * The value specifies the size and layout of OA unit reports.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_FORMAT,
 
@@ -1869,6 +1884,8 @@ enum drm_i915_perf_property_id {
 	 * from this exponent as follows:
 	 *
 	 *   80ns * 2^(period_exponent + 1)
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_EXPONENT,
 
@@ -1900,6 +1917,8 @@ struct drm_i915_perf_open_param {
  * to close and re-open a stream with the same configuration.
  *
  * It's undefined whether any pending data for the stream will be lost.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_ENABLE	_IO('i', 0x0)
 
@@ -1907,6 +1926,8 @@ struct drm_i915_perf_open_param {
  * Disable data capture for a stream.
  *
  * It is an error to try and read a stream that is disabled.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_DISABLE	_IO('i', 0x1)
 
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 2/7] drm/i915/perf: allow for CS OA configs to be created lazily
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
  2019-06-04 13:11 ` [PATCH v3 1/7] drm/i915/perf: introduce a versioning of the i915-perf uapi Lionel Landwerlin
@ 2019-06-04 13:11 ` Lionel Landwerlin
  2019-06-04 13:29   ` Chris Wilson
  2019-06-04 13:11 ` [PATCH v3 3/7] drm/i915: introduce a mechanism to extend execbuf2 Lionel Landwerlin
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:11 UTC (permalink / raw)
  To: intel-gfx

Here we introduce a mechanism by which the execbuf part of the i915
driver will be able to request that a batch buffer containing the
programming for a particular OA config be created.

We'll execute these OA configuration buffers right before executing a
set of userspace commands so that a particular user batchbuffer be
executed with a given OA configuration.

This mechanism essentially allows the userspace driver to go through
several OA configuration without having to open/close the i915/perf
stream.

v2: No need for locking on object OA config object creation (Chris)
    Flush cpu mapping of OA config (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   1 +
 drivers/gpu/drm/i915/i915_drv.h              |  22 ++-
 drivers/gpu/drm/i915/i915_perf.c             | 171 +++++++++++++++----
 3 files changed, 162 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index eec31e36aca7..e7eff9db343e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -126,6 +126,7 @@
  */
 #define MI_LOAD_REGISTER_IMM(x)	MI_INSTR(0x22, 2*(x)-1)
 #define   MI_LRI_FORCE_POSTED		(1<<12)
+#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
 #define MI_STORE_REGISTER_MEM        MI_INSTR(0x24, 1)
 #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
 #define   MI_SRM_LRM_GLOBAL_GTT		(1<<22)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 76f2bf90ed86..f1e51307253a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1276,6 +1276,10 @@ struct i915_oa_config {
 	struct attribute *attrs[2];
 	struct device_attribute sysfs_metric_id;
 
+	struct drm_i915_gem_object *obj;
+
+	struct list_head vma_link;
+
 	atomic_t ref_count;
 };
 
@@ -1865,11 +1869,21 @@ struct drm_i915_private {
 		struct mutex metrics_lock;
 
 		/*
-		 * List of dynamic configurations, you need to hold
-		 * dev_priv->perf.metrics_lock to access it.
+		 * List of dynamic configurations (struct i915_oa_config), you
+		 * need to hold dev_priv->perf.metrics_lock to access it.
 		 */
 		struct idr metrics_idr;
 
+		/*
+		 * List of dynamic configurations (struct i915_oa_config)
+		 * which have an allocated buffer in GGTT for reconfiguration,
+		 * you need to hold dev_priv->perf.metrics_lock to access it.
+		 * Elements are added to the list lazilly on execbuf (when a
+		 * particular configuration is requested). The list is freed
+		 * upon closing the perf stream.
+		 */
+		struct list_head metrics_buffers;
+
 		/*
 		 * Lock associated with anything below within this structure
 		 * except exclusive_stream.
@@ -2815,6 +2829,10 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 			    struct intel_context *ce,
 			    u32 *reg_state);
+int i915_perf_get_oa_config(struct drm_i915_private *i915,
+			    int metrics_set,
+			    struct i915_oa_config **out_config,
+			    struct drm_i915_gem_object **out_obj);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 2e33a9b4eae7..e0071e44de3d 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -366,9 +366,16 @@ struct perf_open_properties {
 	int oa_period_exponent;
 };
 
-static void free_oa_config(struct drm_i915_private *dev_priv,
-			   struct i915_oa_config *oa_config)
+static void put_oa_config(struct i915_oa_config *oa_config)
 {
+	if (!atomic_dec_and_test(&oa_config->ref_count))
+		return;
+
+	if (oa_config->obj) {
+		list_del(&oa_config->vma_link);
+		i915_gem_object_put(oa_config->obj);
+	}
+
 	if (!PTR_ERR(oa_config->flex_regs))
 		kfree(oa_config->flex_regs);
 	if (!PTR_ERR(oa_config->b_counter_regs))
@@ -378,38 +385,126 @@ static void free_oa_config(struct drm_i915_private *dev_priv,
 	kfree(oa_config);
 }
 
-static void put_oa_config(struct drm_i915_private *dev_priv,
-			  struct i915_oa_config *oa_config)
+static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 n_regs)
 {
-	if (!atomic_dec_and_test(&oa_config->ref_count))
-		return;
+	u32 i;
 
-	free_oa_config(dev_priv, oa_config);
+	for (i = 0; i < n_regs; i++) {
+		if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
+			u32 n_lri = min(n_regs - i,
+					(u32) MI_LOAD_REGISTER_IMM_MAX_REGS);
+
+			*cs++ = MI_LOAD_REGISTER_IMM(n_lri);
+		}
+		*cs++ = i915_mmio_reg_offset(reg_data[i].addr);
+		*cs++ = reg_data[i].value;
+	}
+
+	return cs;
 }
 
-static int get_oa_config(struct drm_i915_private *dev_priv,
-			 int metrics_set,
-			 struct i915_oa_config **out_config)
+static int alloc_oa_config_buffer(struct drm_i915_private *i915,
+				  struct i915_oa_config *oa_config)
 {
-	int ret;
+	struct drm_i915_gem_object *bo;
+	size_t config_length = 0;
+	u32 *cs;
 
-	if (metrics_set == 1) {
-		*out_config = &dev_priv->perf.oa.test_config;
-		atomic_inc(&dev_priv->perf.oa.test_config.ref_count);
-		return 0;
+	if (oa_config->mux_regs_len > 0) {
+		config_length += DIV_ROUND_UP(oa_config->mux_regs_len,
+					      MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
+		config_length += oa_config->mux_regs_len * 8;
+	}
+	if (oa_config->b_counter_regs_len > 0) {
+		config_length += DIV_ROUND_UP(oa_config->b_counter_regs_len,
+					      MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
+		config_length += oa_config->b_counter_regs_len * 8;
+	}
+	if (oa_config->flex_regs_len > 0) {
+		config_length += DIV_ROUND_UP(oa_config->flex_regs_len,
+					      MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
+		config_length += oa_config->flex_regs_len * 8;
 	}
+	config_length += 4; /* MI_BATCH_BUFFER_END */
+	config_length = ALIGN(config_length, I915_GTT_PAGE_SIZE);
 
-	ret = mutex_lock_interruptible(&dev_priv->perf.metrics_lock);
+	bo = i915_gem_object_create_shmem(i915, config_length);
+	if (IS_ERR(bo))
+		return PTR_ERR(bo);
+
+	cs = i915_gem_object_pin_map(bo, I915_MAP_WB);
+	if (IS_ERR(cs)) {
+		i915_gem_object_put(bo);
+		return PTR_ERR(cs);
+	}
+
+	cs = write_cs_mi_lri(cs, oa_config->mux_regs, oa_config->mux_regs_len);
+	cs = write_cs_mi_lri(cs, oa_config->b_counter_regs, oa_config->b_counter_regs_len);
+	cs = write_cs_mi_lri(cs, oa_config->flex_regs, oa_config->flex_regs_len);
+
+	*cs++ = MI_BATCH_BUFFER_END;
+
+	i915_gem_object_flush_map(bo);
+	i915_gem_object_unpin_map(bo);
+
+	oa_config->obj = bo;
+
+	return 0;
+}
+
+int i915_perf_get_oa_config(struct drm_i915_private *i915,
+			    int metrics_set,
+			    struct i915_oa_config **out_config,
+			    struct drm_i915_gem_object **out_obj)
+{
+	int ret = 0;
+	struct i915_oa_config *oa_config;
+
+	if (!i915->perf.initialized)
+		return -ENODEV;
+
+	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
 	if (ret)
 		return ret;
 
-	*out_config = idr_find(&dev_priv->perf.metrics_idr, metrics_set);
-	if (!*out_config)
-		ret = -EINVAL;
-	else
-		atomic_inc(&(*out_config)->ref_count);
+	if (metrics_set == 1) {
+		oa_config = &i915->perf.oa.test_config;
+	} else {
+		oa_config = idr_find(&i915->perf.metrics_idr, metrics_set);
+		if (!oa_config) {
+			ret = -EINVAL;
+			goto unlock;
+		}
+	}
 
-	mutex_unlock(&dev_priv->perf.metrics_lock);
+	if (out_config) {
+		atomic_inc(&oa_config->ref_count);
+		*out_config = oa_config;
+	}
+
+	if (out_obj) {
+		if (oa_config->obj) {
+			*out_obj = i915_gem_object_get(oa_config->obj);
+		} else {
+			ret = alloc_oa_config_buffer(i915, oa_config);
+			if (ret)
+				goto err_buf_alloc;
+
+			list_add(&oa_config->vma_link,
+				 &i915->perf.metrics_buffers);
+			*out_obj = i915_gem_object_get(oa_config->obj);
+		}
+	}
+
+	goto unlock;
+
+err_buf_alloc:
+	if (out_config) {
+		put_oa_config(oa_config);
+		*out_config = NULL;
+	}
+unlock:
+	mutex_unlock(&i915->perf.metrics_lock);
 
 	return ret;
 }
@@ -1380,7 +1475,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	if (stream->ctx)
 		oa_put_render_ctx_id(stream);
 
-	put_oa_config(dev_priv, stream->oa_config);
+	put_oa_config(stream->oa_config);
 
 	if (dev_priv->perf.oa.spurious_report_rs.missed) {
 		DRM_NOTE("%d spurious OA report notices suppressed due to ratelimiting\n",
@@ -2094,7 +2189,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		}
 	}
 
-	ret = get_oa_config(dev_priv, props->metrics_set, &stream->oa_config);
+	ret = i915_perf_get_oa_config(dev_priv, props->metrics_set,
+				      &stream->oa_config, NULL);
 	if (ret) {
 		DRM_DEBUG("Invalid OA config id=%i\n", props->metrics_set);
 		goto err_config;
@@ -2132,6 +2228,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		goto err_enable;
 	}
 
+	DRM_DEBUG("opening stream oa config uuid=%s\n", stream->oa_config->uuid);
+
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	return 0;
@@ -2145,7 +2243,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	free_oa_buffer(dev_priv);
 
 err_oa_buf_alloc:
-	put_oa_config(dev_priv, stream->oa_config);
+	put_oa_config(stream->oa_config);
 
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
 	intel_runtime_pm_put(dev_priv, stream->wakeref);
@@ -2512,9 +2610,21 @@ static int i915_perf_release(struct inode *inode, struct file *file)
 {
 	struct i915_perf_stream *stream = file->private_data;
 	struct drm_i915_private *dev_priv = stream->dev_priv;
+	struct i915_oa_config *oa_config, *next;
 
 	mutex_lock(&dev_priv->perf.lock);
+
 	i915_perf_destroy_locked(stream);
+
+	/* Dispose of all oa config batch buffers. */
+	mutex_lock(&dev_priv->perf.metrics_lock);
+	list_for_each_entry_safe(oa_config, next, &dev_priv->perf.metrics_buffers, vma_link) {
+		list_del(&oa_config->vma_link);
+		i915_gem_object_put(oa_config->obj);
+		oa_config->obj = NULL;
+	}
+	mutex_unlock(&dev_priv->perf.metrics_lock);
+
 	mutex_unlock(&dev_priv->perf.lock);
 
 	return 0;
@@ -3296,7 +3406,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
 sysfs_err:
 	mutex_unlock(&dev_priv->perf.metrics_lock);
 reg_err:
-	put_oa_config(dev_priv, oa_config);
+	put_oa_config(oa_config);
 	DRM_DEBUG("Failed to add new OA config\n");
 	return err;
 }
@@ -3350,7 +3460,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 
 	DRM_DEBUG("Removed config %s id=%i\n", oa_config->uuid, oa_config->id);
 
-	put_oa_config(dev_priv, oa_config);
+	put_oa_config(oa_config);
 
 config_err:
 	mutex_unlock(&dev_priv->perf.metrics_lock);
@@ -3492,6 +3602,8 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		init_waitqueue_head(&dev_priv->perf.oa.poll_wq);
 
 		INIT_LIST_HEAD(&dev_priv->perf.streams);
+		INIT_LIST_HEAD(&dev_priv->perf.metrics_buffers);
+
 		mutex_init(&dev_priv->perf.lock);
 		spin_lock_init(&dev_priv->perf.oa.oa_buffer.ptr_lock);
 
@@ -3508,10 +3620,9 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 
 static int destroy_config(int id, void *p, void *data)
 {
-	struct drm_i915_private *dev_priv = data;
 	struct i915_oa_config *oa_config = p;
 
-	put_oa_config(dev_priv, oa_config);
+	put_oa_config(oa_config);
 
 	return 0;
 }
@@ -3525,7 +3636,7 @@ void i915_perf_fini(struct drm_i915_private *dev_priv)
 	if (!dev_priv->perf.initialized)
 		return;
 
-	idr_for_each(&dev_priv->perf.metrics_idr, destroy_config, dev_priv);
+	idr_for_each(&dev_priv->perf.metrics_idr, destroy_config, NULL);
 	idr_destroy(&dev_priv->perf.metrics_idr);
 
 	unregister_sysctl_table(dev_priv->perf.sysctl_header);
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 3/7] drm/i915: introduce a mechanism to extend execbuf2
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
  2019-06-04 13:11 ` [PATCH v3 1/7] drm/i915/perf: introduce a versioning of the i915-perf uapi Lionel Landwerlin
  2019-06-04 13:11 ` [PATCH v3 2/7] drm/i915/perf: allow for CS OA configs to be created lazily Lionel Landwerlin
@ 2019-06-04 13:11 ` Lionel Landwerlin
  2019-06-04 13:32   ` Chris Wilson
  2019-06-04 13:11 ` [PATCH v3 4/7] drm/i915: add syncobj timeline support Lionel Landwerlin
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:11 UTC (permalink / raw)
  To: intel-gfx

We're planning to use this for a couple of new feature where we need
to provide additional parameters to execbuf.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 49 ++++++++++++++++++-
 include/uapi/drm/i915_drm.h                   | 44 +++++++++++++++--
 2 files changed, 89 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index ed522fdfbe7f..480e20043d80 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -269,6 +269,10 @@ struct i915_execbuffer {
 	 */
 	int lut_size;
 	struct hlist_head *buckets; /** ht for relocation handles */
+
+	struct {
+		u64 flags; /** Available extensions parameters */
+	} extensions;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1958,7 +1962,7 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
 		return false;
 
 	/* Kernel clipping was a DRI1 misfeature */
-	if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+	if (!(exec->flags & (I915_EXEC_FENCE_ARRAY | I915_EXEC_EXT))) {
 		if (exec->num_cliprects || exec->cliprects_ptr)
 			return false;
 	}
@@ -2336,6 +2340,42 @@ signal_fence_array(struct i915_execbuffer *eb,
 	}
 }
 
+static int
+parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
+			  struct i915_execbuffer *eb)
+{
+	u64 iter_ptr = args->cliprects_ptr;
+
+	if (args->num_cliprects != 0)
+		return -EINVAL;
+
+	while (iter_ptr != 0) {
+		struct drm_i915_gem_base_execbuffer_ext iter;
+
+		if (copy_from_user(&iter,
+				   u64_to_user_ptr(iter_ptr), sizeof(iter)))
+			return -EFAULT;
+
+		if (iter.pad)
+			return -EINVAL;
+
+		/* Specifying the same extension multiple times is invalid. */
+		if (eb->extensions.flags & BIT(iter.type))
+			return -EINVAL;
+
+		switch (iter.type) {
+		default:
+			return -EINVAL;
+		}
+
+		eb->extensions.flags |= BIT(iter.type);
+
+		iter_ptr = iter.next_ptr;
+	}
+
+	return 0;
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
 		       struct drm_file *file,
@@ -2382,6 +2422,13 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (args->flags & I915_EXEC_IS_PINNED)
 		eb.batch_flags |= I915_DISPATCH_PINNED;
 
+	eb.extensions.flags = 0;
+	if (args->flags & I915_EXEC_EXT) {
+		err = parse_execbuf2_extensions(args, &eb);
+		if (err)
+			return err;
+	}
+
 	if (args->flags & I915_EXEC_FENCE_IN) {
 		in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
 		if (!in_fence)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index e27a8eda9121..2ef5ab0daae4 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1013,6 +1013,29 @@ struct drm_i915_gem_exec_fence {
 	__u32 flags;
 };
 
+enum drm_i915_gem_base_execbuffer_type {
+	DRM_I915_GEM_BASE_EXECBUFFER_TYPE_MAX /* non-ABI */
+};
+
+struct drm_i915_gem_base_execbuffer_ext {
+	/**
+	 * Type of the extension node as limited in enum
+	 * drm_i915_gem_base_execbuffer_type.
+	 */
+	__u32 type;
+
+	/**
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 pad;
+
+	/**
+	 * Pointer to another struct drm_i915_gem_base_execbuffer_ext or NULL
+	 * to terminate the chain.
+	 */
+	__u64 next_ptr;
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
@@ -1029,8 +1052,14 @@ struct drm_i915_gem_execbuffer2 {
 	__u32 num_cliprects;
 	/**
 	 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
-	 * is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
-	 * struct drm_i915_gem_exec_fence *fences.
+	 * & I915_EXEC_EXT are not set.
+	 *
+	 * If I915_EXEC_FENCE_ARRAY is set, then this is a pointer to an array
+	 * of struct drm_i915_gem_exec_fence and num_cliprects is the length
+	 * of the array.
+	 *
+	 * If I915_EXEC_EXT is set, then this is a pointer to a single struct
+	 * drm_i915_gem_base_execbuffer_ext and num_cliprects is 0.
 	 */
 	__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK              (0x3f)
@@ -1148,7 +1177,16 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_SUBMIT		(1 << 20)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_SUBMIT << 1))
+/*
+ * Setting I915_EXEC_EXT implies that drm_i915_gem_execbuffer2.cliprects_ptr
+ * is treated as a pointer to an linked list of
+ * drm_i915_gem_base_execbuffer_ext. Each drm_i915_gem_base_execbuffer_ext
+ * node is the base of a larger structure. The list of supported structures
+ * are listed in the drm_i915_gem_base_execbuffer_type enum.
+ */
+#define I915_EXEC_EXT		(1 << 21)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_EXT<<1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 4/7] drm/i915: add syncobj timeline support
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (2 preceding siblings ...)
  2019-06-04 13:11 ` [PATCH v3 3/7] drm/i915: introduce a mechanism to extend execbuf2 Lionel Landwerlin
@ 2019-06-04 13:11 ` Lionel Landwerlin
  2019-06-04 13:11 ` [PATCH v3 5/7] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:11 UTC (permalink / raw)
  To: intel-gfx

Introduces a new parameters to execbuf so that we can specify syncobj
handles as well as timeline points.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 270 ++++++++++++++----
 drivers/gpu/drm/i915/i915_drv.c               |   4 +-
 include/uapi/drm/i915_drm.h                   |  37 +++
 3 files changed, 255 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 480e20043d80..02dc5480e8fe 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -210,6 +210,13 @@ enum {
  * the batchbuffer in trusted mode, otherwise the ioctl is rejected.
  */
 
+struct i915_drm_dma_fences {
+	struct drm_syncobj *syncobj; /* Use with ptr_mask_bits() */
+	struct dma_fence *dma_fence;
+	u64 value;
+	struct dma_fence_chain *chain_fence;
+};
+
 struct i915_execbuffer {
 	struct drm_i915_private *i915; /** i915 backpointer */
 	struct drm_file *file; /** per-file lookup tables and limits */
@@ -272,6 +279,7 @@ struct i915_execbuffer {
 
 	struct {
 		u64 flags; /** Available extensions parameters */
+		struct drm_i915_gem_exec_timeline_fences timeline_fences;
 	} extensions;
 };
 
@@ -2212,67 +2220,203 @@ eb_select_engine(struct i915_execbuffer *eb,
 }
 
 static void
-__free_fence_array(struct drm_syncobj **fences, unsigned int n)
+__free_fence_array(struct i915_drm_dma_fences *fences, unsigned int n)
 {
-	while (n--)
-		drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+	while (n--) {
+		drm_syncobj_put(ptr_mask_bits(fences[n].syncobj, 2));
+		dma_fence_put(fences[n].dma_fence);
+		kfree(fences[n].chain_fence);
+	}
 	kvfree(fences);
 }
 
-static struct drm_syncobj **
+static struct i915_drm_dma_fences *
+get_timeline_fence_array(struct drm_i915_gem_execbuffer2 *args,
+			 struct drm_i915_gem_exec_timeline_fences *timeline_fences,
+			 struct drm_file *file,
+			 int *out_n_fences)
+{
+	struct drm_i915_gem_exec_fence __user *user_fences;
+	struct i915_drm_dma_fences *fences;
+	u64 __user *user_values;
+	unsigned long n;
+	int err;
+
+	*out_n_fences = timeline_fences->handle_count;
+
+	/* Incompatible with I915_EXEC_FENCE_ARRAY. */
+	if (args->flags & I915_EXEC_FENCE_ARRAY)
+		return ERR_PTR(-EINVAL);
+
+	/* Check multiplication overflow for access_ok() and kvmalloc_array() */
+	BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
+	if (*out_n_fences > min_t(unsigned long,
+				  ULONG_MAX / sizeof(*user_fences),
+				  SIZE_MAX / sizeof(*fences)))
+		return ERR_PTR(-EINVAL);
+
+	user_fences = u64_to_user_ptr(timeline_fences->handles_ptr);
+	if (!access_ok(user_fences, *out_n_fences * sizeof(*user_fences)))
+		return ERR_PTR(-EFAULT);
+
+	user_values = u64_to_user_ptr(timeline_fences->values_ptr);
+	if (!access_ok(user_values, *out_n_fences * sizeof(*user_values)))
+		return ERR_PTR(-EFAULT);
+
+	fences = kvmalloc_array(*out_n_fences, sizeof(*fences),
+				__GFP_NOWARN | GFP_KERNEL);
+	if (!fences)
+		return ERR_PTR(-ENOMEM);
+
+	BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
+		     ~__I915_EXEC_FENCE_UNKNOWN_FLAGS);
+
+	for (n = 0; n < *out_n_fences; n++) {
+		struct drm_i915_gem_exec_fence user_fence;
+		struct drm_syncobj *syncobj;
+		struct dma_fence *fence = NULL;
+		u64 point;
+
+		if (__copy_from_user(&user_fence, user_fences++, sizeof(user_fence))) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		if (user_fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) {
+			err = -EINVAL;
+			goto err;
+		}
+
+		if (__get_user(point, user_values++)) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		syncobj = drm_syncobj_find(file, user_fence.handle);
+		if (!syncobj) {
+			DRM_DEBUG("Invalid syncobj handle provided\n");
+			err = -EINVAL;
+			goto err;
+		}
+
+		if (user_fence.flags & I915_EXEC_FENCE_WAIT) {
+			fence = drm_syncobj_fence_get(syncobj);
+			if (!fence) {
+				DRM_DEBUG("Syncobj handle has no fence\n");
+				drm_syncobj_put(syncobj);
+				err = -EINVAL;
+				goto err;
+			}
+
+			err = dma_fence_chain_find_seqno(&fence, point);
+			if (err) {
+				DRM_DEBUG("Syncobj handle missing requested point\n");
+				goto err;
+			}
+		}
+
+		/*
+		 * For timeline syncobjs we need to create a chain.
+		 */
+		if (point != 0 && user_fence.flags & I915_EXEC_FENCE_SIGNAL) {
+			fences[n].chain_fence =
+				kmalloc(sizeof(*fences[n].chain_fence),
+					GFP_KERNEL);
+			if (!fences[n].chain_fence) {
+				dma_fence_put(fence);
+				drm_syncobj_put(syncobj);
+				err = -ENOMEM;
+				DRM_DEBUG("Unable to alloc chain_fence\n");
+				goto err;
+			}
+		} else {
+			fences[n].chain_fence = NULL;
+		}
+
+		fences[n].syncobj = ptr_pack_bits(syncobj, user_fence.flags, 2);
+		fences[n].dma_fence = fence;
+		fences[n].value = point;
+	}
+
+	return fences;
+
+err:
+	__free_fence_array(fences, n);
+	return ERR_PTR(err);
+}
+
+static struct i915_drm_dma_fences *
 get_fence_array(struct drm_i915_gem_execbuffer2 *args,
-		struct drm_file *file)
+		struct i915_execbuffer *eb,
+		struct drm_file *file,
+		int *out_n_fences)
 {
-	const unsigned long nfences = args->num_cliprects;
 	struct drm_i915_gem_exec_fence __user *user;
-	struct drm_syncobj **fences;
+	struct i915_drm_dma_fences *fences;
 	unsigned long n;
 	int err;
 
-	if (!(args->flags & I915_EXEC_FENCE_ARRAY))
-		return NULL;
+	*out_n_fences = args->num_cliprects;
+
+	/* Timeline fences are incompatible with the fence array. */
+	if (eb->extensions.flags & BIT(DRM_I915_GEM_BASE_EXECBUFFER_TYPE_TIMELINE_FENCES))
+		return ERR_PTR(-EINVAL);
 
 	/* Check multiplication overflow for access_ok() and kvmalloc_array() */
 	BUILD_BUG_ON(sizeof(size_t) > sizeof(unsigned long));
-	if (nfences > min_t(unsigned long,
-			    ULONG_MAX / sizeof(*user),
-			    SIZE_MAX / sizeof(*fences)))
+	if (*out_n_fences > min_t(unsigned long,
+				  ULONG_MAX / sizeof(*user),
+				  SIZE_MAX / sizeof(*fences)))
 		return ERR_PTR(-EINVAL);
 
 	user = u64_to_user_ptr(args->cliprects_ptr);
-	if (!access_ok(user, nfences * sizeof(*user)))
+	if (!access_ok(user, *out_n_fences * sizeof(*user)))
 		return ERR_PTR(-EFAULT);
 
-	fences = kvmalloc_array(nfences, sizeof(*fences),
+	fences = kvmalloc_array(*out_n_fences, sizeof(*fences),
 				__GFP_NOWARN | GFP_KERNEL);
 	if (!fences)
 		return ERR_PTR(-ENOMEM);
 
-	for (n = 0; n < nfences; n++) {
-		struct drm_i915_gem_exec_fence fence;
+	for (n = 0; n < *out_n_fences; n++) {
+		struct drm_i915_gem_exec_fence user_fence;
 		struct drm_syncobj *syncobj;
+		struct dma_fence *fence = NULL;
 
-		if (__copy_from_user(&fence, user++, sizeof(fence))) {
+		if (__copy_from_user(&user_fence, user++, sizeof(user_fence))) {
 			err = -EFAULT;
 			goto err;
 		}
 
-		if (fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) {
+		if (user_fence.flags & __I915_EXEC_FENCE_UNKNOWN_FLAGS) {
 			err = -EINVAL;
 			goto err;
 		}
 
-		syncobj = drm_syncobj_find(file, fence.handle);
+		syncobj = drm_syncobj_find(file, user_fence.handle);
 		if (!syncobj) {
 			DRM_DEBUG("Invalid syncobj handle provided\n");
 			err = -ENOENT;
 			goto err;
 		}
 
+		if (user_fence.flags & I915_EXEC_FENCE_WAIT) {
+			fence = drm_syncobj_fence_get(syncobj);
+			if (!fence) {
+				DRM_DEBUG("Syncobj handle has no fence\n");
+				drm_syncobj_put(syncobj);
+				err = -EINVAL;
+				goto err;
+			}
+		}
+
 		BUILD_BUG_ON(~(ARCH_KMALLOC_MINALIGN - 1) &
 			     ~__I915_EXEC_FENCE_UNKNOWN_FLAGS);
 
-		fences[n] = ptr_pack_bits(syncobj, fence.flags, 2);
+		fences[n].syncobj = ptr_pack_bits(syncobj, user_fence.flags, 2);
+		fences[n].dma_fence = fence;
+		fences[n].value = 0;
+		fences[n].chain_fence = NULL;
 	}
 
 	return fences;
@@ -2283,36 +2427,30 @@ get_fence_array(struct drm_i915_gem_execbuffer2 *args,
 }
 
 static void
-put_fence_array(struct drm_i915_gem_execbuffer2 *args,
-		struct drm_syncobj **fences)
+put_fence_array(struct i915_drm_dma_fences *fences, int nfences)
 {
 	if (fences)
-		__free_fence_array(fences, args->num_cliprects);
+		__free_fence_array(fences, nfences);
 }
 
 static int
 await_fence_array(struct i915_execbuffer *eb,
-		  struct drm_syncobj **fences)
+		  struct i915_drm_dma_fences *fences,
+		  int nfences)
 {
-	const unsigned int nfences = eb->args->num_cliprects;
 	unsigned int n;
 	int err;
 
 	for (n = 0; n < nfences; n++) {
 		struct drm_syncobj *syncobj;
-		struct dma_fence *fence;
 		unsigned int flags;
 
-		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		syncobj = ptr_unpack_bits(fences[n].syncobj, &flags, 2);
 		if (!(flags & I915_EXEC_FENCE_WAIT))
 			continue;
 
-		fence = drm_syncobj_fence_get(syncobj);
-		if (!fence)
-			return -EINVAL;
-
-		err = i915_request_await_dma_fence(eb->request, fence);
-		dma_fence_put(fence);
+		err = i915_request_await_dma_fence(eb->request,
+						   fences[n].dma_fence);
 		if (err < 0)
 			return err;
 	}
@@ -2322,9 +2460,9 @@ await_fence_array(struct i915_execbuffer *eb,
 
 static void
 signal_fence_array(struct i915_execbuffer *eb,
-		   struct drm_syncobj **fences)
+		   struct i915_drm_dma_fences *fences,
+		   int nfences)
 {
-	const unsigned int nfences = eb->args->num_cliprects;
 	struct dma_fence * const fence = &eb->request->fence;
 	unsigned int n;
 
@@ -2332,11 +2470,21 @@ signal_fence_array(struct i915_execbuffer *eb,
 		struct drm_syncobj *syncobj;
 		unsigned int flags;
 
-		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		syncobj = ptr_unpack_bits(fences[n].syncobj, &flags, 2);
 		if (!(flags & I915_EXEC_FENCE_SIGNAL))
 			continue;
 
-		drm_syncobj_replace_fence(syncobj, fence);
+		if (fences[n].chain_fence) {
+			drm_syncobj_add_point(syncobj, fences[n].chain_fence,
+					      fence, fences[n].value);
+			/*
+			 * The chain's ownership is transfered to the
+			 * timeline.
+			 */
+			fences[n].chain_fence = NULL;
+		} else {
+			drm_syncobj_replace_fence(syncobj, fence);
+		}
 	}
 }
 
@@ -2364,6 +2512,13 @@ parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
 			return -EINVAL;
 
 		switch (iter.type) {
+		case DRM_I915_GEM_BASE_EXECBUFFER_TYPE_TIMELINE_FENCES:
+			if (copy_from_user(&eb->extensions.timeline_fences,
+					   u64_to_user_ptr(iter_ptr),
+					   sizeof(eb->extensions.timeline_fences)))
+				return -EFAULT;
+			break;
+
 		default:
 			return -EINVAL;
 		}
@@ -2380,14 +2535,15 @@ static int
 i915_gem_do_execbuffer(struct drm_device *dev,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec,
-		       struct drm_syncobj **fences)
+		       struct drm_i915_gem_exec_object2 *exec)
 {
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
 	struct dma_fence *exec_fence = NULL;
 	struct sync_file *out_fence = NULL;
+	struct i915_drm_dma_fences *fences = NULL;
 	int out_fence_fd = -1;
+	int nfences = 0;
 	int err;
 
 	BUILD_BUG_ON(__EXEC_INTERNAL_FLAGS & ~__I915_EXEC_ILLEGAL_FLAGS);
@@ -2429,10 +2585,22 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			return err;
 	}
 
+	if (args->flags & I915_EXEC_FENCE_ARRAY) {
+		fences = get_fence_array(args, &eb, file, &nfences);
+	} else if (eb.extensions.flags & BIT(DRM_I915_GEM_BASE_EXECBUFFER_TYPE_TIMELINE_FENCES)) {
+		fences = get_timeline_fence_array(
+			args, &eb.extensions.timeline_fences, file, &nfences);
+	}
+
+	if (IS_ERR(fences))
+		return PTR_ERR(fences);
+
 	if (args->flags & I915_EXEC_FENCE_IN) {
 		in_fence = sync_file_get_fence(lower_32_bits(args->rsvd2));
-		if (!in_fence)
-			return -EINVAL;
+		if (!in_fence) {
+			err = -EINVAL;
+			goto err_fences;
+		}
 	}
 
 	if (args->flags & I915_EXEC_FENCE_SUBMIT) {
@@ -2590,7 +2758,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	}
 
 	if (fences) {
-		err = await_fence_array(&eb, fences);
+		err = await_fence_array(&eb, fences, nfences);
 		if (err)
 			goto err_request;
 	}
@@ -2619,7 +2787,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	i915_request_add(eb.request);
 
 	if (fences)
-		signal_fence_array(&eb, fences);
+		signal_fence_array(&eb, fences, nfences);
 
 	if (out_fence) {
 		if (err == 0) {
@@ -2654,6 +2822,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	dma_fence_put(exec_fence);
 err_in_fence:
 	dma_fence_put(in_fence);
+err_fences:
+	put_fence_array(fences, nfences);
 	return err;
 }
 
@@ -2747,7 +2917,7 @@ i915_gem_execbuffer_ioctl(struct drm_device *dev, void *data,
 			exec2_list[i].flags = 0;
 	}
 
-	err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list, NULL);
+	err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list);
 	if (exec2.flags & __EXEC_HAS_RELOC) {
 		struct drm_i915_gem_exec_object __user *user_exec_list =
 			u64_to_user_ptr(args->buffers_ptr);
@@ -2778,7 +2948,6 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list;
-	struct drm_syncobj **fences = NULL;
 	const size_t count = args->buffer_count;
 	int err;
 
@@ -2806,15 +2975,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	if (args->flags & I915_EXEC_FENCE_ARRAY) {
-		fences = get_fence_array(args, file);
-		if (IS_ERR(fences)) {
-			kvfree(exec2_list);
-			return PTR_ERR(fences);
-		}
-	}
-
-	err = i915_gem_do_execbuffer(dev, file, args, exec2_list, fences);
+	err = i915_gem_do_execbuffer(dev, file, args, exec2_list);
 
 	/*
 	 * Now that we have begun execution of the batchbuffer, we ignore
@@ -2854,7 +3015,6 @@ end:;
 	}
 
 	args->flags &= ~__I915_EXEC_UNKNOWN_FLAGS;
-	put_fence_array(args, fences);
 	kvfree(exec2_list);
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9ed4d0016ee1..a872791f98bd 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -448,6 +448,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_BATCH_FIRST:
 	case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
 	case I915_PARAM_HAS_EXEC_SUBMIT_FENCE:
+	case I915_PARAM_HAS_EXEC_TIMELINE_FENCES:
 		/* For the time being all of these are always true;
 		 * if some supported hardware does not have one of these
 		 * features this value needs to be provided from
@@ -3192,7 +3193,8 @@ static struct drm_driver driver = {
 	 */
 	.driver_features =
 	    DRIVER_GEM | DRIVER_PRIME |
-	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_SYNCOBJ,
+	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_SYNCOBJ |
+	    DRIVER_SYNCOBJ_TIMELINE,
 	.release = i915_driver_release,
 	.open = i915_driver_open,
 	.lastclose = i915_driver_lastclose,
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 2ef5ab0daae4..ba1b02859346 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -617,6 +617,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_PERF_REVISION	54
 
+/* Query whether DRM_I915_GEM_EXECBUFFER2 supports supplying an array of
+ * timeline syncobj through drm_i915_gem_exec_timeline_fences. See
+ * I915_EXEC_EXT.
+ */
+#define I915_PARAM_HAS_EXEC_TIMELINE_FENCES 55
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1014,6 +1020,12 @@ struct drm_i915_gem_exec_fence {
 };
 
 enum drm_i915_gem_base_execbuffer_type {
+	/**
+	 * This identifier is associated with
+	 * drm_i915_gem_exec_timeline_fences.
+	 */
+	DRM_I915_GEM_BASE_EXECBUFFER_TYPE_TIMELINE_FENCES,
+
 	DRM_I915_GEM_BASE_EXECBUFFER_TYPE_MAX /* non-ABI */
 };
 
@@ -1036,6 +1048,31 @@ struct drm_i915_gem_base_execbuffer_ext {
 	__u64 next_ptr;
 };
 
+/**
+ * This structure describes an array of drm_syncobj and associated points for
+ * timeline variants of drm_syncobj. It is invalid to append this structure to
+ * the execbuf if I915_EXEC_FENCE_ARRAY is set.
+ */
+struct drm_i915_gem_exec_timeline_fences {
+	struct drm_i915_gem_base_execbuffer_ext base;
+
+	/**
+	 * Number of element in the handles_ptr & value_ptr arrays.
+	 */
+	__u64 handle_count;
+
+	/**
+	 * Pointer to an array of struct drm_i915_gem_exec_fence of size handle_count.
+	 */
+	__u64 handles_ptr;
+
+	/**
+	 * Pointer to an array of u64 values of size handle_count. Values must
+	 * be 0 for a binary drm_syncobj.
+	 */
+	__u64 values_ptr;
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 5/7] drm/i915: add a new perf configuration execbuf parameter
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (3 preceding siblings ...)
  2019-06-04 13:11 ` [PATCH v3 4/7] drm/i915: add syncobj timeline support Lionel Landwerlin
@ 2019-06-04 13:11 ` Lionel Landwerlin
  2019-06-04 13:40   ` Chris Wilson
  2019-06-04 13:11 ` [PATCH v3 6/7] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:11 UTC (permalink / raw)
  To: intel-gfx

We want the ability to dispatch a set of command buffer to the
hardware, each with a different OA configuration. To achieve this, we
reuse a couple of fields from the execbuf2 struct (I CAN HAZ
execbuf3?) to notify what OA configuration should be used for a batch
buffer. This requires the process making the execbuf with this flag to
also own the perf fd at the time of execbuf.

v2: Add a emit_oa_config() vfunc in the intel_engine_cs (Chris)
    Move oa_config vma to active (Chris)

v3: Don't drop the lock for engine lookup (Chris)
    Move OA config vma to active before writing the ringbuffer (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 110 +++++++++++++++++-
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   1 +
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |   4 +-
 drivers/gpu/drm/i915/i915_drv.c               |   4 +
 drivers/gpu/drm/i915/i915_drv.h               |   1 +
 drivers/gpu/drm/i915/i915_perf.c              |  14 +--
 include/uapi/drm/i915_drm.h                   |  37 ++++++
 8 files changed, 169 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 02dc5480e8fe..4a785999a9c5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -280,7 +280,11 @@ struct i915_execbuffer {
 	struct {
 		u64 flags; /** Available extensions parameters */
 		struct drm_i915_gem_exec_timeline_fences timeline_fences;
+		struct drm_i915_gem_execbuffer_perf_ext perf_config;
 	} extensions;
+
+	struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is not needed. */
+	struct drm_i915_gem_object *oa_bo;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1198,6 +1202,34 @@ static int reloc_move_to_gpu(struct i915_request *rq, struct i915_vma *vma)
 	return err;
 }
 
+
+static int
+get_execbuf_oa_config(struct drm_i915_private *dev_priv,
+		      s32 perf_fd, u64 oa_config_id,
+		      struct i915_oa_config **out_oa_config,
+		      struct drm_i915_gem_object **out_oa_obj)
+{
+	struct file *perf_file;
+	int ret;
+
+	if (!dev_priv->perf.oa.exclusive_stream)
+		return -EINVAL;
+
+	perf_file = fget(perf_fd);
+	if (!perf_file)
+		return -EINVAL;
+
+	if (perf_file->private_data != dev_priv->perf.oa.exclusive_stream)
+		return -EINVAL;
+
+	fput(perf_file);
+
+	ret = i915_perf_get_oa_config(dev_priv, oa_config_id,
+				      out_oa_config, out_oa_obj);
+
+	return ret;
+}
+
 static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 			     struct i915_vma *vma,
 			     unsigned int len)
@@ -2060,6 +2092,51 @@ add_to_client(struct i915_request *rq, struct drm_file *file)
 	list_add_tail(&rq->client_link, &rq->file_priv->mm.request_list);
 }
 
+static int eb_oa_config(struct i915_execbuffer *eb)
+{
+	struct i915_vma *oa_vma;
+	int err;
+
+	if (!eb->oa_config)
+		return 0;
+
+	/*
+	 * If the config hasn't changed, skip reconfiguring the HW (this is
+	 * subject to a delay we want to avoid has much as possible).
+	 */
+	if (eb->oa_config == eb->i915->perf.oa.exclusive_stream->oa_config)
+		return 0;
+
+	oa_vma = i915_vma_instance(eb->oa_bo,
+				   &eb->engine->i915->ggtt.vm, NULL);
+	if (unlikely(IS_ERR(oa_vma)))
+		return PTR_ERR(oa_vma);
+
+	err = i915_vma_pin(oa_vma, 0, 0, PIN_GLOBAL);
+	if (err)
+		return err;
+
+	err = i915_vma_move_to_active(oa_vma, eb->request, 0);
+	if (err) {
+		i915_vma_unpin(oa_vma);
+		return err;
+	}
+
+	err = eb->engine->emit_bb_start(eb->request,
+					oa_vma->node.start,
+					0, I915_DISPATCH_SECURE);
+	if (err) {
+		i915_vma_unpin(oa_vma);
+		return err;
+	}
+
+	i915_vma_unpin(oa_vma);
+
+	swap(eb->oa_config, eb->i915->perf.oa.exclusive_stream->oa_config);
+
+	return 0;
+}
+
 static int eb_submit(struct i915_execbuffer *eb)
 {
 	int err;
@@ -2086,6 +2163,10 @@ static int eb_submit(struct i915_execbuffer *eb)
 			return err;
 	}
 
+	err = eb_oa_config(eb);
+	if (err)
+		return err;
+
 	err = eb->engine->emit_bb_start(eb->request,
 					eb->batch->node.start +
 					eb->batch_start_offset,
@@ -2519,6 +2600,13 @@ parse_execbuf2_extensions(struct drm_i915_gem_execbuffer2 *args,
 				return -EFAULT;
 			break;
 
+		case DRM_I915_GEM_BASE_EXECBUFFER_TYPE_PERF:
+			if (copy_from_user(&eb->extensions.perf_config,
+					   u64_to_user_ptr(iter_ptr),
+					   sizeof(eb->extensions.perf_config)))
+				return -EFAULT;
+			break;
+
 		default:
 			return -EINVAL;
 		}
@@ -2567,6 +2655,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.buffer_count = args->buffer_count;
 	eb.batch_start_offset = args->batch_start_offset;
 	eb.batch_len = args->batch_len;
+	eb.oa_config = NULL;
 
 	eb.batch_flags = 0;
 	if (args->flags & I915_EXEC_SECURE) {
@@ -2651,9 +2740,23 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (unlikely(err))
 		goto err_unlock;
 
+	if (eb.extensions.flags & BIT(DRM_I915_GEM_BASE_EXECBUFFER_TYPE_PERF)) {
+		if (!intel_engine_has_oa(eb.engine)) {
+			err = -ENODEV;
+			goto err_engine;
+		}
+
+		err = get_execbuf_oa_config(eb.i915,
+					    eb.extensions.perf_config.perf_fd,
+					    eb.extensions.perf_config.oa_config,
+					    &eb.oa_config, &eb.oa_bo);
+		if (err)
+			goto err_engine;
+	}
+
 	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
 	if (unlikely(err))
-		goto err_engine;
+		goto err_oa;
 
 	err = eb_relocate(&eb);
 	if (err) {
@@ -2806,6 +2909,11 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_vma:
 	if (eb.exec)
 		eb_release_vmas(&eb);
+err_oa:
+	if (eb.oa_config) {
+		i915_gem_object_put(eb.oa_bo);
+		i915_oa_config_put(eb.oa_config);
+	}
 err_engine:
 	eb_unpin_context(&eb);
 err_unlock:
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 01223864237a..97badb185eb2 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -457,6 +457,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
 #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
 #define I915_ENGINE_IS_VIRTUAL       BIT(5)
+#define I915_ENGINE_HAS_OA           BIT(6)
 	unsigned int flags;
 
 	/*
@@ -552,6 +553,12 @@ intel_engine_is_virtual(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_IS_VIRTUAL;
 }
 
+static inline bool
+intel_engine_has_oa(const struct intel_engine_cs *engine)
+{
+	return engine->flags & I915_ENGINE_HAS_OA;
+}
+
 #define instdone_slice_mask(dev_priv__) \
 	(IS_GEN(dev_priv__, 7) ? \
 	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index fed704802c57..ed19f4e53d31 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2732,6 +2732,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 		engine->init_context = gen8_init_rcs_context;
 		engine->emit_flush = gen8_emit_flush_render;
 		engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
+		engine->flags |= I915_ENGINE_HAS_OA;
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index ff58d658e3e2..972193cfbf41 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -2212,8 +2212,10 @@ static void setup_rcs(struct intel_engine_cs *engine)
 		engine->irq_enable_mask = I915_USER_INTERRUPT;
 	}
 
-	if (IS_HASWELL(i915))
+	if (IS_HASWELL(i915)) {
 		engine->emit_bb_start = hsw_emit_bb_start;
+		engine->flags |= I915_ENGINE_HAS_OA;
+	}
 
 	engine->resume = rcs_resume;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index a872791f98bd..75029d1a3802 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -478,6 +478,10 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_PERF_REVISION:
 		value = 1;
 		break;
+	case I915_PARAM_HAS_EXEC_PERF_CONFIG:
+		/* Obviously requires perf support. */
+		value = dev_priv->perf.initialized;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f1e51307253a..aefdae856e77 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2833,6 +2833,7 @@ int i915_perf_get_oa_config(struct drm_i915_private *i915,
 			    int metrics_set,
 			    struct i915_oa_config **out_config,
 			    struct drm_i915_gem_object **out_obj);
+void i915_oa_config_put(struct i915_oa_config *oa_config);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index e0071e44de3d..82a282f668c0 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -366,7 +366,7 @@ struct perf_open_properties {
 	int oa_period_exponent;
 };
 
-static void put_oa_config(struct i915_oa_config *oa_config)
+void i915_oa_config_put(struct i915_oa_config *oa_config)
 {
 	if (!atomic_dec_and_test(&oa_config->ref_count))
 		return;
@@ -500,7 +500,7 @@ int i915_perf_get_oa_config(struct drm_i915_private *i915,
 
 err_buf_alloc:
 	if (out_config) {
-		put_oa_config(oa_config);
+		i915_oa_config_put(oa_config);
 		*out_config = NULL;
 	}
 unlock:
@@ -1475,7 +1475,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	if (stream->ctx)
 		oa_put_render_ctx_id(stream);
 
-	put_oa_config(stream->oa_config);
+	i915_oa_config_put(stream->oa_config);
 
 	if (dev_priv->perf.oa.spurious_report_rs.missed) {
 		DRM_NOTE("%d spurious OA report notices suppressed due to ratelimiting\n",
@@ -2243,7 +2243,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	free_oa_buffer(dev_priv);
 
 err_oa_buf_alloc:
-	put_oa_config(stream->oa_config);
+	i915_oa_config_put(stream->oa_config);
 
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
 	intel_runtime_pm_put(dev_priv, stream->wakeref);
@@ -3406,7 +3406,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
 sysfs_err:
 	mutex_unlock(&dev_priv->perf.metrics_lock);
 reg_err:
-	put_oa_config(oa_config);
+	i915_oa_config_put(oa_config);
 	DRM_DEBUG("Failed to add new OA config\n");
 	return err;
 }
@@ -3460,7 +3460,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 
 	DRM_DEBUG("Removed config %s id=%i\n", oa_config->uuid, oa_config->id);
 
-	put_oa_config(oa_config);
+	i915_oa_config_put(oa_config);
 
 config_err:
 	mutex_unlock(&dev_priv->perf.metrics_lock);
@@ -3622,7 +3622,7 @@ static int destroy_config(int id, void *p, void *data)
 {
 	struct i915_oa_config *oa_config = p;
 
-	put_oa_config(oa_config);
+	i915_oa_config_put(oa_config);
 
 	return 0;
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index ba1b02859346..7f770183ee31 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -623,6 +623,16 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_TIMELINE_FENCES 55
 
+/*
+ * Request an OA performance configuration change before running the commands
+ * given in an execbuf.
+ *
+ * Performance configuration ID is given in the DR4 field of
+ * drm_i915_gem_execbuffer2 and the file descriptor of the i915 perf stream is
+ * given in DR1. Execbuffer will fail if any of these parameter is invalid.
+ */
+#define I915_PARAM_HAS_EXEC_PERF_CONFIG 56
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1026,6 +1036,12 @@ enum drm_i915_gem_base_execbuffer_type {
 	 */
 	DRM_I915_GEM_BASE_EXECBUFFER_TYPE_TIMELINE_FENCES,
 
+	/**
+	 * This identifier is associated with
+	 * drm_i915_gem_execbuffer_perf_ext.
+	 */
+	DRM_I915_GEM_BASE_EXECBUFFER_TYPE_PERF,
+
 	DRM_I915_GEM_BASE_EXECBUFFER_TYPE_MAX /* non-ABI */
 };
 
@@ -1073,6 +1089,27 @@ struct drm_i915_gem_exec_timeline_fences {
 	__u64 values_ptr;
 };
 
+struct drm_i915_gem_execbuffer_perf_ext {
+	struct drm_i915_gem_base_execbuffer_ext base;
+
+	/**
+	 * Performance file descriptor returned by DRM_IOCTL_I915_PERF_OPEN.
+	 * This is used to identify that the application
+	 */
+	__s32 perf_fd;
+
+	/**
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 pad;
+
+	/**
+	 * OA configuration ID to switch to before executing the commands
+	 * associated to the execbuf.
+	 */
+	__u64 oa_config;
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 6/7] drm/i915/perf: allow holding preemption on filtered ctx
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (4 preceding siblings ...)
  2019-06-04 13:11 ` [PATCH v3 5/7] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
@ 2019-06-04 13:11 ` Lionel Landwerlin
  2019-06-04 13:45   ` Chris Wilson
  2019-06-04 13:11 ` [PATCH v3 7/7] drm/i915: add support for perf configuration queries Lionel Landwerlin
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:11 UTC (permalink / raw)
  To: intel-gfx

We would like to make use of perf in Vulkan. The Vulkan API is much
lower level than OpenGL, with applications directly exposed to the
concept of command buffers (pretty much equivalent to our batch
buffers). In Vulkan, queries are always limited in scope to a command
buffer. In OpenGL, the lack of command buffer concept meant that
queries' duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify
measuring performance just by measuring the deltas between the counter
snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the
more complex scheme we currently have in the GL driver, using 2
MI_RECORD_PERF_COUNT commands and doing some post processing on the
stream of OA reports, coming from the global OA buffer, to remove any
unrelated deltas in between the 2 MI_RECORD_PERF_COUNT.

Disabling preemption only apply to a single context with which want to
query performance counters for and is considered a privileged
operation, by default protected by CAP_SYS_ADMIN. It is possible to
enable it for a normal user by disabling the paranoid stream setting.

v2: Store preemption setting in intel_context (Chris)

v3: Use priorities to avoid preemption rather than the HW mechanism

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    |  8 +++++++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  6 +++++
 drivers/gpu/drm/i915/i915_drv.c               |  2 +-
 drivers/gpu/drm/i915/i915_drv.h               |  8 +++++++
 drivers/gpu/drm/i915/i915_perf.c              | 22 +++++++++++++++++--
 drivers/gpu/drm/i915/i915_priolist_types.h    |  7 ++++++
 drivers/gpu/drm/i915/i915_request.c           |  1 +
 drivers/gpu/drm/i915/i915_request.h           |  1 +
 drivers/gpu/drm/i915/intel_guc_submission.c   |  6 +++++
 include/uapi/drm/i915_drm.h                   | 10 +++++++++
 10 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4a785999a9c5..59ec1c0a8879 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2100,6 +2100,14 @@ static int eb_oa_config(struct i915_execbuffer *eb)
 	if (!eb->oa_config)
 		return 0;
 
+	/*
+	 * If the perf stream was opened with hold preemption, flag the
+	 * request properly so that the priority of the request is bumped once
+	 * it reaches the execlist ports.
+	 */
+	if (eb->i915->perf.oa.exclusive_stream->hold_preemption)
+		eb->request->has_perf = true;
+
 	/*
 	 * If the config hasn't changed, skip reconfiguring the HW (this is
 	 * subject to a delay we want to avoid has much as possible).
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index ed19f4e53d31..4046f045408b 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -683,6 +683,12 @@ static void port_assign(struct execlist_port *port, struct i915_request *rq)
 	if (port_isset(port))
 		i915_request_put(port_request(port));
 
+	if (rq->has_perf) {
+		rq->sched.attr.priority =
+			(I915_PRIORITY_MASK & rq->sched.attr.priority) |
+			I915_USER_PRIORITY(I915_PRIORITY_PERF);
+	}
+
 	port_set(port, port_pack(i915_request_get(rq), port_count(port)));
 }
 
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 75029d1a3802..984b76a09cfa 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -476,7 +476,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = INTEL_INFO(dev_priv)->has_coherent_ggtt;
 		break;
 	case I915_PARAM_PERF_REVISION:
-		value = 1;
+		value = 2;
 		break;
 	case I915_PARAM_HAS_EXEC_PERF_CONFIG:
 		/* Obviously requires perf support. */
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index aefdae856e77..57f915751d76 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1396,6 +1396,14 @@ struct i915_perf_stream {
 	 */
 	bool enabled;
 
+	/**
+	 * @hold_preemption: Whether preemption is put on hold for command
+	 * submissions done on the @ctx. This is useful for some drivers that
+	 * cannot easily post process the OA buffer context to subtract delta
+	 * of performance counters not associated with @ctx.
+	 */
+	bool hold_preemption;
+
 	/**
 	 * @ops: The callbacks providing the implementation of this specific
 	 * type of configured stream.
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 82a282f668c0..f8897e3cc862 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -343,6 +343,8 @@ static const struct i915_oa_format gen8_plus_oa_formats[I915_OA_FORMAT_MAX] = {
  * struct perf_open_properties - for validated properties given to open a stream
  * @sample_flags: `DRM_I915_PERF_PROP_SAMPLE_*` properties are tracked as flags
  * @single_context: Whether a single or all gpu contexts should be monitored
+ * @hold_preemption: Whether the preemption is disabled for the filtered
+ *                   context
  * @ctx_handle: A gem ctx handle for use with @single_context
  * @metrics_set: An ID for an OA unit metric set advertised via sysfs
  * @oa_format: An OA unit HW report format
@@ -357,6 +359,7 @@ struct perf_open_properties {
 	u32 sample_flags;
 
 	u64 single_context:1;
+	u64 hold_preemption:1;
 	u64 ctx_handle;
 
 	/* OA sampling state */
@@ -2170,6 +2173,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	stream->sample_flags |= SAMPLE_OA_REPORT;
 	stream->sample_size += format_size;
 
+	stream->hold_preemption = props->hold_preemption;
+
 	dev_priv->perf.oa.oa_buffer.format_size = format_size;
 	if (WARN_ON(dev_priv->perf.oa.oa_buffer.format_size == 0))
 		return -EINVAL;
@@ -2695,6 +2700,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 		}
 	}
 
+	if (props->hold_preemption) {
+		if (!props->single_context) {
+			DRM_DEBUG("preemption disable with no context\n");
+			ret = -EINVAL;
+			goto err;
+		}
+		privileged_op = true;
+	}
+
 	/*
 	 * On Haswell the OA unit supports clock gating off for a specific
 	 * context and in this mode there's no visibility of metrics for the
@@ -2709,8 +2723,9 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	 * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to
 	 * enable the OA unit by default.
 	 */
-	if (IS_HASWELL(dev_priv) && specific_ctx)
+	if (IS_HASWELL(dev_priv) && specific_ctx && !props->hold_preemption) {
 		privileged_op = false;
+	}
 
 	/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
 	 * we check a dev.i915.perf_stream_paranoid sysctl option
@@ -2719,7 +2734,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	 */
 	if (privileged_op &&
 	    i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
-		DRM_DEBUG("Insufficient privileges to open system-wide i915 perf stream\n");
+		DRM_DEBUG("Insufficient privileges to open i915 perf stream\n");
 		ret = -EACCES;
 		goto err_ctx;
 	}
@@ -2911,6 +2926,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 			props->oa_periodic = true;
 			props->oa_period_exponent = value;
 			break;
+		case DRM_I915_PERF_PROP_HOLD_PREEMPTION:
+			props->hold_preemption = value != 0 ? 1 : 0;
+			break;
 		case DRM_I915_PERF_PROP_MAX:
 			MISSING_CASE(id);
 			return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_priolist_types.h b/drivers/gpu/drm/i915/i915_priolist_types.h
index 49709de69875..fc1c1fe7ce58 100644
--- a/drivers/gpu/drm/i915/i915_priolist_types.h
+++ b/drivers/gpu/drm/i915/i915_priolist_types.h
@@ -17,6 +17,13 @@ enum {
 	I915_PRIORITY_NORMAL = I915_CONTEXT_DEFAULT_PRIORITY,
 	I915_PRIORITY_MAX = I915_CONTEXT_MAX_USER_PRIORITY + 1,
 
+	/* Requests containing performance queries must not be preempted by
+	 * another context. They get scheduled with their default priority and
+	 * once they reach the execlist ports we bump them to
+	 * I915_PRIORITY_PERF so that they stick to the HW until they finish.
+	 */
+	I915_PRIORITY_PERF = I915_CONTEXT_MAX_USER_PRIORITY + 2,
+
 	I915_PRIORITY_INVALID = INT_MIN
 };
 
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index da1e6984a8cc..48dbe988db23 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -725,6 +725,7 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
 	rq->batch = NULL;
 	rq->capture_list = NULL;
 	rq->waitboost = false;
+	rq->has_perf = false;
 	rq->execution_mask = ALL_ENGINES;
 
 	INIT_LIST_HEAD(&rq->active_list);
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index c9f7d07991c8..ab09b5bb89f7 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -216,6 +216,7 @@ struct i915_request {
 	unsigned long emitted_jiffies;
 
 	bool waitboost;
+	bool has_perf;
 
 	/** engine->request_list entry for this request */
 	struct list_head link;
diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c
index a4f98ccef0fe..cba3f50dc1fe 100644
--- a/drivers/gpu/drm/i915/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/intel_guc_submission.c
@@ -717,6 +717,12 @@ static void port_assign(struct execlist_port *port, struct i915_request *rq)
 {
 	GEM_BUG_ON(port_isset(port));
 
+	if (rq->has_perf) {
+		rq->sched.attr.priority =
+			(I915_PRIORITY_MASK & rq->sched.attr.priority) |
+			I915_USER_PRIORITY(I915_PRIORITY_PERF);
+	}
+
 	port_set(port, i915_request_get(rq));
 }
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7f770183ee31..8ffdbc0e4616 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2001,6 +2001,16 @@ enum drm_i915_perf_property_id {
 	 */
 	DRM_I915_PERF_PROP_OA_EXPONENT,
 
+	/**
+	 * Specifying this property is only valid when specify a context to
+	 * filter with DRM_I915_PERF_PROP_CTX_HANDLE. Specifying this property
+	 * will hold preemption of the particular context we want to gather
+	 * performance data about.
+	 *
+	 * This property is available in perf revision 2.
+	 */
+	DRM_I915_PERF_PROP_HOLD_PREEMPTION,
+
 	DRM_I915_PERF_PROP_MAX /* non-ABI */
 };
 
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 7/7] drm/i915: add support for perf configuration queries
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (5 preceding siblings ...)
  2019-06-04 13:11 ` [PATCH v3 6/7] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
@ 2019-06-04 13:11 ` Lionel Landwerlin
  2019-06-04 16:09 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Vulkan performance query support (rev3) Patchwork
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:11 UTC (permalink / raw)
  To: intel-gfx

Listing configurations at the moment is supported only through sysfs.
This might cause issues for applications wanting to list
configurations from a container where sysfs isn't available.

This change adds a way to query the number of configurations and their
content through the i915 query uAPI.

v2: Fix sparse warnings (Lionel)
    Add support to query configuration using uuid (Lionel)

v3: Fix some inconsistency in uapi header (Lionel)
    Fix unlocking when not locked issue (Lionel)
    Add debug messages (Lionel)

v4: Fix missing unlock (Dan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |   6 +
 drivers/gpu/drm/i915/i915_perf.c  |   4 +
 drivers/gpu/drm/i915/i915_query.c | 279 ++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h       |  62 ++++++-
 4 files changed, 350 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 57f915751d76..a3edf02c8a0e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1892,6 +1892,12 @@ struct drm_i915_private {
 		 */
 		struct list_head metrics_buffers;
 
+		/*
+		 * Number of dynamic configurations, you need to hold
+		 * dev_priv->perf.metrics_lock to access it.
+		 */
+		u32 n_metrics;
+
 		/*
 		 * Lock associated with anything below within this structure
 		 * except exclusive_stream.
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index f8897e3cc862..3a6e726c30bd 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3415,6 +3415,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
 		goto sysfs_err;
 	}
 
+	dev_priv->perf.n_metrics++;
+
 	mutex_unlock(&dev_priv->perf.metrics_lock);
 
 	DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id);
@@ -3476,6 +3478,8 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 
 	idr_remove(&dev_priv->perf.metrics_idr, *arg);
 
+	dev_priv->perf.n_metrics--;
+
 	DRM_DEBUG("Removed config %s id=%i\n", oa_config->uuid, oa_config->id);
 
 	i915_oa_config_put(oa_config);
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index 7b7016171057..c1e203c7b2c2 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -143,10 +143,289 @@ query_engine_info(struct drm_i915_private *i915,
 	return len;
 }
 
+static int can_copy_perf_config_registers_or_number(u32 user_n_regs,
+						    u64 user_regs_ptr,
+						    u32 kernel_n_regs)
+{
+	/*
+	 * We'll just put the number of registers, and won't copy the
+	 * register.
+	 */
+	if (user_n_regs == 0)
+		return 0;
+
+	if (user_n_regs < kernel_n_regs)
+		return -EINVAL;
+
+	if (!access_ok(u64_to_user_ptr(user_regs_ptr),
+		       2 * sizeof(u32) * kernel_n_regs))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int copy_perf_config_registers_or_number(const struct i915_oa_reg *kernel_regs,
+						u32 kernel_n_regs,
+						u64 user_regs_ptr,
+						u32 *user_n_regs)
+{
+	u32 r;
+
+	if (*user_n_regs == 0) {
+		*user_n_regs = kernel_n_regs;
+		return 0;
+	}
+
+	*user_n_regs = kernel_n_regs;
+
+	for (r = 0; r < kernel_n_regs; r++) {
+		u32 __user *user_reg_ptr =
+			u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2);
+		u32 __user *user_val_ptr =
+			u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2 +
+					sizeof(u32));
+		int ret;
+
+		ret = __put_user(i915_mmio_reg_offset(kernel_regs[r].addr),
+				 user_reg_ptr);
+		if (ret)
+			return -EFAULT;
+
+		ret = __put_user(kernel_regs[r].value, user_val_ptr);
+		if (ret)
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int query_perf_config_data(struct drm_i915_private *i915,
+				  struct drm_i915_query_item *query_item,
+				  bool use_uuid)
+{
+	struct drm_i915_query_perf_config __user *user_query_config_ptr =
+		u64_to_user_ptr(query_item->data_ptr);
+	struct drm_i915_perf_oa_config __user *user_config_ptr =
+		u64_to_user_ptr(query_item->data_ptr +
+				sizeof(struct drm_i915_query_perf_config));
+	struct drm_i915_perf_oa_config user_config;
+	struct i915_oa_config *oa_config = NULL;
+	u32 flags, total_size;
+	int ret;
+
+	if (!i915->perf.initialized)
+		return -ENODEV;
+
+	total_size = sizeof(struct drm_i915_query_perf_config) +
+		sizeof(struct drm_i915_perf_oa_config);
+
+	if (query_item->length == 0)
+		return total_size;
+
+	if (query_item->length < total_size) {
+		DRM_DEBUG("Invalid query config data item size=%u expected=%u\n",
+			  query_item->length, total_size);
+		return -EINVAL;
+	}
+
+	if (!access_ok(user_query_config_ptr, total_size))
+		return -EFAULT;
+
+	if (__get_user(flags, &user_query_config_ptr->flags))
+		return -EFAULT;
+
+	if (flags != 0)
+		return -EINVAL;
+
+	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
+	if (ret)
+		return ret;
+
+	if (use_uuid) {
+		char uuid[UUID_STRING_LEN + 1] = { 0, };
+		struct i915_oa_config *tmp;
+		int id;
+
+		BUILD_BUG_ON(sizeof(user_query_config_ptr->uuid) >= sizeof(uuid));
+
+		if (__copy_from_user(uuid, user_query_config_ptr->uuid,
+				     sizeof(user_query_config_ptr->uuid))) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		idr_for_each_entry(&i915->perf.metrics_idr, tmp, id) {
+			if (!strcmp(tmp->uuid, uuid)) {
+				oa_config = tmp;
+				break;
+			}
+		}
+	} else {
+		u64 config_id;
+
+		if (__get_user(config_id, &user_query_config_ptr->config)) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		if (config_id == 1)
+			oa_config = &i915->perf.oa.test_config;
+		else
+			oa_config = idr_find(&i915->perf.metrics_idr, config_id);
+	}
+
+	if (!oa_config) {
+		ret = -ENOENT;
+		goto out;
+	}
+
+	if (__copy_from_user(&user_config, user_config_ptr,
+			     sizeof(user_config))) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	ret = can_copy_perf_config_registers_or_number(user_config.n_boolean_regs,
+						       user_config.boolean_regs_ptr,
+						       oa_config->b_counter_regs_len);
+	if (ret)
+		goto out;
+
+	ret = can_copy_perf_config_registers_or_number(user_config.n_flex_regs,
+						       user_config.flex_regs_ptr,
+						       oa_config->flex_regs_len);
+	if (ret)
+		goto out;
+
+	ret = can_copy_perf_config_registers_or_number(user_config.n_mux_regs,
+						       user_config.mux_regs_ptr,
+						       oa_config->mux_regs_len);
+	if (ret)
+		goto out;
+
+	ret = copy_perf_config_registers_or_number(oa_config->b_counter_regs,
+						   oa_config->b_counter_regs_len,
+						   user_config.boolean_regs_ptr,
+						   &user_config.n_boolean_regs);
+	if (ret)
+		goto out;
+
+	ret = copy_perf_config_registers_or_number(oa_config->flex_regs,
+						   oa_config->flex_regs_len,
+						   user_config.flex_regs_ptr,
+						   &user_config.n_flex_regs);
+	if (ret)
+		goto out;
+
+	ret = copy_perf_config_registers_or_number(oa_config->mux_regs,
+						   oa_config->mux_regs_len,
+						   user_config.mux_regs_ptr,
+						   &user_config.n_mux_regs);
+	if (ret)
+		goto out;
+
+	memcpy(user_config.uuid, oa_config->uuid, sizeof(user_config.uuid));
+
+	if (__copy_to_user(user_config_ptr, &user_config,
+			   sizeof(user_config))) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	ret = total_size;
+
+out:
+	mutex_unlock(&i915->perf.metrics_lock);
+	return ret;
+}
+
+static int query_perf_config_list(struct drm_i915_private *i915,
+				  struct drm_i915_query_item *query_item)
+{
+	struct drm_i915_query_perf_config __user *user_query_config_ptr =
+		u64_to_user_ptr(query_item->data_ptr);
+	struct i915_oa_config *oa_config;
+	u32 flags, total_size;
+	u64 n_configs;
+	int ret, id;
+
+	if (!i915->perf.initialized)
+		return -ENODEV;
+
+	/* Count the default test configuration */
+	n_configs = i915->perf.n_metrics + 1;
+	total_size = sizeof(struct drm_i915_query_perf_config) +
+		sizeof(u64) * n_configs;
+
+	if (query_item->length == 0)
+		return total_size;
+
+	if (query_item->length < total_size) {
+		DRM_DEBUG("Invalid query config list item size=%u expected=%u\n",
+			  query_item->length, total_size);
+		return -EINVAL;
+	}
+
+	if (!access_ok(user_query_config_ptr, total_size))
+		return -EFAULT;
+
+	if (__get_user(flags, &user_query_config_ptr->flags))
+		return -EFAULT;
+
+	if (flags != 0)
+		return -EINVAL;
+
+	if (__put_user(n_configs, &user_query_config_ptr->config))
+		return -EFAULT;
+
+	if (__put_user((u64)1ULL, &user_query_config_ptr->data[0]))
+		return -EFAULT;
+
+	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
+	if (ret)
+		return ret;
+
+	n_configs = 1;
+	idr_for_each_entry(&i915->perf.metrics_idr, oa_config, id) {
+		u64 __user *item =
+			u64_to_user_ptr(query_item->data_ptr +
+					sizeof(struct drm_i915_query_perf_config) +
+					n_configs * sizeof(u64));
+
+		if (__put_user((u64)id, item)) {
+			ret = -EFAULT;
+			goto out;
+		}
+		n_configs++;
+	}
+
+	ret = total_size;
+
+out:
+	mutex_unlock(&i915->perf.metrics_lock);
+	return ret;
+}
+
+static int query_perf_config(struct drm_i915_private *i915,
+			     struct drm_i915_query_item *query_item)
+{
+	switch (query_item->flags) {
+	case DRM_I915_QUERY_PERF_CONFIG_LIST:
+		return query_perf_config_list(i915, query_item);
+	case DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID:
+		return query_perf_config_data(i915, query_item, true);
+	case DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID:
+		return query_perf_config_data(i915, query_item, false);
+	default:
+		return -EINVAL;
+	}
+}
+
 static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
 					struct drm_i915_query_item *query_item) = {
 	query_topology_info,
 	query_engine_info,
+	query_perf_config,
 };
 
 int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 8ffdbc0e4616..61c53f3bc435 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -2126,6 +2126,7 @@ struct drm_i915_query_item {
 	__u64 query_id;
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
 #define DRM_I915_QUERY_ENGINE_INFO	2
+#define DRM_I915_QUERY_PERF_CONFIG      3
 /* Must be kept compact -- no holes and well documented */
 
 	/*
@@ -2137,9 +2138,18 @@ struct drm_i915_query_item {
 	__s32 length;
 
 	/*
-	 * Unused for now. Must be cleared to zero.
+	 * When query_id == DRM_I915_QUERY_TOPOLOGY_INFO, must be 0.
+	 *
+	 * When query_id == DRM_I915_QUERY_PERF_CONFIG, must be one of the
+	 * following :
+	 *         - DRM_I915_QUERY_PERF_CONFIG_LIST
+	 *         - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
+	 *         - DRM_I915_QUERY_PERF_CONFIG_FOR_UUID
 	 */
 	__u32 flags;
+#define DRM_I915_QUERY_PERF_CONFIG_LIST          1
+#define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID 2
+#define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID   3
 
 	/*
 	 * Data will be written at the location pointed by data_ptr when the
@@ -2265,6 +2275,56 @@ struct drm_i915_query_engine_info {
 	struct drm_i915_engine_info engines[];
 };
 
+/*
+ * Data written by the kernel with query DRM_I915_QUERY_PERF_CONFIG.
+ */
+struct drm_i915_query_perf_config {
+	union {
+		/*
+		 * When query_item.flags == DRM_I915_QUERY_PERF_CONFIG_LIST, i915 sets
+		 * this fields to the number of configurations available.
+		 */
+		__u64 n_configs;
+
+		/*
+		 * When query_id == DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID,
+		 * i915 will use the value in this field as configuration
+		 * identifier to decide what data to write into config_ptr.
+		 */
+		__u64 config;
+
+		/*
+		 * When query_id == DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID,
+		 * i915 will use the value in this field as configuration
+		 * identifier to decide what data to write into config_ptr.
+		 *
+		 * String formatted like "%08x-%04x-%04x-%04x-%012x"
+		 */
+		char uuid[36];
+	};
+
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 flags;
+
+	/*
+	 * When query_item.flags == DRM_I915_QUERY_PERF_CONFIG_LIST, i915 will
+	 * write an array of __u64 of configuration identifiers.
+	 *
+	 * When query_item.flags == DRM_I915_QUERY_PERF_CONFIG_DATA, i915 will
+	 * write a struct drm_i915_perf_oa_config. If the following fields of
+	 * drm_i915_perf_oa_config are set not set to 0, i915 will write into
+	 * the associated pointers the values of submitted when the
+	 * configuration was created :
+	 *
+	 *         - n_mux_regs
+	 *         - n_boolean_regs
+	 *         - n_flex_regs
+	 */
+	__u8 data[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/7] drm/i915/perf: allow for CS OA configs to be created lazily
  2019-06-04 13:11 ` [PATCH v3 2/7] drm/i915/perf: allow for CS OA configs to be created lazily Lionel Landwerlin
@ 2019-06-04 13:29   ` Chris Wilson
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2019-06-04 13:29 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-06-04 14:11:35)
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 2e33a9b4eae7..e0071e44de3d 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -366,9 +366,16 @@ struct perf_open_properties {
>         int oa_period_exponent;
>  };
>  
> -static void free_oa_config(struct drm_i915_private *dev_priv,
> -                          struct i915_oa_config *oa_config)
> +static void put_oa_config(struct i915_oa_config *oa_config)
>  {
> +       if (!atomic_dec_and_test(&oa_config->ref_count))

Might as well use struct kref? At the very least, refcount_t.

> +               return;
> +

> +       if (oa_config->obj) {
lockdep_assert_held(&dev_priv->perf.metrics_lock);

It's not apparent that the lock is held for the list_del.

> +               list_del(&oa_config->vma_link);
> +               i915_gem_object_put(oa_config->obj);
> +       }
> +
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 3/7] drm/i915: introduce a mechanism to extend execbuf2
  2019-06-04 13:11 ` [PATCH v3 3/7] drm/i915: introduce a mechanism to extend execbuf2 Lionel Landwerlin
@ 2019-06-04 13:32   ` Chris Wilson
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2019-06-04 13:32 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-06-04 14:11:36)
> We're planning to use this for a couple of new feature where we need
> to provide additional parameters to execbuf.

See struct i915_user_extension. Might as well try and share common code.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 5/7] drm/i915: add a new perf configuration execbuf parameter
  2019-06-04 13:11 ` [PATCH v3 5/7] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
@ 2019-06-04 13:40   ` Chris Wilson
  2019-06-04 13:51     ` Lionel Landwerlin
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Wilson @ 2019-06-04 13:40 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-06-04 14:11:38)
>         list_add_tail(&rq->client_link, &rq->file_priv->mm.request_list);
>  }
>  
> +static int eb_oa_config(struct i915_execbuffer *eb)
> +{
> +       struct i915_vma *oa_vma;
> +       int err;
> +
> +       if (!eb->oa_config)
> +               return 0;
> +
> +       /*
> +        * If the config hasn't changed, skip reconfiguring the HW (this is
> +        * subject to a delay we want to avoid has much as possible).
> +        */
> +       if (eb->oa_config == eb->i915->perf.oa.exclusive_stream->oa_config)
> +               return 0;

But you don't order the execution so it may not be the right oa_config.
Just add the barrier. It is virtually no cost for the exclusive oa
userspace.

How does this interact with the global oa_config being changed via the
ioctl? What significance is there for this per-execbuf oa_config being
applied to other users?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 6/7] drm/i915/perf: allow holding preemption on filtered ctx
  2019-06-04 13:11 ` [PATCH v3 6/7] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
@ 2019-06-04 13:45   ` Chris Wilson
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Wilson @ 2019-06-04 13:45 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-06-04 14:11:39)
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index ed19f4e53d31..4046f045408b 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -683,6 +683,12 @@ static void port_assign(struct execlist_port *port, struct i915_request *rq)
>         if (port_isset(port))
>                 i915_request_put(port_request(port));
>  
> +       if (rq->has_perf) {
> +               rq->sched.attr.priority =
> +                       (I915_PRIORITY_MASK & rq->sched.attr.priority) |
> +                       I915_USER_PRIORITY(I915_PRIORITY_PERF);
> +       }

This is broken. You can not ignore PI here. If you bump the priority here you
must increase the priority of all of its cross-engine dependencies as
they may still be inflight and later reordered causing deadlocks. (Note
you cannot take the locks required for bumping other engines here.)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 5/7] drm/i915: add a new perf configuration execbuf parameter
  2019-06-04 13:40   ` Chris Wilson
@ 2019-06-04 13:51     ` Lionel Landwerlin
  0 siblings, 0 replies; 17+ messages in thread
From: Lionel Landwerlin @ 2019-06-04 13:51 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 04/06/2019 16:40, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-06-04 14:11:38)
>>          list_add_tail(&rq->client_link, &rq->file_priv->mm.request_list);
>>   }
>>   
>> +static int eb_oa_config(struct i915_execbuffer *eb)
>> +{
>> +       struct i915_vma *oa_vma;
>> +       int err;
>> +
>> +       if (!eb->oa_config)
>> +               return 0;
>> +
>> +       /*
>> +        * If the config hasn't changed, skip reconfiguring the HW (this is
>> +        * subject to a delay we want to avoid has much as possible).
>> +        */
>> +       if (eb->oa_config == eb->i915->perf.oa.exclusive_stream->oa_config)
>> +               return 0;
> But you don't order the execution so it may not be the right oa_config.
> Just add the barrier. It is virtually no cost for the exclusive oa
> userspace.


Ah right sorry :(


>
> How does this interact with the global oa_config being changed via the
> ioctl?


eb_oa_config() should be called under the global lock right?
Or are you referring to something else?


>   What significance is there for this per-execbuf oa_config being
> applied to other users?


The expectation is that a single application is using this and data 
other users get from MI_REPORT_PERF_COUNT is undefined (much like when 
OA is turned off).

Now I see there is a problem with an application with multiple contexts 
because we have the Flex EU counter configurations per context.

I can break the config in 2 parts and execute the flex counter 
programming everything regardless.


We really want to minimize the NOA reprogramming because it takes an 
undefined amount of time to apply (been told below 1ms).


Thanks for spotting those issue.


-Lionel


> -Chris
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Vulkan performance query support (rev3)
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (6 preceding siblings ...)
  2019-06-04 13:11 ` [PATCH v3 7/7] drm/i915: add support for perf configuration queries Lionel Landwerlin
@ 2019-06-04 16:09 ` Patchwork
  2019-06-04 16:14 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2019-06-04 16:09 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Vulkan performance query support (rev3)
URL   : https://patchwork.freedesktop.org/series/60916/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
87e2a194bf9d drm/i915/perf: introduce a versioning of the i915-perf uapi
d05986d1855c drm/i915/perf: allow for CS OA configs to be created lazily
-:124: CHECK:SPACING: No space is necessary after a cast
#124: FILE: drivers/gpu/drm/i915/i915_perf.c:395:
+					(u32) MI_LOAD_REGISTER_IMM_MAX_REGS);

total: 0 errors, 0 warnings, 1 checks, 311 lines checked
4c87d084a453 drm/i915: introduce a mechanism to extend execbuf2
-:157: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#157: FILE: include/uapi/drm/i915_drm.h:1189:
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_EXT<<1))
                                                   ^

total: 0 errors, 0 warnings, 1 checks, 135 lines checked
23e946fb5ef7 drm/i915: add syncobj timeline support
-:331: WARNING:TYPO_SPELLING: 'transfered' may be misspelled - perhaps 'transferred'?
#331: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2481:
+			 * The chain's ownership is transfered to the

-:380: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#380: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2591:
+		fences = get_timeline_fence_array(

total: 0 errors, 1 warnings, 1 checks, 507 lines checked
8c3c1b054277 drm/i915: add a new perf configuration execbuf parameter
-:41: CHECK:LINE_SPACING: Please don't use multiple blank lines
#41: FILE: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:1205:
 
+

total: 0 errors, 0 warnings, 1 checks, 326 lines checked
d0c380225218 drm/i915/perf: allow holding preemption on filtered ctx
-:151: WARNING:BRACES: braces {} are not necessary for single statement blocks
#151: FILE: drivers/gpu/drm/i915/i915_perf.c:2726:
+	if (IS_HASWELL(dev_priv) && specific_ctx && !props->hold_preemption) {
 		privileged_op = false;
+	}

total: 0 errors, 1 warnings, 0 checks, 168 lines checked
71394e773da8 drm/i915: add support for perf configuration queries

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* ✗ Fi.CI.SPARSE: warning for drm/i915: Vulkan performance query support (rev3)
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (7 preceding siblings ...)
  2019-06-04 16:09 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Vulkan performance query support (rev3) Patchwork
@ 2019-06-04 16:14 ` Patchwork
  2019-06-04 16:30 ` ✓ Fi.CI.BAT: success " Patchwork
  2019-06-05 12:11 ` ✓ Fi.CI.IGT: " Patchwork
  10 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2019-06-04 16:14 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Vulkan performance query support (rev3)
URL   : https://patchwork.freedesktop.org/series/60916/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915/perf: introduce a versioning of the i915-perf uapi
Okay!

Commit: drm/i915/perf: allow for CS OA configs to be created lazily
+drivers/gpu/drm/i915/i915_perf.c:394:37: warning: expression using sizeof(void)

Commit: drm/i915: introduce a mechanism to extend execbuf2
Okay!

Commit: drm/i915: add syncobj timeline support
+./include/linux/mm.h:652:13: error: not a function <noident>

Commit: drm/i915: add a new perf configuration execbuf parameter
Okay!

Commit: drm/i915/perf: allow holding preemption on filtered ctx
Okay!

Commit: drm/i915: add support for perf configuration queries
Okay!

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: Vulkan performance query support (rev3)
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (8 preceding siblings ...)
  2019-06-04 16:14 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-06-04 16:30 ` Patchwork
  2019-06-05 12:11 ` ✓ Fi.CI.IGT: " Patchwork
  10 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2019-06-04 16:30 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Vulkan performance query support (rev3)
URL   : https://patchwork.freedesktop.org/series/60916/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6186 -> Patchwork_13168
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/

Known issues
------------

  Here are the changes found in Patchwork_13168 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_fence@basic-busy-default:
    - fi-icl-y:           [PASS][1] -> [INCOMPLETE][2] ([fdo#107713])
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/fi-icl-y/igt@gem_exec_fence@basic-busy-default.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/fi-icl-y/igt@gem_exec_fence@basic-busy-default.html

  * igt@kms_busy@basic-flip-b:
    - fi-icl-u3:          [PASS][3] -> [DMESG-WARN][4] ([fdo#107724])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/fi-icl-u3/igt@kms_busy@basic-flip-b.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/fi-icl-u3/igt@kms_busy@basic-flip-b.html

  
#### Possible fixes ####

  * igt@gem_ctx_switch@basic-default:
    - {fi-icl-guc}:       [INCOMPLETE][5] ([fdo#107713] / [fdo#108569]) -> [PASS][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/fi-icl-guc/igt@gem_ctx_switch@basic-default.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/fi-icl-guc/igt@gem_ctx_switch@basic-default.html

  * igt@i915_pm_rpm@module-reload:
    - fi-icl-u3:          [DMESG-WARN][7] ([fdo#107724]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/fi-icl-u3/igt@i915_pm_rpm@module-reload.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/fi-icl-u3/igt@i915_pm_rpm@module-reload.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
    - fi-blb-e6850:       [INCOMPLETE][9] ([fdo#107718]) -> [PASS][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/fi-blb-e6850/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/fi-blb-e6850/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569


Participating hosts (53 -> 45)
------------------------------

  Missing    (8): fi-ilk-m540 fi-bxt-dsi fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-kbl-7560u fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_6186 -> Patchwork_13168

  CI_DRM_6186: a629ccaaa66bb4effc461a00de5b3f92b6ea9c4c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5037: a98c9cd50aa48933217ca41055279ccb1680d25b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13168: 71394e773da8593dc7da9aa95d43112cf606504d @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

71394e773da8 drm/i915: add support for perf configuration queries
d0c380225218 drm/i915/perf: allow holding preemption on filtered ctx
8c3c1b054277 drm/i915: add a new perf configuration execbuf parameter
23e946fb5ef7 drm/i915: add syncobj timeline support
4c87d084a453 drm/i915: introduce a mechanism to extend execbuf2
d05986d1855c drm/i915/perf: allow for CS OA configs to be created lazily
87e2a194bf9d drm/i915/perf: introduce a versioning of the i915-perf uapi

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* ✓ Fi.CI.IGT: success for drm/i915: Vulkan performance query support (rev3)
  2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (9 preceding siblings ...)
  2019-06-04 16:30 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2019-06-05 12:11 ` Patchwork
  10 siblings, 0 replies; 17+ messages in thread
From: Patchwork @ 2019-06-05 12:11 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Vulkan performance query support (rev3)
URL   : https://patchwork.freedesktop.org/series/60916/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_6186_full -> Patchwork_13168_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_13168_full:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * {igt@gem_ctx_shared@exec-shared-gtt-render}:
    - shard-glk:          [PASS][1] -> [FAIL][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-glk9/igt@gem_ctx_shared@exec-shared-gtt-render.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-glk2/igt@gem_ctx_shared@exec-shared-gtt-render.html

  
Known issues
------------

  Here are the changes found in Patchwork_13168_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_softpin@noreloc-s3:
    - shard-kbl:          [PASS][3] -> [DMESG-WARN][4] ([fdo#108566])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-kbl3/igt@gem_softpin@noreloc-s3.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-kbl3/igt@gem_softpin@noreloc-s3.html

  * igt@kms_cursor_crc@pipe-b-cursor-suspend:
    - shard-skl:          [PASS][5] -> [INCOMPLETE][6] ([fdo#110741])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-skl5/igt@kms_cursor_crc@pipe-b-cursor-suspend.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-skl6/igt@kms_cursor_crc@pipe-b-cursor-suspend.html

  * igt@kms_draw_crc@draw-method-xrgb8888-mmap-cpu-xtiled:
    - shard-iclb:         [PASS][7] -> [INCOMPLETE][8] ([fdo#107713])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb5/igt@kms_draw_crc@draw-method-xrgb8888-mmap-cpu-xtiled.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb7/igt@kms_draw_crc@draw-method-xrgb8888-mmap-cpu-xtiled.html

  * igt@kms_flip@flip-vs-suspend-interruptible:
    - shard-apl:          [PASS][9] -> [DMESG-WARN][10] ([fdo#108566]) +1 similar issue
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-apl4/igt@kms_flip@flip-vs-suspend-interruptible.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-apl6/igt@kms_flip@flip-vs-suspend-interruptible.html

  * igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-move:
    - shard-hsw:          [PASS][11] -> [SKIP][12] ([fdo#109271]) +36 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-hsw5/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-move.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-hsw1/igt@kms_frontbuffer_tracking@fbc-2p-scndscrn-spr-indfb-move.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-shrfb-plflip-blt:
    - shard-iclb:         [PASS][13] -> [FAIL][14] ([fdo#103167]) +4 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb1/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-shrfb-plflip-blt.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb4/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-shrfb-plflip-blt.html

  * igt@kms_frontbuffer_tracking@fbcpsr-rgb101010-draw-mmap-wc:
    - shard-iclb:         [PASS][15] -> [INCOMPLETE][16] ([fdo#106978] / [fdo#107713])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb3/igt@kms_frontbuffer_tracking@fbcpsr-rgb101010-draw-mmap-wc.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb7/igt@kms_frontbuffer_tracking@fbcpsr-rgb101010-draw-mmap-wc.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [PASS][17] -> [FAIL][18] ([fdo#108145] / [fdo#110403])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-skl3/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-skl2/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_psr@psr2_cursor_plane_onoff:
    - shard-iclb:         [PASS][19] -> [SKIP][20] ([fdo#109441])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb2/igt@kms_psr@psr2_cursor_plane_onoff.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb4/igt@kms_psr@psr2_cursor_plane_onoff.html

  
#### Possible fixes ####

  * {igt@gem_ctx_param@vm}:
    - shard-hsw:          [DMESG-WARN][21] ([fdo#110836]) -> [PASS][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-hsw6/igt@gem_ctx_param@vm.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-hsw8/igt@gem_ctx_param@vm.html

  * igt@gem_mmap_gtt@forked-medium-copy-odd:
    - shard-iclb:         [INCOMPLETE][23] ([fdo#107713]) -> [PASS][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb8/igt@gem_mmap_gtt@forked-medium-copy-odd.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb1/igt@gem_mmap_gtt@forked-medium-copy-odd.html

  * igt@kms_flip@2x-modeset-vs-vblank-race-interruptible:
    - shard-glk:          [FAIL][25] ([fdo#103060]) -> [PASS][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-glk3/igt@kms_flip@2x-modeset-vs-vblank-race-interruptible.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-glk6/igt@kms_flip@2x-modeset-vs-vblank-race-interruptible.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible:
    - shard-skl:          [FAIL][27] ([fdo#105363]) -> [PASS][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-skl5/igt@kms_flip@flip-vs-expired-vblank-interruptible.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-skl6/igt@kms_flip@flip-vs-expired-vblank-interruptible.html

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render:
    - shard-iclb:         [FAIL][29] ([fdo#103167]) -> [PASS][30] +2 similar issues
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb2/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb3/igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-cur-indfb-draw-render.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min:
    - shard-skl:          [FAIL][31] ([fdo#108145]) -> [PASS][32] +1 similar issue
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-skl10/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-skl1/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-min.html

  * igt@kms_psr2_su@page_flip:
    - shard-iclb:         [SKIP][33] ([fdo#109642]) -> [PASS][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb1/igt@kms_psr2_su@page_flip.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb2/igt@kms_psr2_su@page_flip.html

  * igt@kms_psr@psr2_primary_mmap_cpu:
    - shard-iclb:         [SKIP][35] ([fdo#109441]) -> [PASS][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb6/igt@kms_psr@psr2_primary_mmap_cpu.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb2/igt@kms_psr@psr2_primary_mmap_cpu.html

  * igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend:
    - shard-skl:          [INCOMPLETE][37] ([fdo#104108]) -> [PASS][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-skl6/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-skl10/igt@kms_vblank@pipe-a-ts-continuation-dpms-suspend.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-apl:          [DMESG-WARN][39] ([fdo#108566]) -> [PASS][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-apl3/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-apl5/igt@kms_vblank@pipe-a-ts-continuation-suspend.html

  
#### Warnings ####

  * igt@kms_hdmi_inject@inject-audio:
    - shard-iclb:         [FAIL][41] ([fdo#110842]) -> [FAIL][42] ([fdo#102370])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6186/shard-iclb2/igt@kms_hdmi_inject@inject-audio.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/shard-iclb4/igt@kms_hdmi_inject@inject-audio.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#102370]: https://bugs.freedesktop.org/show_bug.cgi?id=102370
  [fdo#103060]: https://bugs.freedesktop.org/show_bug.cgi?id=103060
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#106978]: https://bugs.freedesktop.org/show_bug.cgi?id=106978
  [fdo#107713]: https://bugs.freedesktop.org/show_bug.cgi?id=107713
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108566]: https://bugs.freedesktop.org/show_bug.cgi?id=108566
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#109642]: https://bugs.freedesktop.org/show_bug.cgi?id=109642
  [fdo#110403]: https://bugs.freedesktop.org/show_bug.cgi?id=110403
  [fdo#110741]: https://bugs.freedesktop.org/show_bug.cgi?id=110741
  [fdo#110836]: https://bugs.freedesktop.org/show_bug.cgi?id=110836
  [fdo#110842]: https://bugs.freedesktop.org/show_bug.cgi?id=110842


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_6186 -> Patchwork_13168

  CI_DRM_6186: a629ccaaa66bb4effc461a00de5b3f92b6ea9c4c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5037: a98c9cd50aa48933217ca41055279ccb1680d25b @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13168: 71394e773da8593dc7da9aa95d43112cf606504d @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13168/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-06-05 12:11 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-04 13:11 [PATCH v3 0/7] drm/i915: Vulkan performance query support Lionel Landwerlin
2019-06-04 13:11 ` [PATCH v3 1/7] drm/i915/perf: introduce a versioning of the i915-perf uapi Lionel Landwerlin
2019-06-04 13:11 ` [PATCH v3 2/7] drm/i915/perf: allow for CS OA configs to be created lazily Lionel Landwerlin
2019-06-04 13:29   ` Chris Wilson
2019-06-04 13:11 ` [PATCH v3 3/7] drm/i915: introduce a mechanism to extend execbuf2 Lionel Landwerlin
2019-06-04 13:32   ` Chris Wilson
2019-06-04 13:11 ` [PATCH v3 4/7] drm/i915: add syncobj timeline support Lionel Landwerlin
2019-06-04 13:11 ` [PATCH v3 5/7] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
2019-06-04 13:40   ` Chris Wilson
2019-06-04 13:51     ` Lionel Landwerlin
2019-06-04 13:11 ` [PATCH v3 6/7] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
2019-06-04 13:45   ` Chris Wilson
2019-06-04 13:11 ` [PATCH v3 7/7] drm/i915: add support for perf configuration queries Lionel Landwerlin
2019-06-04 16:09 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Vulkan performance query support (rev3) Patchwork
2019-06-04 16:14 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-06-04 16:30 ` ✓ Fi.CI.BAT: success " Patchwork
2019-06-05 12:11 ` ✓ Fi.CI.IGT: " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.