All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] drm/i915: Vulkan performance query support
@ 2019-05-21 14:08 Lionel Landwerlin
  2019-05-21 14:08 ` [PATCH 1/5] drm/i915/perf: introduce a versioning of the i915-perf uapi Lionel Landwerlin
                   ` (7 more replies)
  0 siblings, 8 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 14:08 UTC (permalink / raw)
  To: intel-gfx

Hi all,

This small (but maybe not to everybody's taste) series enables us to
support performance queries on Vulkan. We've gone through the process
to define this as a Vulkan INTEL extension (it should appear on [1]
soonish).

We'll publish the Mesa side shortly.

Cheers,

 [1] : https://github.com/KhronosGroup/Vulkan-Docs

Lionel Landwerlin (5):
  drm/i915/perf: introduce a versioning of the i915-perf uapi
  drm/i915/perf: allow holding preemption on filtered ctx
  drm/i915/perf: allow for CS OA configs to be created lazily
  drm/i915: add a new perf configuration execbuf parameter
  drm/i915: add support for perf configuration queries

 drivers/gpu/drm/i915/gt/intel_context.c       |   1 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |   3 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   1 +
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   3 +-
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c    |   4 +-
 drivers/gpu/drm/i915/i915_drv.c               |   7 +
 drivers/gpu/drm/i915/i915_drv.h               |  29 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c    | 113 ++++++-
 drivers/gpu/drm/i915/i915_perf.c              | 226 +++++++++++---
 drivers/gpu/drm/i915/i915_query.c             | 277 ++++++++++++++++++
 include/uapi/drm/i915_drm.h                   | 112 ++++++-
 12 files changed, 727 insertions(+), 56 deletions(-)

--
2.21.0.392.gf8f6787159e
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/5] drm/i915/perf: introduce a versioning of the i915-perf uapi
  2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
@ 2019-05-21 14:08 ` Lionel Landwerlin
  2019-05-21 14:08 ` [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 14:08 UTC (permalink / raw)
  To: intel-gfx

Reporting this version will help application figure out what level of
the support the running kernel provides.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |  3 +++
 include/uapi/drm/i915_drm.h     | 20 ++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 2c7a4318d13c..f309a0b2ccfc 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -469,6 +469,9 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_MMAP_GTT_COHERENT:
 		value = INTEL_INFO(dev_priv)->has_coherent_ggtt;
 		break;
+	case I915_PARAM_PERF_REVISION:
+		value = 1;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 3a73f5316766..ad8a3e4f6355 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -598,6 +598,12 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_MMAP_GTT_COHERENT	52
 
+/*
+ * Revision of the i915-perf uAPI. The value returned helps determine what
+ * i915-perf features are available. See drm_i915_perf_property_id.
+ */
+#define I915_PARAM_PERF_REVISION	53
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1682,23 +1688,31 @@ enum drm_i915_perf_property_id {
 	 * Open the stream for a specific context handle (as used with
 	 * execbuffer2). A stream opened for a specific context this way
 	 * won't typically require root privileges.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_CTX_HANDLE = 1,
 
 	/**
 	 * A value of 1 requests the inclusion of raw OA unit reports as
 	 * part of stream samples.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_SAMPLE_OA,
 
 	/**
 	 * The value specifies which set of OA unit metrics should be
 	 * be configured, defining the contents of any OA unit reports.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_METRICS_SET,
 
 	/**
 	 * The value specifies the size and layout of OA unit reports.
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_FORMAT,
 
@@ -1708,6 +1722,8 @@ enum drm_i915_perf_property_id {
 	 * from this exponent as follows:
 	 *
 	 *   80ns * 2^(period_exponent + 1)
+	 *
+	 * This property is available in perf revision 1.
 	 */
 	DRM_I915_PERF_PROP_OA_EXPONENT,
 
@@ -1739,6 +1755,8 @@ struct drm_i915_perf_open_param {
  * to close and re-open a stream with the same configuration.
  *
  * It's undefined whether any pending data for the stream will be lost.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_ENABLE	_IO('i', 0x0)
 
@@ -1746,6 +1764,8 @@ struct drm_i915_perf_open_param {
  * Disable data capture for a stream.
  *
  * It is an error to try and read a stream that is disabled.
+ *
+ * This ioctl is available in perf revision 1.
  */
 #define I915_PERF_IOCTL_DISABLE	_IO('i', 0x1)
 
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
  2019-05-21 14:08 ` [PATCH 1/5] drm/i915/perf: introduce a versioning of the i915-perf uapi Lionel Landwerlin
@ 2019-05-21 14:08 ` Lionel Landwerlin
  2019-05-21 16:36   ` Chris Wilson
  2019-05-22  4:33   ` kbuild test robot
  2019-05-21 14:08 ` [PATCH 3/5] drm/i915/perf: allow for CS OA configs to be created lazily Lionel Landwerlin
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 14:08 UTC (permalink / raw)
  To: intel-gfx

We would like to make use of perf in Vulkan. The Vulkan API is much
lower level than OpenGL, with applications directly exposed to the
concept of command buffers (pretty much equivalent to our batch
buffers). In Vulkan, queries are always limited in scope to a command
buffer. In OpenGL, the lack of command buffer concept meant that
queries' duration could span multiple command buffers.

With that restriction gone in Vulkan, we would like to simplify
measuring performance just by measuring the deltas between the counter
snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the
more complex scheme we currently have in the GL driver, using 2
MI_RECORD_PERF_COUNT commands and doing some post processing on the
stream of OA reports, coming from the global OA buffer, to remove any
unrelated deltas in between the 2 MI_RECORD_PERF_COUNT.

Disabling preemption only apply to a single context with which want to
query performance counters for and is considered a privileged
operation, by default protected by CAP_SYS_ADMIN. It is possible to
enable it for a normal user by disabling the paranoid stream setting.

v2: Store preemption setting in intel_context (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_context.c       |  1 +
 drivers/gpu/drm/i915/gt/intel_context_types.h |  3 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c           |  2 +-
 drivers/gpu/drm/i915/i915_drv.c               |  2 +-
 drivers/gpu/drm/i915/i915_perf.c              | 35 +++++++++++++++----
 include/uapi/drm/i915_drm.h                   | 10 ++++++
 6 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 5b31e1e05ddd..68a4b888fb1a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -117,6 +117,7 @@ intel_context_init(struct intel_context *ce,
 	ce->ops = engine->cops;
 	ce->sseu = engine->sseu;
 	ce->saturated = 0;
+	ce->arb_enable = MI_ARB_ENABLE;
 
 	INIT_LIST_HEAD(&ce->signal_link);
 	INIT_LIST_HEAD(&ce->signals);
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 963a312430e6..07f586e3608d 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -65,6 +65,9 @@ struct intel_context {
 
 	/** sseu: Control eu/slice partitioning */
 	struct intel_sseu sseu;
+
+	/** arb_enable: Control preemption */
+	u32 arb_enable;
 };
 
 #endif /* __INTEL_CONTEXT_TYPES__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f263a8374273..2ad95977f7a8 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
-	*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
+	*cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
 
 	*cs++ = MI_BATCH_BUFFER_START_GEN8 |
 		(flags & I915_DISPATCH_SECURE ? 0 : BIT(8));
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f309a0b2ccfc..5871e0cfbab0 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -470,7 +470,7 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 		value = INTEL_INFO(dev_priv)->has_coherent_ggtt;
 		break;
 	case I915_PARAM_PERF_REVISION:
-		value = 1;
+		value = 2;
 		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index c4995d5a16d2..8c7fa7f7014b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -355,6 +355,7 @@ struct perf_open_properties {
 	u32 sample_flags;
 
 	u64 single_context:1;
+	u64 context_disable_preemption:1;
 	u64 ctx_handle;
 
 	/* OA sampling state */
@@ -1201,7 +1202,8 @@ static int i915_oa_read(struct i915_perf_stream *stream,
 }
 
 static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
-					    struct i915_gem_context *ctx)
+					    struct i915_gem_context *ctx,
+					    bool disable_preemption)
 {
 	struct i915_gem_engines_iter it;
 	struct intel_context *ce;
@@ -1222,6 +1224,7 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 		err = intel_context_pin(ce);
 		if (err == 0) {
 			i915->perf.oa.pinned_ctx = ce;
+			ce->arb_enable = MI_ARB_DISABLE;
 			break;
 		}
 	}
@@ -1237,19 +1240,22 @@ static struct intel_context *oa_pin_context(struct drm_i915_private *i915,
 /**
  * oa_get_render_ctx_id - determine and hold ctx hw id
  * @stream: An i915-perf stream opened for OA metrics
+ * @disable_preemption: Whether to disable preemption on the context
  *
  * Determine the render context hw id, and ensure it remains fixed for the
  * lifetime of the stream. This ensures that we don't have to worry about
- * updating the context ID in OACONTROL on the fly.
+ * updating the context ID in OACONTROL on the fly. Also disable preemption on
+ * the context if needed.
  *
  * Returns: zero on success or a negative error code
  */
-static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
+static int oa_get_render_ctx_id(struct i915_perf_stream *stream,
+				bool disable_preemption)
 {
 	struct drm_i915_private *i915 = stream->dev_priv;
 	struct intel_context *ce;
 
-	ce = oa_pin_context(i915, stream->ctx);
+	ce = oa_pin_context(i915, stream->ctx, disable_preemption);
 	if (IS_ERR(ce))
 		return PTR_ERR(ce);
 
@@ -1337,6 +1343,7 @@ static void oa_put_render_ctx_id(struct i915_perf_stream *stream)
 	ce = fetch_and_zero(&dev_priv->perf.oa.pinned_ctx);
 	if (ce) {
 		mutex_lock(&dev_priv->drm.struct_mutex);
+		ce->arb_enable = MI_ARB_ENABLE;
 		intel_context_unpin(ce);
 		mutex_unlock(&dev_priv->drm.struct_mutex);
 	}
@@ -2085,7 +2092,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		dev_priv->perf.oa.period_exponent = props->oa_period_exponent;
 
 	if (stream->ctx) {
-		ret = oa_get_render_ctx_id(stream);
+		ret = oa_get_render_ctx_id(stream, props->context_disable_preemption);
 		if (ret) {
 			DRM_DEBUG("Invalid context id to filter with\n");
 			return ret;
@@ -2583,6 +2590,15 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 		}
 	}
 
+	if (props->context_disable_preemption) {
+		if (!props->single_context) {
+			DRM_DEBUG("preemption disable with no context\n");
+			ret = -EINVAL;
+			goto err;
+		}
+		privileged_op = true;
+	}
+
 	/*
 	 * On Haswell the OA unit supports clock gating off for a specific
 	 * context and in this mode there's no visibility of metrics for the
@@ -2597,8 +2613,10 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	 * MI_REPORT_PERF_COUNT commands and so consider it a privileged op to
 	 * enable the OA unit by default.
 	 */
-	if (IS_HASWELL(dev_priv) && specific_ctx)
+	if (IS_HASWELL(dev_priv) && specific_ctx &&
+	    !props->context_disable_preemption) {
 		privileged_op = false;
+	}
 
 	/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
 	 * we check a dev.i915.perf_stream_paranoid sysctl option
@@ -2607,7 +2625,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	 */
 	if (privileged_op &&
 	    i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
-		DRM_DEBUG("Insufficient privileges to open system-wide i915 perf stream\n");
+		DRM_DEBUG("Insufficient privileges to open i915 perf stream\n");
 		ret = -EACCES;
 		goto err_ctx;
 	}
@@ -2799,6 +2817,9 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 			props->oa_periodic = true;
 			props->oa_period_exponent = value;
 			break;
+		case DRM_I915_PERF_PROP_HOLD_PREEMPTION:
+			props->context_disable_preemption = value != 0 ? 1 : 0;
+			break;
 		case DRM_I915_PERF_PROP_MAX:
 			MISSING_CASE(id);
 			return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index ad8a3e4f6355..5601dc688295 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1727,6 +1727,16 @@ enum drm_i915_perf_property_id {
 	 */
 	DRM_I915_PERF_PROP_OA_EXPONENT,
 
+	/**
+	 * Specifying this property is only valid when specify a context to
+	 * filter with DRM_I915_PERF_PROP_CTX_HANDLE. Specifying this property
+	 * will hold preemption of the particular context we want to gather
+	 * performance data about.
+	 *
+	 * This property is available in perf revision 2.
+	 */
+	DRM_I915_PERF_PROP_HOLD_PREEMPTION,
+
 	DRM_I915_PERF_PROP_MAX /* non-ABI */
 };
 
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/5] drm/i915/perf: allow for CS OA configs to be created lazily
  2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
  2019-05-21 14:08 ` [PATCH 1/5] drm/i915/perf: introduce a versioning of the i915-perf uapi Lionel Landwerlin
  2019-05-21 14:08 ` [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
@ 2019-05-21 14:08 ` Lionel Landwerlin
  2019-05-21 16:43   ` Chris Wilson
  2019-05-21 14:08 ` [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 14:08 UTC (permalink / raw)
  To: intel-gfx

Here we introduce a mechanism by which the execbuf part of the i915
driver will be able to request that a batch buffer containing the
programming for a particular OA config be created.

We'll execute these OA configuration buffers right before executing a
set of userspace commands so that a particular user batchbuffer be
executed with a given OA configuration.

This mechanism essentially allows the userspace driver to go through
several OA configuration without having to open/close the i915/perf
stream.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   1 +
 drivers/gpu/drm/i915/i915_drv.h              |  22 ++-
 drivers/gpu/drm/i915/i915_perf.c             | 187 ++++++++++++++++---
 3 files changed, 178 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index a34ece53a771..bbcb80cf2a85 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -126,6 +126,7 @@
  */
 #define MI_LOAD_REGISTER_IMM(x)	MI_INSTR(0x22, 2*(x)-1)
 #define   MI_LRI_FORCE_POSTED		(1<<12)
+#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
 #define MI_STORE_REGISTER_MEM        MI_INSTR(0x24, 1)
 #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
 #define   MI_SRM_LRM_GLOBAL_GTT		(1<<22)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1ad3818d2676..abd564bfa03b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1274,6 +1274,10 @@ struct i915_oa_config {
 	struct attribute *attrs[2];
 	struct device_attribute sysfs_metric_id;
 
+	struct drm_i915_gem_object *obj;
+
+	struct list_head vma_link;
+
 	atomic_t ref_count;
 };
 
@@ -1856,11 +1860,21 @@ struct drm_i915_private {
 		struct mutex metrics_lock;
 
 		/*
-		 * List of dynamic configurations, you need to hold
-		 * dev_priv->perf.metrics_lock to access it.
+		 * List of dynamic configurations (struct i915_oa_config), you
+		 * need to hold dev_priv->perf.metrics_lock to access it.
 		 */
 		struct idr metrics_idr;
 
+		/*
+		 * List of dynamic configurations (struct i915_oa_config)
+		 * which have an allocated buffer in GGTT for reconfiguration,
+		 * you need to hold dev_priv->perf.metrics_lock to access it.
+		 * Elements are added to the list lazilly on execbuf (when a
+		 * particular configuration is requested). The list is freed
+		 * upon closing the perf stream.
+		 */
+		struct list_head metrics_buffers;
+
 		/*
 		 * Lock associated with anything below within this structure
 		 * except exclusive_stream.
@@ -3136,6 +3150,10 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 			    struct intel_context *ce,
 			    u32 *reg_state);
+int i915_perf_get_oa_config(struct drm_i915_private *i915,
+			    int metrics_set,
+			    struct i915_oa_config **out_config,
+			    struct drm_i915_gem_object **out_obj);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 8c7fa7f7014b..7e0ebd4bc8f2 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -365,9 +365,16 @@ struct perf_open_properties {
 	int oa_period_exponent;
 };
 
-static void free_oa_config(struct drm_i915_private *dev_priv,
-			   struct i915_oa_config *oa_config)
+static void put_oa_config(struct i915_oa_config *oa_config)
 {
+	if (!atomic_dec_and_test(&oa_config->ref_count))
+		return;
+
+	if (oa_config->obj) {
+		list_del(&oa_config->vma_link);
+		i915_gem_object_put(oa_config->obj);
+	}
+
 	if (!PTR_ERR(oa_config->flex_regs))
 		kfree(oa_config->flex_regs);
 	if (!PTR_ERR(oa_config->b_counter_regs))
@@ -377,38 +384,142 @@ static void free_oa_config(struct drm_i915_private *dev_priv,
 	kfree(oa_config);
 }
 
-static void put_oa_config(struct drm_i915_private *dev_priv,
-			  struct i915_oa_config *oa_config)
+static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 n_regs)
 {
-	if (!atomic_dec_and_test(&oa_config->ref_count))
-		return;
+	u32 i;
+
+	for (i = 0; i < n_regs; i++) {
+		if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
+			u32 n_lri = min(n_regs - i,
+					(u32) MI_LOAD_REGISTER_IMM_MAX_REGS);
 
-	free_oa_config(dev_priv, oa_config);
+			*cs++ = MI_LOAD_REGISTER_IMM(n_lri);
+		}
+		*cs++ = i915_mmio_reg_offset(reg_data[i].addr);
+		*cs++ = reg_data[i].value;
+	}
+
+	return cs;
 }
 
-static int get_oa_config(struct drm_i915_private *dev_priv,
-			 int metrics_set,
-			 struct i915_oa_config **out_config)
+static int alloc_oa_config_buffer(struct drm_i915_private *i915,
+				  struct i915_oa_config *oa_config)
 {
+	struct drm_i915_gem_object *bo;
+	size_t config_length = 0;
 	int ret;
+	u32 *cs;
 
-	if (metrics_set == 1) {
-		*out_config = &dev_priv->perf.oa.test_config;
-		atomic_inc(&dev_priv->perf.oa.test_config.ref_count);
-		return 0;
+	if (oa_config->mux_regs_len > 0) {
+		config_length += DIV_ROUND_UP(oa_config->mux_regs_len,
+					      MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
+		config_length += oa_config->mux_regs_len * 8;
+	}
+	if (oa_config->b_counter_regs_len > 0) {
+		config_length += DIV_ROUND_UP(oa_config->b_counter_regs_len,
+					      MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
+		config_length += oa_config->b_counter_regs_len * 8;
 	}
+	if (oa_config->flex_regs_len > 0) {
+		config_length += DIV_ROUND_UP(oa_config->flex_regs_len,
+					      MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
+		config_length += oa_config->flex_regs_len * 8;
+	}
+	config_length += 4; /* MI_BATCH_BUFFER_END */
+	config_length = ALIGN(config_length, I915_GTT_PAGE_SIZE);
 
-	ret = mutex_lock_interruptible(&dev_priv->perf.metrics_lock);
+	ret = i915_mutex_lock_interruptible(&i915->drm);
 	if (ret)
 		return ret;
 
-	*out_config = idr_find(&dev_priv->perf.metrics_idr, metrics_set);
-	if (!*out_config)
-		ret = -EINVAL;
-	else
-		atomic_inc(&(*out_config)->ref_count);
+	bo = i915_gem_object_create(i915, config_length);
+	if (IS_ERR(bo)) {
+		ret = PTR_ERR(bo);
+		goto unlock;
+	}
 
-	mutex_unlock(&dev_priv->perf.metrics_lock);
+	cs = i915_gem_object_pin_map(bo, I915_MAP_WB);
+	if (IS_ERR(cs)) {
+		ret = PTR_ERR(cs);
+		goto err_unref;
+	}
+
+	memset(cs, 0, config_length);
+
+	cs = write_cs_mi_lri(cs, oa_config->mux_regs, oa_config->mux_regs_len);
+	cs = write_cs_mi_lri(cs, oa_config->b_counter_regs, oa_config->b_counter_regs_len);
+	cs = write_cs_mi_lri(cs, oa_config->flex_regs, oa_config->flex_regs_len);
+
+	*cs++ = MI_BATCH_BUFFER_END;
+
+	i915_gem_object_unpin_map(bo);
+
+	oa_config->obj = bo;
+
+	goto unlock;
+
+err_unref:
+	oa_config->obj = NULL;
+	i915_gem_object_put(bo);
+
+unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+	return ret;
+}
+
+int i915_perf_get_oa_config(struct drm_i915_private *i915,
+			    int metrics_set,
+			    struct i915_oa_config **out_config,
+			    struct drm_i915_gem_object **out_obj)
+{
+	int ret = 0;
+	struct i915_oa_config *oa_config;
+
+	if (!i915->perf.initialized)
+		return -ENODEV;
+
+	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
+	if (ret)
+		return ret;
+
+	if (metrics_set == 1) {
+		oa_config = &i915->perf.oa.test_config;
+	} else {
+		oa_config = idr_find(&i915->perf.metrics_idr, metrics_set);
+		if (!oa_config) {
+			ret = -EINVAL;
+			goto unlock;
+		}
+	}
+
+	if (out_config) {
+		atomic_inc(&oa_config->ref_count);
+		*out_config = oa_config;
+	}
+
+	if (out_obj) {
+		if (oa_config->obj) {
+			*out_obj = i915_gem_object_get(oa_config->obj);
+		} else {
+			ret = alloc_oa_config_buffer(i915, oa_config);
+			if (ret)
+				goto err_buf_alloc;
+
+			list_add(&oa_config->vma_link,
+				 &i915->perf.metrics_buffers);
+			*out_obj = i915_gem_object_get(oa_config->obj);
+		}
+	}
+
+	goto unlock;
+
+err_buf_alloc:
+	if (out_config) {
+		put_oa_config(oa_config);
+		*out_config = NULL;
+	}
+unlock:
+	mutex_unlock(&i915->perf.metrics_lock);
 
 	return ret;
 }
@@ -1385,7 +1496,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	if (stream->ctx)
 		oa_put_render_ctx_id(stream);
 
-	put_oa_config(dev_priv, stream->oa_config);
+	put_oa_config(stream->oa_config);
 
 	if (dev_priv->perf.oa.spurious_report_rs.missed) {
 		DRM_NOTE("%d spurious OA report notices suppressed due to ratelimiting\n",
@@ -2099,7 +2210,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		}
 	}
 
-	ret = get_oa_config(dev_priv, props->metrics_set, &stream->oa_config);
+	ret = i915_perf_get_oa_config(dev_priv, props->metrics_set,
+				      &stream->oa_config, NULL);
 	if (ret) {
 		DRM_DEBUG("Invalid OA config id=%i\n", props->metrics_set);
 		goto err_config;
@@ -2112,7 +2224,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	 *   "When this bit is set, in order to have coherent counts,
 	 *   RC6 power state and trunk clock gating must be disabled.
 	 *   This can be achieved by programming MMIO registers as
-	 *   0xA094=0 and 0xA090[31]=1"
+	 *   u0xA094=0 and 0xA090[31]=1"
 	 *
 	 *   In our case we are expecting that taking pm + FORCEWAKE
 	 *   references will effectively disable RC6.
@@ -2137,6 +2249,8 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		goto err_enable;
 	}
 
+	DRM_DEBUG("opening stream oa config uuid=%s\n", stream->oa_config->uuid);
+
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	return 0;
@@ -2150,7 +2264,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	free_oa_buffer(dev_priv);
 
 err_oa_buf_alloc:
-	put_oa_config(dev_priv, stream->oa_config);
+	put_oa_config(stream->oa_config);
 
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
 	intel_runtime_pm_put(dev_priv, stream->wakeref);
@@ -2517,9 +2631,21 @@ static int i915_perf_release(struct inode *inode, struct file *file)
 {
 	struct i915_perf_stream *stream = file->private_data;
 	struct drm_i915_private *dev_priv = stream->dev_priv;
+	struct i915_oa_config *oa_config, *next;
 
 	mutex_lock(&dev_priv->perf.lock);
+
 	i915_perf_destroy_locked(stream);
+
+	/* Dispose of all oa config batch buffers. */
+	mutex_lock(&dev_priv->perf.metrics_lock);
+	list_for_each_entry_safe(oa_config, next, &dev_priv->perf.metrics_buffers, vma_link) {
+		list_del(&oa_config->vma_link);
+		i915_gem_object_put(oa_config->obj);
+		oa_config->obj = NULL;
+	}
+	mutex_unlock(&dev_priv->perf.metrics_lock);
+
 	mutex_unlock(&dev_priv->perf.lock);
 
 	return 0;
@@ -3315,7 +3441,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
 sysfs_err:
 	mutex_unlock(&dev_priv->perf.metrics_lock);
 reg_err:
-	put_oa_config(dev_priv, oa_config);
+	put_oa_config(oa_config);
 	DRM_DEBUG("Failed to add new OA config\n");
 	return err;
 }
@@ -3369,7 +3495,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 
 	DRM_DEBUG("Removed config %s id=%i\n", oa_config->uuid, oa_config->id);
 
-	put_oa_config(dev_priv, oa_config);
+	put_oa_config(oa_config);
 
 config_err:
 	mutex_unlock(&dev_priv->perf.metrics_lock);
@@ -3511,6 +3637,8 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		init_waitqueue_head(&dev_priv->perf.oa.poll_wq);
 
 		INIT_LIST_HEAD(&dev_priv->perf.streams);
+		INIT_LIST_HEAD(&dev_priv->perf.metrics_buffers);
+
 		mutex_init(&dev_priv->perf.lock);
 		spin_lock_init(&dev_priv->perf.oa.oa_buffer.ptr_lock);
 
@@ -3527,10 +3655,9 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 
 static int destroy_config(int id, void *p, void *data)
 {
-	struct drm_i915_private *dev_priv = data;
 	struct i915_oa_config *oa_config = p;
 
-	put_oa_config(dev_priv, oa_config);
+	put_oa_config(oa_config);
 
 	return 0;
 }
@@ -3544,7 +3671,7 @@ void i915_perf_fini(struct drm_i915_private *dev_priv)
 	if (!dev_priv->perf.initialized)
 		return;
 
-	idr_for_each(&dev_priv->perf.metrics_idr, destroy_config, dev_priv);
+	idr_for_each(&dev_priv->perf.metrics_idr, destroy_config, NULL);
 	idr_destroy(&dev_priv->perf.metrics_idr);
 
 	unregister_sysctl_table(dev_priv->perf.sysctl_header);
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (2 preceding siblings ...)
  2019-05-21 14:08 ` [PATCH 3/5] drm/i915/perf: allow for CS OA configs to be created lazily Lionel Landwerlin
@ 2019-05-21 14:08 ` Lionel Landwerlin
  2019-05-21 17:07   ` Chris Wilson
  2019-05-28 10:52   ` Chris Wilson
  2019-05-21 14:08 ` [PATCH 5/5] drm/i915: add support for perf configuration queries Lionel Landwerlin
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 14:08 UTC (permalink / raw)
  To: intel-gfx

We want the ability to dispatch a set of command buffer to the
hardware, each with a different OA configuration. To achieve this, we
reuse a couple of fields from the execbuf2 struct (I CAN HAZ
execbuf3?) to notify what OA configuration should be used for a batch
buffer. This requires the process making the execbuf with this flag to
also own the perf fd at the time of execbuf.

v2: Add a emit_oa_config() vfunc in the intel_engine_cs (Chris)
    Move oa_config vma to active (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine_types.h |   7 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c          |   1 +
 drivers/gpu/drm/i915/gt/intel_ringbuffer.c   |   4 +-
 drivers/gpu/drm/i915/i915_drv.c              |   4 +
 drivers/gpu/drm/i915/i915_drv.h              |   1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   | 113 ++++++++++++++++---
 drivers/gpu/drm/i915/i915_perf.c             |  14 +--
 include/uapi/drm/i915_drm.h                  |  20 +++-
 8 files changed, 142 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index e381c1c73902..766fbbede430 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -445,6 +445,7 @@ struct intel_engine_cs {
 #define I915_ENGINE_HAS_PREEMPTION   BIT(2)
 #define I915_ENGINE_HAS_SEMAPHORES   BIT(3)
 #define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
+#define I915_ENGINE_HAS_OA           BIT(5)
 	unsigned int flags;
 
 	/*
@@ -534,6 +535,12 @@ intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
 	return engine->flags & I915_ENGINE_NEEDS_BREADCRUMB_TASKLET;
 }
 
+static inline bool
+intel_engine_has_oa(const struct intel_engine_cs *engine)
+{
+	return engine->flags & I915_ENGINE_HAS_OA;
+}
+
 #define instdone_slice_mask(dev_priv__) \
 	(IS_GEN(dev_priv__, 7) ? \
 	 1 : RUNTIME_INFO(dev_priv__)->sseu.slice_mask)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 2ad95977f7a8..cad6fca4ba0f 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2395,6 +2395,7 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
 		engine->init_context = gen8_init_rcs_context;
 		engine->emit_flush = gen8_emit_flush_render;
 		engine->emit_fini_breadcrumb = gen8_emit_fini_breadcrumb_rcs;
+		engine->flags |= I915_ENGINE_HAS_OA;
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
index f0d60affdba3..dc85a3e474b9 100644
--- a/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/gt/intel_ringbuffer.c
@@ -2210,8 +2210,10 @@ static void setup_rcs(struct intel_engine_cs *engine)
 		engine->irq_enable_mask = I915_USER_INTERRUPT;
 	}
 
-	if (IS_HASWELL(i915))
+	if (IS_HASWELL(i915)) {
 		engine->emit_bb_start = hsw_emit_bb_start;
+		engine->flags |= I915_ENGINE_HAS_OA;
+	}
 
 	engine->resume = rcs_resume;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5871e0cfbab0..6d9a15642342 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -472,6 +472,10 @@ static int i915_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_PARAM_PERF_REVISION:
 		value = 2;
 		break;
+	case I915_PARAM_HAS_EXEC_PERF_CONFIG:
+		/* Obviously requires perf support. */
+		value = dev_priv->perf.initialized;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index abd564bfa03b..25860d99ffc6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3154,6 +3154,7 @@ int i915_perf_get_oa_config(struct drm_i915_private *i915,
 			    int metrics_set,
 			    struct i915_oa_config **out_config,
 			    struct drm_i915_gem_object **out_obj);
+void i915_oa_config_put(struct i915_oa_config *oa_config);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 361c232dde83..3794c6ce71e3 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -288,6 +288,9 @@ struct i915_execbuffer {
 	 */
 	int lut_size;
 	struct hlist_head *buckets; /** ht for relocation handles */
+
+	struct i915_oa_config *oa_config; /** HW configuration for OA, NULL is not needed. */
+	struct drm_i915_gem_object *oa_bo;
 };
 
 #define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
@@ -1183,6 +1186,33 @@ static void clflush_write32(u32 *addr, u32 value, unsigned int flushes)
 		*addr = value;
 }
 
+static int
+get_execbuf_oa_config(struct drm_i915_private *dev_priv,
+		      int perf_fd, u32 oa_config_id,
+		      struct i915_oa_config **out_oa_config,
+		      struct drm_i915_gem_object **out_oa_obj)
+{
+	struct file *perf_file;
+	int ret;
+
+	if (!dev_priv->perf.oa.exclusive_stream)
+		return -EINVAL;
+
+	perf_file = fget(perf_fd);
+	if (!perf_file)
+		return -EINVAL;
+
+	if (perf_file->private_data != dev_priv->perf.oa.exclusive_stream)
+		return -EINVAL;
+
+	fput(perf_file);
+
+	ret = i915_perf_get_oa_config(dev_priv, oa_config_id,
+				      out_oa_config, out_oa_obj);
+
+	return ret;
+}
+
 static int __reloc_gpu_alloc(struct i915_execbuffer *eb,
 			     struct i915_vma *vma,
 			     unsigned int len)
@@ -1937,12 +1967,15 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
 			return false;
 	}
 
-	if (exec->DR4 == 0xffffffff) {
-		DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
-		exec->DR4 = 0;
+	/* We reuse DR1 & DR4 fields for passing the perf config detail. */
+	if (!(exec->flags & I915_EXEC_PERF_CONFIG)) {
+		if (exec->DR4 == 0xffffffff) {
+			DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
+			exec->DR4 = 0;
+		}
+		if (exec->DR1 || exec->DR4)
+			return false;
 	}
-	if (exec->DR1 || exec->DR4)
-		return false;
 
 	if ((exec->batch_start_offset | exec->batch_len) & 0x7)
 		return false;
@@ -2048,6 +2081,42 @@ static int eb_submit(struct i915_execbuffer *eb)
 			return err;
 	}
 
+	if (eb->oa_config &&
+	    eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {
+		struct i915_vma *oa_vma;
+
+		oa_vma = i915_vma_instance(eb->oa_bo,
+					      &eb->engine->i915->ggtt.vm, NULL);
+		if (unlikely(IS_ERR(oa_vma))) {
+			err = PTR_ERR(oa_vma);
+			return err;
+		}
+
+		err = i915_vma_pin(oa_vma, 0, 0, PIN_GLOBAL);
+		if (err)
+			return err;
+
+		err = eb->engine->emit_bb_start(eb->request,
+						oa_vma->node.start,
+						0, I915_DISPATCH_SECURE);
+		if (err) {
+			i915_vma_unpin(oa_vma);
+			return err;
+		}
+
+		err = i915_vma_move_to_active(oa_vma, eb->request, 0);
+		if (err) {
+			i915_vma_unpin(oa_vma);
+			return err;
+		}
+
+
+		i915_vma_unpin(oa_vma);
+
+
+		swap(eb->oa_config, eb->i915->perf.oa.exclusive_stream->oa_config);
+	}
+
 	err = eb->engine->emit_bb_start(eb->request,
 					eb->batch->node.start +
 					eb->batch_start_offset,
@@ -2341,6 +2410,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.buffer_count = args->buffer_count;
 	eb.batch_start_offset = args->batch_start_offset;
 	eb.batch_len = args->batch_len;
+	eb.oa_config = NULL;
 
 	eb.batch_flags = 0;
 	if (args->flags & I915_EXEC_SECURE) {
@@ -2385,17 +2455,29 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	 */
 	intel_gt_pm_get(eb.i915);
 
-	err = i915_mutex_lock_interruptible(dev);
-	if (err)
-		goto err_rpm;
-
 	err = eb_select_engine(&eb, file, args);
 	if (unlikely(err))
-		goto err_unlock;
+		goto err_rpm;
+
+	if (args->flags & I915_EXEC_PERF_CONFIG) {
+		if (!intel_engine_has_oa(eb.engine)) {
+			err = -ENODEV;
+			goto err_engine;
+		}
+
+		err = get_execbuf_oa_config(eb.i915, args->DR1, args->DR4,
+					    &eb.oa_config, &eb.oa_bo);
+		if (err)
+			goto err_engine;
+	}
+
+	err = i915_mutex_lock_interruptible(dev);
+	if (err)
+		goto err_oa;
 
 	err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
 	if (unlikely(err))
-		goto err_engine;
+		goto err_unlock;
 
 	err = eb_relocate(&eb);
 	if (err) {
@@ -2541,10 +2623,15 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_vma:
 	if (eb.exec)
 		eb_release_vmas(&eb);
-err_engine:
-	eb_unpin_context(&eb);
 err_unlock:
 	mutex_unlock(&dev->struct_mutex);
+err_oa:
+	if (eb.oa_config) {
+		i915_gem_object_put(eb.oa_bo);
+		i915_oa_config_put(eb.oa_config);
+	}
+err_engine:
+	eb_unpin_context(&eb);
 err_rpm:
 	intel_gt_pm_put(eb.i915);
 	i915_gem_context_put(eb.gem_context);
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 7e0ebd4bc8f2..7b861f12f161 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -365,7 +365,7 @@ struct perf_open_properties {
 	int oa_period_exponent;
 };
 
-static void put_oa_config(struct i915_oa_config *oa_config)
+void i915_oa_config_put(struct i915_oa_config *oa_config)
 {
 	if (!atomic_dec_and_test(&oa_config->ref_count))
 		return;
@@ -515,7 +515,7 @@ int i915_perf_get_oa_config(struct drm_i915_private *i915,
 
 err_buf_alloc:
 	if (out_config) {
-		put_oa_config(oa_config);
+		i915_oa_config_put(oa_config);
 		*out_config = NULL;
 	}
 unlock:
@@ -1496,7 +1496,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	if (stream->ctx)
 		oa_put_render_ctx_id(stream);
 
-	put_oa_config(stream->oa_config);
+	i915_oa_config_put(stream->oa_config);
 
 	if (dev_priv->perf.oa.spurious_report_rs.missed) {
 		DRM_NOTE("%d spurious OA report notices suppressed due to ratelimiting\n",
@@ -2264,7 +2264,7 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	free_oa_buffer(dev_priv);
 
 err_oa_buf_alloc:
-	put_oa_config(stream->oa_config);
+	i915_oa_config_put(stream->oa_config);
 
 	intel_uncore_forcewake_put(&dev_priv->uncore, FORCEWAKE_ALL);
 	intel_runtime_pm_put(dev_priv, stream->wakeref);
@@ -3441,7 +3441,7 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
 sysfs_err:
 	mutex_unlock(&dev_priv->perf.metrics_lock);
 reg_err:
-	put_oa_config(oa_config);
+	i915_oa_config_put(oa_config);
 	DRM_DEBUG("Failed to add new OA config\n");
 	return err;
 }
@@ -3495,7 +3495,7 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 
 	DRM_DEBUG("Removed config %s id=%i\n", oa_config->uuid, oa_config->id);
 
-	put_oa_config(oa_config);
+	i915_oa_config_put(oa_config);
 
 config_err:
 	mutex_unlock(&dev_priv->perf.metrics_lock);
@@ -3657,7 +3657,7 @@ static int destroy_config(int id, void *p, void *data)
 {
 	struct i915_oa_config *oa_config = p;
 
-	put_oa_config(oa_config);
+	i915_oa_config_put(oa_config);
 
 	return 0;
 }
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 5601dc688295..e57fb5f249da 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -604,6 +604,16 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_PERF_REVISION	53
 
+/*
+ * Request an OA performance configuration change before running the commands
+ * given in an execbuf.
+ *
+ * Performance configuration ID is given in the DR4 field of
+ * drm_i915_gem_execbuffer2 and the file descriptor of the i915 perf stream is
+ * given in DR1. Execbuffer will fail if any of these parameter is invalid.
+ */
+#define I915_PARAM_HAS_EXEC_PERF_CONFIG 54
+
 /* Must be kept compact -- no holes and well documented */
 
 typedef struct drm_i915_getparam {
@@ -1126,7 +1136,15 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_FENCE_ARRAY   (1<<19)
 
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
+/* Request that perf monitoring hardware be reprogrammed before executing the
+ * commands from the batch in the execbuf. The DR1 & DR4 fields of the execbuf
+ * must respectively contain the file descriptor of the perf monitoring device
+ * and the configuration to program.
+ */
+#define I915_EXEC_PERF_CONFIG   (1<<20)
+
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_PERF_CONFIG<<1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 5/5] drm/i915: add support for perf configuration queries
  2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (3 preceding siblings ...)
  2019-05-21 14:08 ` [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
@ 2019-05-21 14:08 ` Lionel Landwerlin
  2019-05-23 10:32   ` Dan Carpenter
  2019-05-21 16:37 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Vulkan performance query support Patchwork
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 14:08 UTC (permalink / raw)
  To: intel-gfx

Listing configurations at the moment is supported only through sysfs.
This might cause issues for applications wanting to list
configurations from a container where sysfs isn't available.

This change adds a way to query the number of configurations and their
content through the i915 query uAPI.

v2: Fix sparse warnings (Lionel)
    Add support to query configuration using uuid (Lionel)

v3: Fix some inconsistency in uapi header (Lionel)
    Fix unlocking when not locked issue (Lionel)
    Add debug messages (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h   |   6 +
 drivers/gpu/drm/i915/i915_perf.c  |   4 +
 drivers/gpu/drm/i915/i915_query.c | 277 ++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h       |  62 ++++++-
 4 files changed, 348 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 25860d99ffc6..6127c6890e0b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1875,6 +1875,12 @@ struct drm_i915_private {
 		 */
 		struct list_head metrics_buffers;
 
+		/*
+		 * Number of dynamic configurations, you need to hold
+		 * dev_priv->perf.metrics_lock to access it.
+		 */
+		u32 n_metrics;
+
 		/*
 		 * Lock associated with anything below within this structure
 		 * except exclusive_stream.
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 7b861f12f161..6ab414e0ba1c 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3432,6 +3432,8 @@ int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
 		goto sysfs_err;
 	}
 
+	dev_priv->perf.n_metrics++;
+
 	mutex_unlock(&dev_priv->perf.metrics_lock);
 
 	DRM_DEBUG("Added config %s id=%i\n", oa_config->uuid, oa_config->id);
@@ -3493,6 +3495,8 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
 
 	idr_remove(&dev_priv->perf.metrics_idr, *arg);
 
+	dev_priv->perf.n_metrics--;
+
 	DRM_DEBUG("Removed config %s id=%i\n", oa_config->uuid, oa_config->id);
 
 	i915_oa_config_put(oa_config);
diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
index 782183b78f49..82bd6d973527 100644
--- a/drivers/gpu/drm/i915/i915_query.c
+++ b/drivers/gpu/drm/i915/i915_query.c
@@ -96,9 +96,286 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
 	return total_length;
 }
 
+static int can_copy_perf_config_registers_or_number(u32 user_n_regs,
+						    u64 user_regs_ptr,
+						    u32 kernel_n_regs)
+{
+	/*
+	 * We'll just put the number of registers, and won't copy the
+	 * register.
+	 */
+	if (user_n_regs == 0)
+		return 0;
+
+	if (user_n_regs < kernel_n_regs)
+		return -EINVAL;
+
+	if (!access_ok(u64_to_user_ptr(user_regs_ptr),
+		       2 * sizeof(u32) * kernel_n_regs))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int copy_perf_config_registers_or_number(const struct i915_oa_reg *kernel_regs,
+						u32 kernel_n_regs,
+						u64 user_regs_ptr,
+						u32 *user_n_regs)
+{
+	u32 r;
+
+	if (*user_n_regs == 0) {
+		*user_n_regs = kernel_n_regs;
+		return 0;
+	}
+
+	*user_n_regs = kernel_n_regs;
+
+	for (r = 0; r < kernel_n_regs; r++) {
+		u32 __user *user_reg_ptr =
+			u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2);
+		u32 __user *user_val_ptr =
+			u64_to_user_ptr(user_regs_ptr + sizeof(u32) * r * 2 +
+					sizeof(u32));
+		int ret;
+
+		ret = __put_user(i915_mmio_reg_offset(kernel_regs[r].addr),
+				 user_reg_ptr);
+		if (ret)
+			return -EFAULT;
+
+		ret = __put_user(kernel_regs[r].value, user_val_ptr);
+		if (ret)
+			return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int query_perf_config_data(struct drm_i915_private *i915,
+				  struct drm_i915_query_item *query_item,
+				  bool use_uuid)
+{
+	struct drm_i915_query_perf_config __user *user_query_config_ptr =
+		u64_to_user_ptr(query_item->data_ptr);
+	struct drm_i915_perf_oa_config __user *user_config_ptr =
+		u64_to_user_ptr(query_item->data_ptr +
+				sizeof(struct drm_i915_query_perf_config));
+	struct drm_i915_perf_oa_config user_config;
+	struct i915_oa_config *oa_config = NULL;
+	u32 flags, total_size;
+	int ret;
+
+	if (!i915->perf.initialized)
+		return -ENODEV;
+
+	total_size = sizeof(struct drm_i915_query_perf_config) +
+		sizeof(struct drm_i915_perf_oa_config);
+
+	if (query_item->length == 0)
+		return total_size;
+
+	if (query_item->length < total_size) {
+		DRM_DEBUG("Invalid query config data item size=%u expected=%u\n",
+			  query_item->length, total_size);
+		return -EINVAL;
+	}
+
+	if (!access_ok(user_query_config_ptr, total_size))
+		return -EFAULT;
+
+	if (__get_user(flags, &user_query_config_ptr->flags))
+		return -EFAULT;
+
+	if (flags != 0)
+		return -EINVAL;
+
+	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
+	if (ret)
+		return ret;
+
+	if (use_uuid) {
+		char uuid[UUID_STRING_LEN + 1] = { 0, };
+		struct i915_oa_config *tmp;
+		int id;
+
+		BUILD_BUG_ON(sizeof(user_query_config_ptr->uuid) >= sizeof(uuid));
+
+		if (__copy_from_user(uuid, user_query_config_ptr->uuid,
+				     sizeof(user_query_config_ptr->uuid))) {
+			ret = -EFAULT;
+			goto out;
+		}
+
+		idr_for_each_entry(&i915->perf.metrics_idr, tmp, id) {
+			if (!strcmp(tmp->uuid, uuid)) {
+				oa_config = tmp;
+				break;
+			}
+		}
+	} else {
+		u64 config_id;
+
+		if (__get_user(config_id, &user_query_config_ptr->config))
+			return -EFAULT;
+
+		if (config_id == 1)
+			oa_config = &i915->perf.oa.test_config;
+		else
+			oa_config = idr_find(&i915->perf.metrics_idr, config_id);
+	}
+
+	if (!oa_config) {
+		ret = -ENOENT;
+		goto out;
+	}
+
+	if (__copy_from_user(&user_config, user_config_ptr,
+			     sizeof(user_config))) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	ret = can_copy_perf_config_registers_or_number(user_config.n_boolean_regs,
+						       user_config.boolean_regs_ptr,
+						       oa_config->b_counter_regs_len);
+	if (ret)
+		goto out;
+
+	ret = can_copy_perf_config_registers_or_number(user_config.n_flex_regs,
+						       user_config.flex_regs_ptr,
+						       oa_config->flex_regs_len);
+	if (ret)
+		goto out;
+
+	ret = can_copy_perf_config_registers_or_number(user_config.n_mux_regs,
+						       user_config.mux_regs_ptr,
+						       oa_config->mux_regs_len);
+	if (ret)
+		goto out;
+
+	ret = copy_perf_config_registers_or_number(oa_config->b_counter_regs,
+						   oa_config->b_counter_regs_len,
+						   user_config.boolean_regs_ptr,
+						   &user_config.n_boolean_regs);
+	if (ret)
+		goto out;
+
+	ret = copy_perf_config_registers_or_number(oa_config->flex_regs,
+						   oa_config->flex_regs_len,
+						   user_config.flex_regs_ptr,
+						   &user_config.n_flex_regs);
+	if (ret)
+		goto out;
+
+	ret = copy_perf_config_registers_or_number(oa_config->mux_regs,
+						   oa_config->mux_regs_len,
+						   user_config.mux_regs_ptr,
+						   &user_config.n_mux_regs);
+	if (ret)
+		goto out;
+
+	memcpy(user_config.uuid, oa_config->uuid, sizeof(user_config.uuid));
+
+	if (__copy_to_user(user_config_ptr, &user_config,
+			   sizeof(user_config))) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	ret = total_size;
+
+out:
+	mutex_unlock(&i915->perf.metrics_lock);
+	return ret;
+}
+
+static int query_perf_config_list(struct drm_i915_private *i915,
+				  struct drm_i915_query_item *query_item)
+{
+	struct drm_i915_query_perf_config __user *user_query_config_ptr =
+		u64_to_user_ptr(query_item->data_ptr);
+	struct i915_oa_config *oa_config;
+	u32 flags, total_size;
+	u64 n_configs;
+	int ret, id;
+
+	if (!i915->perf.initialized)
+		return -ENODEV;
+
+	/* Count the default test configuration */
+	n_configs = i915->perf.n_metrics + 1;
+	total_size = sizeof(struct drm_i915_query_perf_config) +
+		sizeof(u64) * n_configs;
+
+	if (query_item->length == 0)
+		return total_size;
+
+	if (query_item->length < total_size) {
+		DRM_DEBUG("Invalid query config list item size=%u expected=%u\n",
+			  query_item->length, total_size);
+		return -EINVAL;
+	}
+
+	if (!access_ok(user_query_config_ptr, total_size))
+		return -EFAULT;
+
+	if (__get_user(flags, &user_query_config_ptr->flags))
+		return -EFAULT;
+
+	if (flags != 0)
+		return -EINVAL;
+
+	if (__put_user(n_configs, &user_query_config_ptr->config))
+		return -EFAULT;
+
+	if (__put_user((u64)1ULL, &user_query_config_ptr->data[0]))
+		return -EFAULT;
+
+	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
+	if (ret)
+		return ret;
+
+	n_configs = 1;
+	idr_for_each_entry(&i915->perf.metrics_idr, oa_config, id) {
+		u64 __user *item =
+			u64_to_user_ptr(query_item->data_ptr +
+					sizeof(struct drm_i915_query_perf_config) +
+					n_configs * sizeof(u64));
+
+		if (__put_user((u64)id, item)) {
+			ret = -EFAULT;
+			goto out;
+		}
+		n_configs++;
+	}
+
+	ret = total_size;
+
+out:
+	mutex_unlock(&i915->perf.metrics_lock);
+	return ret;
+}
+
+static int query_perf_config(struct drm_i915_private *i915,
+			     struct drm_i915_query_item *query_item)
+{
+	switch (query_item->flags) {
+	case DRM_I915_QUERY_PERF_CONFIG_LIST:
+		return query_perf_config_list(i915, query_item);
+	case DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID:
+		return query_perf_config_data(i915, query_item, true);
+	case DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID:
+		return query_perf_config_data(i915, query_item, false);
+	default:
+		return -EINVAL;
+	}
+}
+
 static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv,
 					struct drm_i915_query_item *query_item) = {
 	query_topology_info,
+	query_perf_config,
 };
 
 int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index e57fb5f249da..aafe7a3569ef 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1869,6 +1869,7 @@ struct drm_i915_perf_oa_config {
 struct drm_i915_query_item {
 	__u64 query_id;
 #define DRM_I915_QUERY_TOPOLOGY_INFO    1
+#define DRM_I915_QUERY_PERF_CONFIG      2
 /* Must be kept compact -- no holes and well documented */
 
 	/*
@@ -1880,9 +1881,18 @@ struct drm_i915_query_item {
 	__s32 length;
 
 	/*
-	 * Unused for now. Must be cleared to zero.
+	 * When query_id == DRM_I915_QUERY_TOPOLOGY_INFO, must be 0.
+	 *
+	 * When query_id == DRM_I915_QUERY_PERF_CONFIG, must be one of the
+	 * following :
+	 *         - DRM_I915_QUERY_PERF_CONFIG_LIST
+	 *         - DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID
+	 *         - DRM_I915_QUERY_PERF_CONFIG_FOR_UUID
 	 */
 	__u32 flags;
+#define DRM_I915_QUERY_PERF_CONFIG_LIST          1
+#define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID 2
+#define DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID   3
 
 	/*
 	 * Data will be written at the location pointed by data_ptr when the
@@ -1967,6 +1977,56 @@ struct drm_i915_query_topology_info {
 	__u8 data[];
 };
 
+/*
+ * Data written by the kernel with query DRM_I915_QUERY_PERF_CONFIG.
+ */
+struct drm_i915_query_perf_config {
+	union {
+		/*
+		 * When query_item.flags == DRM_I915_QUERY_PERF_CONFIG_LIST, i915 sets
+		 * this fields to the number of configurations available.
+		 */
+		__u64 n_configs;
+
+		/*
+		 * When query_id == DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_ID,
+		 * i915 will use the value in this field as configuration
+		 * identifier to decide what data to write into config_ptr.
+		 */
+		__u64 config;
+
+		/*
+		 * When query_id == DRM_I915_QUERY_PERF_CONFIG_DATA_FOR_UUID,
+		 * i915 will use the value in this field as configuration
+		 * identifier to decide what data to write into config_ptr.
+		 *
+		 * String formatted like "%08x-%04x-%04x-%04x-%012x"
+		 */
+		char uuid[36];
+	};
+
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 flags;
+
+	/*
+	 * When query_item.flags == DRM_I915_QUERY_PERF_CONFIG_LIST, i915 will
+	 * write an array of __u64 of configuration identifiers.
+	 *
+	 * When query_item.flags == DRM_I915_QUERY_PERF_CONFIG_DATA, i915 will
+	 * write a struct drm_i915_perf_oa_config. If the following fields of
+	 * drm_i915_perf_oa_config are set not set to 0, i915 will write into
+	 * the associated pointers the values of submitted when the
+	 * configuration was created :
+	 *
+	 *         - n_mux_regs
+	 *         - n_boolean_regs
+	 *         - n_flex_regs
+	 */
+	__u8 data[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.21.0.392.gf8f6787159e

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-21 14:08 ` [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
@ 2019-05-21 16:36   ` Chris Wilson
  2019-05-21 16:50     ` Lionel Landwerlin
  2019-05-24  9:28     ` Lionel Landwerlin
  2019-05-22  4:33   ` kbuild test robot
  1 sibling, 2 replies; 33+ messages in thread
From: Chris Wilson @ 2019-05-21 16:36 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-21 15:08:52)
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index f263a8374273..2ad95977f7a8 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
>         if (IS_ERR(cs))
>                 return PTR_ERR(cs);
>  
> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;

My prediction is that this will result in this context being reset due
to preemption timeouts and the context under profile being banned. Note
that preemption timeouts will be the primary means for hang detection
for endless batches.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Vulkan performance query support
  2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (4 preceding siblings ...)
  2019-05-21 14:08 ` [PATCH 5/5] drm/i915: add support for perf configuration queries Lionel Landwerlin
@ 2019-05-21 16:37 ` Patchwork
  2019-05-21 16:40 ` ✗ Fi.CI.SPARSE: " Patchwork
  2019-05-21 17:26 ` ✗ Fi.CI.BAT: failure " Patchwork
  7 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2019-05-21 16:37 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Vulkan performance query support
URL   : https://patchwork.freedesktop.org/series/60916/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
9577e457931d drm/i915/perf: introduce a versioning of the i915-perf uapi
3744cae4f12a drm/i915/perf: allow holding preemption on filtered ctx
9e8f734b8a37 drm/i915/perf: allow for CS OA configs to be created lazily
-:120: CHECK:SPACING: No space is necessary after a cast
#120: FILE: drivers/gpu/drm/i915/i915_perf.c:394:
+					(u32) MI_LOAD_REGISTER_IMM_MAX_REGS);

total: 0 errors, 0 warnings, 1 checks, 334 lines checked
e3b3246a741b drm/i915: add a new perf configuration execbuf parameter
-:176: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis
#176: FILE: drivers/gpu/drm/i915/i915_gem_execbuffer.c:2089:
+		oa_vma = i915_vma_instance(eb->oa_bo,
+					      &eb->engine->i915->ggtt.vm, NULL);

-:200: CHECK:LINE_SPACING: Please don't use multiple blank lines
#200: FILE: drivers/gpu/drm/i915/i915_gem_execbuffer.c:2113:
+
+

-:203: CHECK:LINE_SPACING: Please don't use multiple blank lines
#203: FILE: drivers/gpu/drm/i915/i915_gem_execbuffer.c:2116:
+
+

-:370: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#370: FILE: include/uapi/drm/i915_drm.h:1144:
+#define I915_EXEC_PERF_CONFIG   (1<<20)
                                   ^

-:372: CHECK:LINE_SPACING: Please don't use multiple blank lines
#372: FILE: include/uapi/drm/i915_drm.h:1146:
+
+

-:373: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#373: FILE: include/uapi/drm/i915_drm.h:1147:
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_PERF_CONFIG<<1))
                                                           ^

total: 0 errors, 0 warnings, 6 checks, 305 lines checked
bc43d752f0ac drm/i915: add support for perf configuration queries

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* ✗ Fi.CI.SPARSE: warning for drm/i915: Vulkan performance query support
  2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (5 preceding siblings ...)
  2019-05-21 16:37 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Vulkan performance query support Patchwork
@ 2019-05-21 16:40 ` Patchwork
  2019-05-21 17:26 ` ✗ Fi.CI.BAT: failure " Patchwork
  7 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2019-05-21 16:40 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Vulkan performance query support
URL   : https://patchwork.freedesktop.org/series/60916/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915/perf: introduce a versioning of the i915-perf uapi
Okay!

Commit: drm/i915/perf: allow holding preemption on filtered ctx
Okay!

Commit: drm/i915/perf: allow for CS OA configs to be created lazily
+drivers/gpu/drm/i915/i915_perf.c:393:37: warning: expression using sizeof(void)

Commit: drm/i915: add a new perf configuration execbuf parameter
Okay!

Commit: drm/i915: add support for perf configuration queries
Okay!

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/5] drm/i915/perf: allow for CS OA configs to be created lazily
  2019-05-21 14:08 ` [PATCH 3/5] drm/i915/perf: allow for CS OA configs to be created lazily Lionel Landwerlin
@ 2019-05-21 16:43   ` Chris Wilson
  2019-05-21 16:55     ` Lionel Landwerlin
  0 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2019-05-21 16:43 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-21 15:08:53)
> Here we introduce a mechanism by which the execbuf part of the i915
> driver will be able to request that a batch buffer containing the
> programming for a particular OA config be created.
> 
> We'll execute these OA configuration buffers right before executing a
> set of userspace commands so that a particular user batchbuffer be
> executed with a given OA configuration.
> 
> This mechanism essentially allows the userspace driver to go through
> several OA configuration without having to open/close the i915/perf
> stream.
> 
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   1 +
>  drivers/gpu/drm/i915/i915_drv.h              |  22 ++-
>  drivers/gpu/drm/i915/i915_perf.c             | 187 ++++++++++++++++---
>  3 files changed, 178 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index a34ece53a771..bbcb80cf2a85 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -126,6 +126,7 @@
>   */
>  #define MI_LOAD_REGISTER_IMM(x)        MI_INSTR(0x22, 2*(x)-1)
>  #define   MI_LRI_FORCE_POSTED          (1<<12)
> +#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
>  #define MI_STORE_REGISTER_MEM        MI_INSTR(0x24, 1)
>  #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
>  #define   MI_SRM_LRM_GLOBAL_GTT                (1<<22)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 1ad3818d2676..abd564bfa03b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1274,6 +1274,10 @@ struct i915_oa_config {
>         struct attribute *attrs[2];
>         struct device_attribute sysfs_metric_id;
>  
> +       struct drm_i915_gem_object *obj;
> +
> +       struct list_head vma_link;
> +
>         atomic_t ref_count;
>  };
>  
> @@ -1856,11 +1860,21 @@ struct drm_i915_private {
>                 struct mutex metrics_lock;
>  
>                 /*
> -                * List of dynamic configurations, you need to hold
> -                * dev_priv->perf.metrics_lock to access it.
> +                * List of dynamic configurations (struct i915_oa_config), you
> +                * need to hold dev_priv->perf.metrics_lock to access it.
>                  */
>                 struct idr metrics_idr;
>  
> +               /*
> +                * List of dynamic configurations (struct i915_oa_config)
> +                * which have an allocated buffer in GGTT for reconfiguration,
> +                * you need to hold dev_priv->perf.metrics_lock to access it.
> +                * Elements are added to the list lazilly on execbuf (when a
> +                * particular configuration is requested). The list is freed
> +                * upon closing the perf stream.
> +                */
> +               struct list_head metrics_buffers;
> +
>                 /*
>                  * Lock associated with anything below within this structure
>                  * except exclusive_stream.
> @@ -3136,6 +3150,10 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
>  void i915_oa_init_reg_state(struct intel_engine_cs *engine,
>                             struct intel_context *ce,
>                             u32 *reg_state);
> +int i915_perf_get_oa_config(struct drm_i915_private *i915,
> +                           int metrics_set,
> +                           struct i915_oa_config **out_config,
> +                           struct drm_i915_gem_object **out_obj);
>  
>  /* i915_gem_evict.c */
>  int __must_check i915_gem_evict_something(struct i915_address_space *vm,
> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
> index 8c7fa7f7014b..7e0ebd4bc8f2 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -365,9 +365,16 @@ struct perf_open_properties {
>         int oa_period_exponent;
>  };
>  
> -static void free_oa_config(struct drm_i915_private *dev_priv,
> -                          struct i915_oa_config *oa_config)
> +static void put_oa_config(struct i915_oa_config *oa_config)
>  {
> +       if (!atomic_dec_and_test(&oa_config->ref_count))
> +               return;
> +
> +       if (oa_config->obj) {
> +               list_del(&oa_config->vma_link);
> +               i915_gem_object_put(oa_config->obj);
> +       }
> +
>         if (!PTR_ERR(oa_config->flex_regs))
>                 kfree(oa_config->flex_regs);
>         if (!PTR_ERR(oa_config->b_counter_regs))
> @@ -377,38 +384,142 @@ static void free_oa_config(struct drm_i915_private *dev_priv,
>         kfree(oa_config);
>  }
>  
> -static void put_oa_config(struct drm_i915_private *dev_priv,
> -                         struct i915_oa_config *oa_config)
> +static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 n_regs)
>  {
> -       if (!atomic_dec_and_test(&oa_config->ref_count))
> -               return;
> +       u32 i;
> +
> +       for (i = 0; i < n_regs; i++) {
> +               if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
> +                       u32 n_lri = min(n_regs - i,
> +                                       (u32) MI_LOAD_REGISTER_IMM_MAX_REGS);
>  
> -       free_oa_config(dev_priv, oa_config);
> +                       *cs++ = MI_LOAD_REGISTER_IMM(n_lri);
> +               }
> +               *cs++ = i915_mmio_reg_offset(reg_data[i].addr);
> +               *cs++ = reg_data[i].value;
> +       }
> +
> +       return cs;
>  }
>  
> -static int get_oa_config(struct drm_i915_private *dev_priv,
> -                        int metrics_set,
> -                        struct i915_oa_config **out_config)
> +static int alloc_oa_config_buffer(struct drm_i915_private *i915,
> +                                 struct i915_oa_config *oa_config)
>  {
> +       struct drm_i915_gem_object *bo;
> +       size_t config_length = 0;
>         int ret;
> +       u32 *cs;
>  
> -       if (metrics_set == 1) {
> -               *out_config = &dev_priv->perf.oa.test_config;
> -               atomic_inc(&dev_priv->perf.oa.test_config.ref_count);
> -               return 0;
> +       if (oa_config->mux_regs_len > 0) {
> +               config_length += DIV_ROUND_UP(oa_config->mux_regs_len,
> +                                             MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
> +               config_length += oa_config->mux_regs_len * 8;
> +       }
> +       if (oa_config->b_counter_regs_len > 0) {
> +               config_length += DIV_ROUND_UP(oa_config->b_counter_regs_len,
> +                                             MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
> +               config_length += oa_config->b_counter_regs_len * 8;
>         }
> +       if (oa_config->flex_regs_len > 0) {
> +               config_length += DIV_ROUND_UP(oa_config->flex_regs_len,
> +                                             MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
> +               config_length += oa_config->flex_regs_len * 8;
> +       }
> +       config_length += 4; /* MI_BATCH_BUFFER_END */
> +       config_length = ALIGN(config_length, I915_GTT_PAGE_SIZE);
>  
> -       ret = mutex_lock_interruptible(&dev_priv->perf.metrics_lock);
> +       ret = i915_mutex_lock_interruptible(&i915->drm);

struct_mutex not required for creating/populating an object.

>         if (ret)
>                 return ret;
>  
> -       *out_config = idr_find(&dev_priv->perf.metrics_idr, metrics_set);
> -       if (!*out_config)
> -               ret = -EINVAL;
> -       else
> -               atomic_inc(&(*out_config)->ref_count);
> +       bo = i915_gem_object_create(i915, config_length);
> +       if (IS_ERR(bo)) {
> +               ret = PTR_ERR(bo);
> +               goto unlock;
> +       }
>  
> -       mutex_unlock(&dev_priv->perf.metrics_lock);
> +       cs = i915_gem_object_pin_map(bo, I915_MAP_WB);
> +       if (IS_ERR(cs)) {
> +               ret = PTR_ERR(cs);
> +               goto err_unref;
> +       }
> +
> +       memset(cs, 0, config_length);

Already zeroed, and write_cs_mi_lri() leaves no holes.

> +       cs = write_cs_mi_lri(cs, oa_config->mux_regs, oa_config->mux_regs_len);
> +       cs = write_cs_mi_lri(cs, oa_config->b_counter_regs, oa_config->b_counter_regs_len);
> +       cs = write_cs_mi_lri(cs, oa_config->flex_regs, oa_config->flex_regs_len);
> +
> +       *cs++ = MI_BATCH_BUFFER_END;
> +

i915_gem_object_flush_map(bo);

> +       i915_gem_object_unpin_map(bo);
> +
> +       oa_config->obj = bo;
> +
> +       goto unlock;
> +
> +err_unref:
> +       oa_config->obj = NULL;

was never set.

> +       i915_gem_object_put(bo);

You could avoid the unconditional jump by just taking the ref in
oa_config->obj = i915_gem_object_get(bo);

> +unlock:
> +       mutex_unlock(&i915->drm.struct_mutex);
> +       return ret;
> +}
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-21 16:36   ` Chris Wilson
@ 2019-05-21 16:50     ` Lionel Landwerlin
  2019-05-21 17:17       ` Chris Wilson
  2019-05-24  9:28     ` Lionel Landwerlin
  1 sibling, 1 reply; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 16:50 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 21/05/2019 17:36, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-21 15:08:52)
>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> index f263a8374273..2ad95977f7a8 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
>>          if (IS_ERR(cs))
>>                  return PTR_ERR(cs);
>>   
>> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
>> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
> My prediction is that this will result in this context being reset due 
> to preemption timeouts and the context under profile being banned. 
> Note that preemption timeouts will be the primary means for hang 
> detection for endless batches. -Chris

Thanks,

One question : how is that dealt with with compute workloads at the moment?
I though those where still not fully preemptable.

I need to rework this with a more "software" approach holding on preemption.
Adding a condition in intel_lrc.c need_preempt() looks like the right 
direction?

Cheers,

-Lionel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/5] drm/i915/perf: allow for CS OA configs to be created lazily
  2019-05-21 16:43   ` Chris Wilson
@ 2019-05-21 16:55     ` Lionel Landwerlin
  0 siblings, 0 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 16:55 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 21/05/2019 17:43, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-21 15:08:53)
>> Here we introduce a mechanism by which the execbuf part of the i915
>> driver will be able to request that a batch buffer containing the
>> programming for a particular OA config be created.
>>
>> We'll execute these OA configuration buffers right before executing a
>> set of userspace commands so that a particular user batchbuffer be
>> executed with a given OA configuration.
>>
>> This mechanism essentially allows the userspace driver to go through
>> several OA configuration without having to open/close the i915/perf
>> stream.
>>
>> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   1 +
>>   drivers/gpu/drm/i915/i915_drv.h              |  22 ++-
>>   drivers/gpu/drm/i915/i915_perf.c             | 187 ++++++++++++++++---
>>   3 files changed, 178 insertions(+), 32 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> index a34ece53a771..bbcb80cf2a85 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
>> @@ -126,6 +126,7 @@
>>    */
>>   #define MI_LOAD_REGISTER_IMM(x)        MI_INSTR(0x22, 2*(x)-1)
>>   #define   MI_LRI_FORCE_POSTED          (1<<12)
>> +#define MI_LOAD_REGISTER_IMM_MAX_REGS (126)
>>   #define MI_STORE_REGISTER_MEM        MI_INSTR(0x24, 1)
>>   #define MI_STORE_REGISTER_MEM_GEN8   MI_INSTR(0x24, 2)
>>   #define   MI_SRM_LRM_GLOBAL_GTT                (1<<22)
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 1ad3818d2676..abd564bfa03b 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1274,6 +1274,10 @@ struct i915_oa_config {
>>          struct attribute *attrs[2];
>>          struct device_attribute sysfs_metric_id;
>>   
>> +       struct drm_i915_gem_object *obj;
>> +
>> +       struct list_head vma_link;
>> +
>>          atomic_t ref_count;
>>   };
>>   
>> @@ -1856,11 +1860,21 @@ struct drm_i915_private {
>>                  struct mutex metrics_lock;
>>   
>>                  /*
>> -                * List of dynamic configurations, you need to hold
>> -                * dev_priv->perf.metrics_lock to access it.
>> +                * List of dynamic configurations (struct i915_oa_config), you
>> +                * need to hold dev_priv->perf.metrics_lock to access it.
>>                   */
>>                  struct idr metrics_idr;
>>   
>> +               /*
>> +                * List of dynamic configurations (struct i915_oa_config)
>> +                * which have an allocated buffer in GGTT for reconfiguration,
>> +                * you need to hold dev_priv->perf.metrics_lock to access it.
>> +                * Elements are added to the list lazilly on execbuf (when a
>> +                * particular configuration is requested). The list is freed
>> +                * upon closing the perf stream.
>> +                */
>> +               struct list_head metrics_buffers;
>> +
>>                  /*
>>                   * Lock associated with anything below within this structure
>>                   * except exclusive_stream.
>> @@ -3136,6 +3150,10 @@ int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
>>   void i915_oa_init_reg_state(struct intel_engine_cs *engine,
>>                              struct intel_context *ce,
>>                              u32 *reg_state);
>> +int i915_perf_get_oa_config(struct drm_i915_private *i915,
>> +                           int metrics_set,
>> +                           struct i915_oa_config **out_config,
>> +                           struct drm_i915_gem_object **out_obj);
>>   
>>   /* i915_gem_evict.c */
>>   int __must_check i915_gem_evict_something(struct i915_address_space *vm,
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index 8c7fa7f7014b..7e0ebd4bc8f2 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -365,9 +365,16 @@ struct perf_open_properties {
>>          int oa_period_exponent;
>>   };
>>   
>> -static void free_oa_config(struct drm_i915_private *dev_priv,
>> -                          struct i915_oa_config *oa_config)
>> +static void put_oa_config(struct i915_oa_config *oa_config)
>>   {
>> +       if (!atomic_dec_and_test(&oa_config->ref_count))
>> +               return;
>> +
>> +       if (oa_config->obj) {
>> +               list_del(&oa_config->vma_link);
>> +               i915_gem_object_put(oa_config->obj);
>> +       }
>> +
>>          if (!PTR_ERR(oa_config->flex_regs))
>>                  kfree(oa_config->flex_regs);
>>          if (!PTR_ERR(oa_config->b_counter_regs))
>> @@ -377,38 +384,142 @@ static void free_oa_config(struct drm_i915_private *dev_priv,
>>          kfree(oa_config);
>>   }
>>   
>> -static void put_oa_config(struct drm_i915_private *dev_priv,
>> -                         struct i915_oa_config *oa_config)
>> +static u32 *write_cs_mi_lri(u32 *cs, const struct i915_oa_reg *reg_data, u32 n_regs)
>>   {
>> -       if (!atomic_dec_and_test(&oa_config->ref_count))
>> -               return;
>> +       u32 i;
>> +
>> +       for (i = 0; i < n_regs; i++) {
>> +               if ((i % MI_LOAD_REGISTER_IMM_MAX_REGS) == 0) {
>> +                       u32 n_lri = min(n_regs - i,
>> +                                       (u32) MI_LOAD_REGISTER_IMM_MAX_REGS);
>>   
>> -       free_oa_config(dev_priv, oa_config);
>> +                       *cs++ = MI_LOAD_REGISTER_IMM(n_lri);
>> +               }
>> +               *cs++ = i915_mmio_reg_offset(reg_data[i].addr);
>> +               *cs++ = reg_data[i].value;
>> +       }
>> +
>> +       return cs;
>>   }
>>   
>> -static int get_oa_config(struct drm_i915_private *dev_priv,
>> -                        int metrics_set,
>> -                        struct i915_oa_config **out_config)
>> +static int alloc_oa_config_buffer(struct drm_i915_private *i915,
>> +                                 struct i915_oa_config *oa_config)
>>   {
>> +       struct drm_i915_gem_object *bo;
>> +       size_t config_length = 0;
>>          int ret;
>> +       u32 *cs;
>>   
>> -       if (metrics_set == 1) {
>> -               *out_config = &dev_priv->perf.oa.test_config;
>> -               atomic_inc(&dev_priv->perf.oa.test_config.ref_count);
>> -               return 0;
>> +       if (oa_config->mux_regs_len > 0) {
>> +               config_length += DIV_ROUND_UP(oa_config->mux_regs_len,
>> +                                             MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
>> +               config_length += oa_config->mux_regs_len * 8;
>> +       }
>> +       if (oa_config->b_counter_regs_len > 0) {
>> +               config_length += DIV_ROUND_UP(oa_config->b_counter_regs_len,
>> +                                             MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
>> +               config_length += oa_config->b_counter_regs_len * 8;
>>          }
>> +       if (oa_config->flex_regs_len > 0) {
>> +               config_length += DIV_ROUND_UP(oa_config->flex_regs_len,
>> +                                             MI_LOAD_REGISTER_IMM_MAX_REGS) * 4;
>> +               config_length += oa_config->flex_regs_len * 8;
>> +       }
>> +       config_length += 4; /* MI_BATCH_BUFFER_END */
>> +       config_length = ALIGN(config_length, I915_GTT_PAGE_SIZE);
>>   
>> -       ret = mutex_lock_interruptible(&dev_priv->perf.metrics_lock);
>> +       ret = i915_mutex_lock_interruptible(&i915->drm);
> struct_mutex not required for creating/populating an object.


Oh nice! I'll clean this up.

Thanks!


>
>>          if (ret)
>>                  return ret;
>>   
>> -       *out_config = idr_find(&dev_priv->perf.metrics_idr, metrics_set);
>> -       if (!*out_config)
>> -               ret = -EINVAL;
>> -       else
>> -               atomic_inc(&(*out_config)->ref_count);
>> +       bo = i915_gem_object_create(i915, config_length);
>> +       if (IS_ERR(bo)) {
>> +               ret = PTR_ERR(bo);
>> +               goto unlock;
>> +       }
>>   
>> -       mutex_unlock(&dev_priv->perf.metrics_lock);
>> +       cs = i915_gem_object_pin_map(bo, I915_MAP_WB);
>> +       if (IS_ERR(cs)) {
>> +               ret = PTR_ERR(cs);
>> +               goto err_unref;
>> +       }
>> +
>> +       memset(cs, 0, config_length);
> Already zeroed, and write_cs_mi_lri() leaves no holes.
>
>> +       cs = write_cs_mi_lri(cs, oa_config->mux_regs, oa_config->mux_regs_len);
>> +       cs = write_cs_mi_lri(cs, oa_config->b_counter_regs, oa_config->b_counter_regs_len);
>> +       cs = write_cs_mi_lri(cs, oa_config->flex_regs, oa_config->flex_regs_len);
>> +
>> +       *cs++ = MI_BATCH_BUFFER_END;
>> +
> i915_gem_object_flush_map(bo);
>
>> +       i915_gem_object_unpin_map(bo);
>> +
>> +       oa_config->obj = bo;
>> +
>> +       goto unlock;
>> +
>> +err_unref:
>> +       oa_config->obj = NULL;
> was never set.
>
>> +       i915_gem_object_put(bo);
> You could avoid the unconditional jump by just taking the ref in
> oa_config->obj = i915_gem_object_get(bo);
>
>> +unlock:
>> +       mutex_unlock(&i915->drm.struct_mutex);
>> +       return ret;
>> +}


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-21 14:08 ` [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
@ 2019-05-21 17:07   ` Chris Wilson
  2019-05-21 17:19     ` Lionel Landwerlin
  2019-05-28 10:52   ` Chris Wilson
  1 sibling, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2019-05-21 17:07 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-21 15:08:54)
> +       if (eb->oa_config &&
> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {

But the oa_config is not ordered with respect to requests, and the
registers changed here are not context saved and so may be changed by a
third party before execution. The first party must presumably dropped
the perf_fd before then, so maybe you don't care? Hmm, doesn't even take
a third party as the perf_fd owner may change their mind for different
contexts and be surprised when the wrong set is used.

> +               struct i915_vma *oa_vma;
> +
> +               oa_vma = i915_vma_instance(eb->oa_bo,
> +                                             &eb->engine->i915->ggtt.vm, NULL);
> +               if (unlikely(IS_ERR(oa_vma))) {
> +                       err = PTR_ERR(oa_vma);
> +                       return err;
> +               }
> +
> +               err = i915_vma_pin(oa_vma, 0, 0, PIN_GLOBAL);
> +               if (err)
> +                       return err;
> +
> +               err = eb->engine->emit_bb_start(eb->request,
> +                                               oa_vma->node.start,
> +                                               0, I915_DISPATCH_SECURE);
> +               if (err) {
> +                       i915_vma_unpin(oa_vma);
> +                       return err;
> +               }
> +
> +               err = i915_vma_move_to_active(oa_vma, eb->request, 0);

Move to active first, so that the vma is not in use if the move fails.

> +               if (err) {
> +                       i915_vma_unpin(oa_vma);
> +                       return err;
> +               }
> +
> +
> +               i915_vma_unpin(oa_vma);
> +
> +
> +               swap(eb->oa_config, eb->i915->perf.oa.exclusive_stream->oa_config);
> +       }
> +
>         err = eb->engine->emit_bb_start(eb->request,
>                                         eb->batch->node.start +
>                                         eb->batch_start_offset,
> @@ -2341,6 +2410,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>         eb.buffer_count = args->buffer_count;
>         eb.batch_start_offset = args->batch_start_offset;
>         eb.batch_len = args->batch_len;
> +       eb.oa_config = NULL;
>  
>         eb.batch_flags = 0;
>         if (args->flags & I915_EXEC_SECURE) {
> @@ -2385,17 +2455,29 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>          */
>         intel_gt_pm_get(eb.i915);
>  
> -       err = i915_mutex_lock_interruptible(dev);
> -       if (err)
> -               goto err_rpm;
> -
>         err = eb_select_engine(&eb, file, args);

Lost the lock.

>         if (unlikely(err))
> -               goto err_unlock;
> +               goto err_rpm;
> +
> +       if (args->flags & I915_EXEC_PERF_CONFIG) {
> +               if (!intel_engine_has_oa(eb.engine)) {
> +                       err = -ENODEV;
> +                       goto err_engine;
> +               }
> +
> +               err = get_execbuf_oa_config(eb.i915, args->DR1, args->DR4,
> +                                           &eb.oa_config, &eb.oa_bo);
> +               if (err)
> +                       goto err_engine;
> +       }
> +
> +       err = i915_mutex_lock_interruptible(dev);
> +       if (err)
> +               goto err_oa;
>  
>         err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
>         if (unlikely(err))
> -               goto err_engine;
> +               goto err_unlock;
>  
>         err = eb_relocate(&eb);
>         if (err) {
> @@ -2541,10 +2623,15 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  err_vma:
>         if (eb.exec)
>                 eb_release_vmas(&eb);
> -err_engine:
> -       eb_unpin_context(&eb);
>  err_unlock:
>         mutex_unlock(&dev->struct_mutex);
> +err_oa:
> +       if (eb.oa_config) {
> +               i915_gem_object_put(eb.oa_bo);
> +               i915_oa_config_put(eb.oa_config);
> +       }
> +err_engine:
> +       eb_unpin_context(&eb);

Lost the lock.

>  err_rpm:
>         intel_gt_pm_put(eb.i915);
>         i915_gem_context_put(eb.gem_context);
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-21 16:50     ` Lionel Landwerlin
@ 2019-05-21 17:17       ` Chris Wilson
  2019-05-21 17:52         ` Lionel Landwerlin
  0 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2019-05-21 17:17 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-21 17:50:30)
> On 21/05/2019 17:36, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2019-05-21 15:08:52)
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> index f263a8374273..2ad95977f7a8 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
> >>          if (IS_ERR(cs))
> >>                  return PTR_ERR(cs);
> >>   
> >> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> >> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
> > My prediction is that this will result in this context being reset due 
> > to preemption timeouts and the context under profile being banned. 
> > Note that preemption timeouts will be the primary means for hang 
> > detection for endless batches. -Chris
> 
> Thanks,
> 
> One question : how is that dealt with with compute workloads at the moment?
> I though those where still not fully preemptable.

Not blocking is the condition under which they get to use endless...
compute jobs are preemptible from gen9 afaik, gen8 was problematic and so
disabled.
 
> I need to rework this with a more "software" approach holding on preemption.
> Adding a condition in intel_lrc.c need_preempt() looks like the right 
> direction?

Even less if that is our means of hangcheck.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-21 17:07   ` Chris Wilson
@ 2019-05-21 17:19     ` Lionel Landwerlin
  2019-05-21 17:48       ` Chris Wilson
  0 siblings, 1 reply; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 17:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 21/05/2019 18:07, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-21 15:08:54)
>> +       if (eb->oa_config &&
>> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {
> But the oa_config is not ordered with respect to requests, and the
> registers changed here are not context saved and so may be changed by a
> third party before execution. The first party must presumably dropped
> the perf_fd before then, so maybe you don't care? Hmm, doesn't even take
> a third party as the perf_fd owner may change their mind for different
> contexts and be surprised when the wrong set is used.


The OA config batch should be ordered with regard to the MI_BBS we're 
doing just below right?

It's written before in the ring buffer.


That essentially all we need so that as the perf queries run in the 
batch supplied by the application runs with the configuration needed.

If the application shares the perf FD and someone else runs another 
configuration, it's the application fault.

It needs to hold the perf FD for as long as it's doing perf queries and 
not allow anybody else to interact with the OA configuration.


This mechanism is unfortunately what we have resolve to because we don't 
have per context performance counters.

The alternative is post processing the OA buffer (which we do in GL) 
from the CPU which is not really compatible with Vulkan queries.


-Lionel


>
>> +               struct i915_vma *oa_vma;
>> +
>> +               oa_vma = i915_vma_instance(eb->oa_bo,
>> +                                             &eb->engine->i915->ggtt.vm, NULL);
>> +               if (unlikely(IS_ERR(oa_vma))) {
>> +                       err = PTR_ERR(oa_vma);
>> +                       return err;
>> +               }
>> +
>> +               err = i915_vma_pin(oa_vma, 0, 0, PIN_GLOBAL);
>> +               if (err)
>> +                       return err;
>> +
>> +               err = eb->engine->emit_bb_start(eb->request,
>> +                                               oa_vma->node.start,
>> +                                               0, I915_DISPATCH_SECURE);
>> +               if (err) {
>> +                       i915_vma_unpin(oa_vma);
>> +                       return err;
>> +               }
>> +
>> +               err = i915_vma_move_to_active(oa_vma, eb->request, 0);
> Move to active first, so that the vma is not in use if the move fails.
>
>> +               if (err) {
>> +                       i915_vma_unpin(oa_vma);
>> +                       return err;
>> +               }
>> +
>> +
>> +               i915_vma_unpin(oa_vma);
>> +
>> +
>> +               swap(eb->oa_config, eb->i915->perf.oa.exclusive_stream->oa_config);
>> +       }
>> +
>>          err = eb->engine->emit_bb_start(eb->request,
>>                                          eb->batch->node.start +
>>                                          eb->batch_start_offset,
>> @@ -2341,6 +2410,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>>          eb.buffer_count = args->buffer_count;
>>          eb.batch_start_offset = args->batch_start_offset;
>>          eb.batch_len = args->batch_len;
>> +       eb.oa_config = NULL;
>>   
>>          eb.batch_flags = 0;
>>          if (args->flags & I915_EXEC_SECURE) {
>> @@ -2385,17 +2455,29 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>>           */
>>          intel_gt_pm_get(eb.i915);
>>   
>> -       err = i915_mutex_lock_interruptible(dev);
>> -       if (err)
>> -               goto err_rpm;
>> -
>>          err = eb_select_engine(&eb, file, args);
> Lost the lock.


Whoops... I'll split the engine_has_oa() check away.

Thanks,


-Lionel


>
>>          if (unlikely(err))
>> -               goto err_unlock;
>> +               goto err_rpm;
>> +
>> +       if (args->flags & I915_EXEC_PERF_CONFIG) {
>> +               if (!intel_engine_has_oa(eb.engine)) {
>> +                       err = -ENODEV;
>> +                       goto err_engine;
>> +               }
>> +
>> +               err = get_execbuf_oa_config(eb.i915, args->DR1, args->DR4,
>> +                                           &eb.oa_config, &eb.oa_bo);
>> +               if (err)
>> +                       goto err_engine;
>> +       }
>> +
>> +       err = i915_mutex_lock_interruptible(dev);
>> +       if (err)
>> +               goto err_oa;
>>   
>>          err = eb_wait_for_ring(&eb); /* may temporarily drop struct_mutex */
>>          if (unlikely(err))
>> -               goto err_engine;
>> +               goto err_unlock;
>>   
>>          err = eb_relocate(&eb);
>>          if (err) {
>> @@ -2541,10 +2623,15 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>>   err_vma:
>>          if (eb.exec)
>>                  eb_release_vmas(&eb);
>> -err_engine:
>> -       eb_unpin_context(&eb);
>>   err_unlock:
>>          mutex_unlock(&dev->struct_mutex);
>> +err_oa:
>> +       if (eb.oa_config) {
>> +               i915_gem_object_put(eb.oa_bo);
>> +               i915_oa_config_put(eb.oa_config);
>> +       }
>> +err_engine:
>> +       eb_unpin_context(&eb);
> Lost the lock.
>
>>   err_rpm:
>>          intel_gt_pm_put(eb.i915);
>>          i915_gem_context_put(eb.gem_context);


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Vulkan performance query support
  2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
                   ` (6 preceding siblings ...)
  2019-05-21 16:40 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-05-21 17:26 ` Patchwork
  7 siblings, 0 replies; 33+ messages in thread
From: Patchwork @ 2019-05-21 17:26 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Vulkan performance query support
URL   : https://patchwork.freedesktop.org/series/60916/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6114 -> Patchwork_13060
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_13060 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_13060, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_13060:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_busy@basic-busy-default:
    - fi-apl-guc:         [PASS][1] -> [DMESG-WARN][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-apl-guc/igt@gem_busy@basic-busy-default.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-apl-guc/igt@gem_busy@basic-busy-default.html
    - fi-glk-dsi:         [PASS][3] -> [DMESG-WARN][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-glk-dsi/igt@gem_busy@basic-busy-default.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-glk-dsi/igt@gem_busy@basic-busy-default.html
    - fi-ivb-3770:        [PASS][5] -> [DMESG-WARN][6]
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-ivb-3770/igt@gem_busy@basic-busy-default.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-ivb-3770/igt@gem_busy@basic-busy-default.html
    - fi-skl-6600u:       [PASS][7] -> [DMESG-WARN][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-skl-6600u/igt@gem_busy@basic-busy-default.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-skl-6600u/igt@gem_busy@basic-busy-default.html
    - fi-kbl-guc:         [PASS][9] -> [DMESG-WARN][10]
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-kbl-guc/igt@gem_busy@basic-busy-default.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-guc/igt@gem_busy@basic-busy-default.html
    - fi-bsw-kefka:       [PASS][11] -> [DMESG-WARN][12]
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-bsw-kefka/igt@gem_busy@basic-busy-default.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bsw-kefka/igt@gem_busy@basic-busy-default.html
    - fi-kbl-x1275:       [PASS][13] -> [DMESG-WARN][14]
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-kbl-x1275/igt@gem_busy@basic-busy-default.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-x1275/igt@gem_busy@basic-busy-default.html
    - fi-kbl-7500u:       [PASS][15] -> [DMESG-WARN][16]
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-kbl-7500u/igt@gem_busy@basic-busy-default.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-7500u/igt@gem_busy@basic-busy-default.html
    - fi-kbl-8809g:       [PASS][17] -> [DMESG-WARN][18]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-kbl-8809g/igt@gem_busy@basic-busy-default.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-8809g/igt@gem_busy@basic-busy-default.html
    - fi-whl-u:           [PASS][19] -> [DMESG-WARN][20]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-whl-u/igt@gem_busy@basic-busy-default.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-whl-u/igt@gem_busy@basic-busy-default.html
    - fi-skl-gvtdvm:      [PASS][21] -> [DMESG-WARN][22]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-skl-gvtdvm/igt@gem_busy@basic-busy-default.html
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-skl-gvtdvm/igt@gem_busy@basic-busy-default.html
    - fi-ilk-650:         [PASS][23] -> [DMESG-WARN][24]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-ilk-650/igt@gem_busy@basic-busy-default.html
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-ilk-650/igt@gem_busy@basic-busy-default.html
    - fi-bdw-gvtdvm:      [PASS][25] -> [DMESG-WARN][26]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-bdw-gvtdvm/igt@gem_busy@basic-busy-default.html
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bdw-gvtdvm/igt@gem_busy@basic-busy-default.html
    - fi-bxt-j4205:       [PASS][27] -> [DMESG-WARN][28]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-bxt-j4205/igt@gem_busy@basic-busy-default.html
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bxt-j4205/igt@gem_busy@basic-busy-default.html
    - fi-cfl-guc:         [PASS][29] -> [DMESG-WARN][30]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-cfl-guc/igt@gem_busy@basic-busy-default.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cfl-guc/igt@gem_busy@basic-busy-default.html
    - fi-skl-iommu:       [PASS][31] -> [DMESG-WARN][32]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-skl-iommu/igt@gem_busy@basic-busy-default.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-skl-iommu/igt@gem_busy@basic-busy-default.html
    - fi-skl-guc:         [PASS][33] -> [DMESG-WARN][34]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-skl-guc/igt@gem_busy@basic-busy-default.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-skl-guc/igt@gem_busy@basic-busy-default.html
    - fi-kbl-7567u:       [PASS][35] -> [DMESG-WARN][36]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-kbl-7567u/igt@gem_busy@basic-busy-default.html
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-7567u/igt@gem_busy@basic-busy-default.html
    - fi-elk-e7500:       [PASS][37] -> [DMESG-WARN][38]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-elk-e7500/igt@gem_busy@basic-busy-default.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-elk-e7500/igt@gem_busy@basic-busy-default.html
    - fi-snb-2520m:       [PASS][39] -> [DMESG-WARN][40]
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-snb-2520m/igt@gem_busy@basic-busy-default.html
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-snb-2520m/igt@gem_busy@basic-busy-default.html
    - fi-hsw-peppy:       [PASS][41] -> [DMESG-WARN][42]
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-hsw-peppy/igt@gem_busy@basic-busy-default.html
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-hsw-peppy/igt@gem_busy@basic-busy-default.html
    - fi-cfl-8700k:       [PASS][43] -> [DMESG-WARN][44]
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-cfl-8700k/igt@gem_busy@basic-busy-default.html
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cfl-8700k/igt@gem_busy@basic-busy-default.html
    - fi-hsw-4770r:       [PASS][45] -> [DMESG-WARN][46]
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-hsw-4770r/igt@gem_busy@basic-busy-default.html
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-hsw-4770r/igt@gem_busy@basic-busy-default.html
    - fi-bdw-5557u:       [PASS][47] -> [DMESG-WARN][48]
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-bdw-5557u/igt@gem_busy@basic-busy-default.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bdw-5557u/igt@gem_busy@basic-busy-default.html
    - fi-kbl-r:           [PASS][49] -> [DMESG-WARN][50]
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-kbl-r/igt@gem_busy@basic-busy-default.html
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-r/igt@gem_busy@basic-busy-default.html
    - fi-skl-6770hq:      [PASS][51] -> [DMESG-WARN][52]
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-skl-6770hq/igt@gem_busy@basic-busy-default.html
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-skl-6770hq/igt@gem_busy@basic-busy-default.html
    - fi-byt-n2820:       [PASS][53] -> [DMESG-WARN][54]
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-byt-n2820/igt@gem_busy@basic-busy-default.html
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-byt-n2820/igt@gem_busy@basic-busy-default.html
    - fi-skl-lmem:        [PASS][55] -> [DMESG-WARN][56]
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-skl-lmem/igt@gem_busy@basic-busy-default.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-skl-lmem/igt@gem_busy@basic-busy-default.html
    - fi-cfl-8109u:       [PASS][57] -> [DMESG-WARN][58]
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-cfl-8109u/igt@gem_busy@basic-busy-default.html
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cfl-8109u/igt@gem_busy@basic-busy-default.html
    - fi-byt-j1900:       [PASS][59] -> [DMESG-WARN][60]
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-byt-j1900/igt@gem_busy@basic-busy-default.html
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-byt-j1900/igt@gem_busy@basic-busy-default.html
    - fi-bxt-dsi:         [PASS][61] -> [DMESG-WARN][62]
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-bxt-dsi/igt@gem_busy@basic-busy-default.html
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bxt-dsi/igt@gem_busy@basic-busy-default.html
    - fi-skl-6260u:       [PASS][63] -> [DMESG-WARN][64]
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-skl-6260u/igt@gem_busy@basic-busy-default.html
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-skl-6260u/igt@gem_busy@basic-busy-default.html
    - fi-snb-2600:        [PASS][65] -> [DMESG-WARN][66]
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-snb-2600/igt@gem_busy@basic-busy-default.html
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-snb-2600/igt@gem_busy@basic-busy-default.html
    - fi-bsw-n3050:       [PASS][67] -> [DMESG-WARN][68]
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-bsw-n3050/igt@gem_busy@basic-busy-default.html
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bsw-n3050/igt@gem_busy@basic-busy-default.html
    - fi-skl-6700k2:      [PASS][69] -> [DMESG-WARN][70]
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-skl-6700k2/igt@gem_busy@basic-busy-default.html
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-skl-6700k2/igt@gem_busy@basic-busy-default.html
    - fi-hsw-4770:        [PASS][71] -> [DMESG-WARN][72]
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-hsw-4770/igt@gem_busy@basic-busy-default.html
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-hsw-4770/igt@gem_busy@basic-busy-default.html

  * igt@gem_close_race@basic-process:
    - fi-bwr-2160:        [PASS][73] -> [DMESG-WARN][74]
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-bwr-2160/igt@gem_close_race@basic-process.html
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bwr-2160/igt@gem_close_race@basic-process.html

  * igt@runner@aborted:
    - fi-ilk-650:         NOTRUN -> [FAIL][75]
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-ilk-650/igt@runner@aborted.html
    - fi-bdw-gvtdvm:      NOTRUN -> [FAIL][76]
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bdw-gvtdvm/igt@runner@aborted.html
    - fi-cfl-8109u:       NOTRUN -> [FAIL][77]
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cfl-8109u/igt@runner@aborted.html
    - fi-hsw-peppy:       NOTRUN -> [FAIL][78]
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-hsw-peppy/igt@runner@aborted.html
    - fi-snb-2520m:       NOTRUN -> [FAIL][79]
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-snb-2520m/igt@runner@aborted.html
    - fi-hsw-4770:        NOTRUN -> [FAIL][80]
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-hsw-4770/igt@runner@aborted.html
    - fi-kbl-7500u:       NOTRUN -> [FAIL][81]
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-7500u/igt@runner@aborted.html
    - fi-bxt-j4205:       NOTRUN -> [FAIL][82]
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bxt-j4205/igt@runner@aborted.html
    - fi-whl-u:           NOTRUN -> [FAIL][83]
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-whl-u/igt@runner@aborted.html
    - fi-ivb-3770:        NOTRUN -> [FAIL][84]
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-ivb-3770/igt@runner@aborted.html
    - fi-bxt-dsi:         NOTRUN -> [FAIL][85]
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bxt-dsi/igt@runner@aborted.html
    - fi-byt-j1900:       NOTRUN -> [FAIL][86]
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-byt-j1900/igt@runner@aborted.html
    - fi-cfl-guc:         NOTRUN -> [FAIL][87]
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cfl-guc/igt@runner@aborted.html
    - fi-kbl-7567u:       NOTRUN -> [FAIL][88]
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-7567u/igt@runner@aborted.html
    - fi-kbl-x1275:       NOTRUN -> [FAIL][89]
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-x1275/igt@runner@aborted.html
    - fi-cfl-8700k:       NOTRUN -> [FAIL][90]
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cfl-8700k/igt@runner@aborted.html
    - fi-hsw-4770r:       NOTRUN -> [FAIL][91]
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-hsw-4770r/igt@runner@aborted.html
    - fi-kbl-8809g:       NOTRUN -> [FAIL][92]
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-8809g/igt@runner@aborted.html
    - fi-kbl-r:           NOTRUN -> [FAIL][93]
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-r/igt@runner@aborted.html
    - fi-bdw-5557u:       NOTRUN -> [FAIL][94]
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-bdw-5557u/igt@runner@aborted.html
    - fi-byt-n2820:       NOTRUN -> [FAIL][95]
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-byt-n2820/igt@runner@aborted.html
    - fi-kbl-guc:         NOTRUN -> [FAIL][96]
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-kbl-guc/igt@runner@aborted.html
    - fi-snb-2600:        NOTRUN -> [FAIL][97]
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-snb-2600/igt@runner@aborted.html
    - fi-elk-e7500:       NOTRUN -> [FAIL][98]
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-elk-e7500/igt@runner@aborted.html

  
#### Warnings ####

  * igt@runner@aborted:
    - fi-apl-guc:         [FAIL][99] ([fdo#110622]) -> [FAIL][100]
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-apl-guc/igt@runner@aborted.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-apl-guc/igt@runner@aborted.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_busy@basic-busy-default:
    - {fi-icl-y}:         [PASS][101] -> [DMESG-WARN][102]
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-icl-y/igt@gem_busy@basic-busy-default.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-y/igt@gem_busy@basic-busy-default.html
    - {fi-icl-dsi}:       [PASS][103] -> [DMESG-WARN][104]
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-icl-dsi/igt@gem_busy@basic-busy-default.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-dsi/igt@gem_busy@basic-busy-default.html
    - {fi-cml-u}:         [PASS][105] -> [DMESG-WARN][106]
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-cml-u/igt@gem_busy@basic-busy-default.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cml-u/igt@gem_busy@basic-busy-default.html
    - {fi-icl-u3}:        [PASS][107] -> [DMESG-WARN][108]
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-icl-u3/igt@gem_busy@basic-busy-default.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-u3/igt@gem_busy@basic-busy-default.html
    - {fi-icl-u2}:        [PASS][109] -> [DMESG-WARN][110]
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-icl-u2/igt@gem_busy@basic-busy-default.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-u2/igt@gem_busy@basic-busy-default.html
    - {fi-cml-u2}:        [PASS][111] -> [DMESG-WARN][112]
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-cml-u2/igt@gem_busy@basic-busy-default.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cml-u2/igt@gem_busy@basic-busy-default.html

  * igt@runner@aborted:
    - {fi-icl-u2}:        NOTRUN -> [FAIL][113]
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-u2/igt@runner@aborted.html
    - {fi-cml-u2}:        NOTRUN -> [FAIL][114]
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cml-u2/igt@runner@aborted.html
    - {fi-icl-u3}:        NOTRUN -> [FAIL][115]
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-u3/igt@runner@aborted.html
    - {fi-cml-u}:         NOTRUN -> [FAIL][116]
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-cml-u/igt@runner@aborted.html
    - {fi-icl-y}:         NOTRUN -> [FAIL][117]
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-y/igt@runner@aborted.html
    - {fi-icl-dsi}:       NOTRUN -> [FAIL][118]
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-dsi/igt@runner@aborted.html

  
Known issues
------------

  Here are the changes found in Patchwork_13060 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_module_load@reload:
    - fi-blb-e6850:       [PASS][119] -> [INCOMPLETE][120] ([fdo#107718])
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-blb-e6850/igt@i915_module_load@reload.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-blb-e6850/igt@i915_module_load@reload.html

  
#### Possible fixes ####

  * igt@gem_basic@create-fd-close:
    - {fi-icl-u3}:        [DMESG-WARN][121] ([fdo#107724]) -> [PASS][122]
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6114/fi-icl-u3/igt@gem_basic@create-fd-close.html
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/fi-icl-u3/igt@gem_basic@create-fd-close.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#110622]: https://bugs.freedesktop.org/show_bug.cgi?id=110622


Participating hosts (53 -> 46)
------------------------------

  Missing    (7): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-byt-clapper fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_6114 -> Patchwork_13060

  CI_DRM_6114: 8691fe536e41c852d3d420ed09b1d5f9916031e7 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5000: f9961d14d76b3a0fa1296e547f7c065e2f93955c @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_13060: bc43d752f0ac41e75563f6fd18b55852dc4f5e42 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

bc43d752f0ac drm/i915: add support for perf configuration queries
e3b3246a741b drm/i915: add a new perf configuration execbuf parameter
9e8f734b8a37 drm/i915/perf: allow for CS OA configs to be created lazily
3744cae4f

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_13060/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-21 17:19     ` Lionel Landwerlin
@ 2019-05-21 17:48       ` Chris Wilson
  2019-05-21 17:59         ` Lionel Landwerlin
  2019-05-22  9:19         ` Lionel Landwerlin
  0 siblings, 2 replies; 33+ messages in thread
From: Chris Wilson @ 2019-05-21 17:48 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-21 18:19:50)
> On 21/05/2019 18:07, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2019-05-21 15:08:54)
> >> +       if (eb->oa_config &&
> >> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {
> > But the oa_config is not ordered with respect to requests, and the
> > registers changed here are not context saved and so may be changed by a
> > third party before execution. The first party must presumably dropped
> > the perf_fd before then, so maybe you don't care? Hmm, doesn't even take
> > a third party as the perf_fd owner may change their mind for different
> > contexts and be surprised when the wrong set is used.
> 
> 
> The OA config batch should be ordered with regard to the MI_BBS we're 
> doing just below right?

But you only emit if the oa_config has changed. Ergo, it may have
changed between submission and execution.

> It's written before in the ring buffer.
> 
> 
> That essentially all we need so that as the perf queries run in the 
> batch supplied by the application runs with the configuration needed.
> 
> If the application shares the perf FD and someone else runs another 
> configuration, it's the application fault.
> 
> It needs to hold the perf FD for as long as it's doing perf queries and 
> not allow anybody else to interact with the OA configuration.

If one app is trying to use different configs on different contexts
(which seems reasonable if it is trying to sample different stats?) then
it can be caught out. The order of execution is not the same as the
order of submission (unless we enforce it by e.g. defining the
perf.oa_config as a barrier).


Another way would be to unconditionally emit the BB_START for the
oa_vma, and instead do the early exit with a MI_CONDITIONAL_BB_END by
comparing against a value stashed in the engine hwsp. You could do a
predicated BB_START instead, but that looks to be more of a nuisance.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-21 17:17       ` Chris Wilson
@ 2019-05-21 17:52         ` Lionel Landwerlin
  0 siblings, 0 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 17:52 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 21/05/2019 18:17, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-21 17:50:30)
>> On 21/05/2019 17:36, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2019-05-21 15:08:52)
>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>> index f263a8374273..2ad95977f7a8 100644
>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
>>>>           if (IS_ERR(cs))
>>>>                   return PTR_ERR(cs);
>>>>    
>>>> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
>>>> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
>>> My prediction is that this will result in this context being reset due
>>> to preemption timeouts and the context under profile being banned.
>>> Note that preemption timeouts will be the primary means for hang
>>> detection for endless batches. -Chris
>> Thanks,
>>
>> One question : how is that dealt with with compute workloads at the moment?
>> I though those where still not fully preemptable.
> Not blocking is the condition under which they get to use endless...
> compute jobs are preemptible from gen9 afaik, gen8 was problematic and so
> disabled.
>   
>> I need to rework this with a more "software" approach holding on preemption.
>> Adding a condition in intel_lrc.c need_preempt() looks like the right
>> direction?
> Even less if that is our means of hangcheck.
> -Chris
>

Can we differentiate between a hangcheck & a high priority request?

If I remember correctly, we can set the hangcheck timeout somewhere in /sys.

I think it's fine to ban the context doing a perf query if it's taking 
too long.


If a user runs into that scenario we can tell them to increase the timeout.


-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-21 17:48       ` Chris Wilson
@ 2019-05-21 17:59         ` Lionel Landwerlin
  2019-05-22  9:19         ` Lionel Landwerlin
  1 sibling, 0 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-21 17:59 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 21/05/2019 18:48, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-21 18:19:50)
>> On 21/05/2019 18:07, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2019-05-21 15:08:54)
>>>> +       if (eb->oa_config &&
>>>> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {
>>> But the oa_config is not ordered with respect to requests, and the
>>> registers changed here are not context saved and so may be changed by a
>>> third party before execution. The first party must presumably dropped
>>> the perf_fd before then, so maybe you don't care? Hmm, doesn't even take
>>> a third party as the perf_fd owner may change their mind for different
>>> contexts and be surprised when the wrong set is used.
>>
>> The OA config batch should be ordered with regard to the MI_BBS we're
>> doing just below right?
> But you only emit if the oa_config has changed. Ergo, it may have
> changed between submission and execution.
>
>> It's written before in the ring buffer.
>>
>>
>> That essentially all we need so that as the perf queries run in the
>> batch supplied by the application runs with the configuration needed.
>>
>> If the application shares the perf FD and someone else runs another
>> configuration, it's the application fault.
>>
>> It needs to hold the perf FD for as long as it's doing perf queries and
>> not allow anybody else to interact with the OA configuration.
> If one app is trying to use different configs on different contexts
> (which seems reasonable if it is trying to sample different stats?) then
> it can be caught out. The order of execution is not the same as the
> order of submission (unless we enforce it by e.g. defining the
> perf.oa_config as a barrier).


Thanks, I think I see the problem. It's pretty much the same as the sseu 
reconfiguration.

Looking at the code it seems that the barrier is gone for sseu and I'm 
afraid that sounds like what's needed here :(


-Lionel

>
>
> Another way would be to unconditionally emit the BB_START for the
> oa_vma, and instead do the early exit with a MI_CONDITIONAL_BB_END by
> comparing against a value stashed in the engine hwsp. You could do a
> predicated BB_START instead, but that looks to be more of a nuisance.
> -Chris
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-21 14:08 ` [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
  2019-05-21 16:36   ` Chris Wilson
@ 2019-05-22  4:33   ` kbuild test robot
  1 sibling, 0 replies; 33+ messages in thread
From: kbuild test robot @ 2019-05-22  4:33 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 25475 bytes --]

Hi Lionel,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20190521]
[cannot apply to v5.2-rc1]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Lionel-Landwerlin/drm-i915-Vulkan-performance-query-support/20190522-083115
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
reproduce: make htmldocs

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   WARNING: convert(1) not found, for SVG to PDF conversion install ImageMagick (https://www.imagemagick.org)
   include/linux/generic-radix-tree.h:1: warning: no structured comments found
   kernel/rcu/tree_plugin.h:1: warning: no structured comments found
   kernel/rcu/tree_plugin.h:1: warning: no structured comments found
   include/linux/firmware/intel/stratix10-svc-client.h:1: warning: no structured comments found
   include/linux/gpio/driver.h:371: warning: Function parameter or member 'init_valid_mask' not described in 'gpio_chip'
   include/linux/i2c.h:343: warning: Function parameter or member 'init_irq' not described in 'i2c_client'
   include/linux/iio/hw-consumer.h:1: warning: no structured comments found
   include/linux/input/sparse-keymap.h:46: warning: Function parameter or member 'sw' not described in 'key_entry'
   include/linux/regulator/machine.h:199: warning: Function parameter or member 'max_uV_step' not described in 'regulation_constraints'
   include/linux/regulator/driver.h:228: warning: Function parameter or member 'resume' not described in 'regulator_ops'
   drivers/slimbus/stream.c:1: warning: no structured comments found
   include/linux/spi/spi.h:188: warning: Function parameter or member 'driver_override' not described in 'spi_device'
   drivers/target/target_core_device.c:1: warning: no structured comments found
   drivers/usb/typec/bus.c:1: warning: no structured comments found
   drivers/usb/typec/class.c:1: warning: no structured comments found
   include/linux/w1.h:281: warning: Function parameter or member 'of_match_table' not described in 'w1_family'
   fs/direct-io.c:257: warning: Excess function parameter 'offset' description in 'dio_complete'
   fs/file_table.c:1: warning: no structured comments found
   fs/libfs.c:477: warning: Excess function parameter 'available' description in 'simple_write_end'
   fs/posix_acl.c:646: warning: Function parameter or member 'inode' not described in 'posix_acl_update_mode'
   fs/posix_acl.c:646: warning: Function parameter or member 'mode_p' not described in 'posix_acl_update_mode'
   fs/posix_acl.c:646: warning: Function parameter or member 'acl' not described in 'posix_acl_update_mode'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:294: warning: Excess function parameter 'mm' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:294: warning: Excess function parameter 'start' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:294: warning: Excess function parameter 'end' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:343: warning: Excess function parameter 'mm' description in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:343: warning: Excess function parameter 'start' description in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:343: warning: Excess function parameter 'end' description in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:183: warning: Function parameter or member 'blockable' not described in 'amdgpu_mn_read_lock'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:295: warning: Function parameter or member 'range' not described in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:295: warning: Excess function parameter 'mm' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:295: warning: Excess function parameter 'start' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:295: warning: Excess function parameter 'end' description in 'amdgpu_mn_invalidate_range_start_hsa'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:344: warning: Function parameter or member 'range' not described in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:344: warning: Excess function parameter 'mm' description in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:344: warning: Excess function parameter 'start' description in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c:344: warning: Excess function parameter 'end' description in 'amdgpu_mn_invalidate_range_end'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:347: warning: cannot understand function prototype: 'struct amdgpu_vm_pt_cursor '
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:348: warning: cannot understand function prototype: 'struct amdgpu_vm_pt_cursor '
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:494: warning: Function parameter or member 'start' not described in 'amdgpu_vm_pt_first_dfs'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:546: warning: Function parameter or member 'adev' not described in 'for_each_amdgpu_vm_pt_dfs_safe'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:546: warning: Function parameter or member 'vm' not described in 'for_each_amdgpu_vm_pt_dfs_safe'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:546: warning: Function parameter or member 'start' not described in 'for_each_amdgpu_vm_pt_dfs_safe'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:546: warning: Function parameter or member 'cursor' not described in 'for_each_amdgpu_vm_pt_dfs_safe'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:546: warning: Function parameter or member 'entry' not described in 'for_each_amdgpu_vm_pt_dfs_safe'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:823: warning: Function parameter or member 'level' not described in 'amdgpu_vm_bo_param'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1285: warning: Function parameter or member 'params' not described in 'amdgpu_vm_update_flags'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1285: warning: Function parameter or member 'bo' not described in 'amdgpu_vm_update_flags'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1285: warning: Function parameter or member 'level' not described in 'amdgpu_vm_update_flags'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1285: warning: Function parameter or member 'pe' not described in 'amdgpu_vm_update_flags'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1285: warning: Function parameter or member 'addr' not described in 'amdgpu_vm_update_flags'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1285: warning: Function parameter or member 'count' not described in 'amdgpu_vm_update_flags'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1285: warning: Function parameter or member 'incr' not described in 'amdgpu_vm_update_flags'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:1285: warning: Function parameter or member 'flags' not described in 'amdgpu_vm_update_flags'
   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2781: warning: Function parameter or member 'pasid' not described in 'amdgpu_vm_make_compute'
   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:375: warning: Excess function parameter 'entry' description in 'amdgpu_irq_dispatch'
   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:376: warning: Function parameter or member 'ih' not described in 'amdgpu_irq_dispatch'
   drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:376: warning: Excess function parameter 'entry' description in 'amdgpu_irq_dispatch'
   drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c:1: warning: no structured comments found
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:128: warning: Incorrect use of kernel-doc format: Documentation Makefile include scripts source @atomic_obj
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:210: warning: Function parameter or member 'atomic_obj' not described in 'amdgpu_display_manager'
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:210: warning: Function parameter or member 'backlight_link' not described in 'amdgpu_display_manager'
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:210: warning: Function parameter or member 'backlight_caps' not described in 'amdgpu_display_manager'
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:210: warning: Function parameter or member 'freesync_module' not described in 'amdgpu_display_manager'
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:210: warning: Function parameter or member 'fw_dmcu' not described in 'amdgpu_display_manager'
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:210: warning: Function parameter or member 'dmcu_fw_version' not described in 'amdgpu_display_manager'
   drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:1: warning: no structured comments found
   include/drm/drm_drv.h:715: warning: Function parameter or member 'gem_prime_pin' not described in 'drm_driver'
   include/drm/drm_drv.h:715: warning: Function parameter or member 'gem_prime_unpin' not described in 'drm_driver'
   include/drm/drm_drv.h:715: warning: Function parameter or member 'gem_prime_res_obj' not described in 'drm_driver'
   include/drm/drm_drv.h:715: warning: Function parameter or member 'gem_prime_get_sg_table' not described in 'drm_driver'
   include/drm/drm_drv.h:715: warning: Function parameter or member 'gem_prime_import_sg_table' not described in 'drm_driver'
   include/drm/drm_drv.h:715: warning: Function parameter or member 'gem_prime_vmap' not described in 'drm_driver'
   include/drm/drm_drv.h:715: warning: Function parameter or member 'gem_prime_vunmap' not described in 'drm_driver'
   include/drm/drm_drv.h:715: warning: Function parameter or member 'gem_prime_mmap' not described in 'drm_driver'
   include/drm/drm_modeset_helper_vtables.h:1004: warning: Function parameter or member 'prepare_writeback_job' not described in 'drm_connector_helper_funcs'
   include/drm/drm_modeset_helper_vtables.h:1004: warning: Function parameter or member 'cleanup_writeback_job' not described in 'drm_connector_helper_funcs'
   include/drm/drm_atomic_state_helper.h:1: warning: no structured comments found
   drivers/gpu/drm/drm_gem.c:1384: warning: Function parameter or member 'fence_array' not described in 'drm_gem_fence_array_add'
   drivers/gpu/drm/drm_gem.c:1384: warning: Function parameter or member 'fence' not described in 'drm_gem_fence_array_add'
   drivers/gpu/drm/drm_gem.c:1435: warning: Function parameter or member 'fence_array' not described in 'drm_gem_fence_array_add_implicit'
   drivers/gpu/drm/drm_gem.c:1435: warning: Function parameter or member 'obj' not described in 'drm_gem_fence_array_add_implicit'
   drivers/gpu/drm/drm_gem.c:1435: warning: Function parameter or member 'write' not described in 'drm_gem_fence_array_add_implicit'
   drivers/gpu/drm/scheduler/sched_main.c:376: warning: Excess function parameter 'bad' description in 'drm_sched_stop'
   drivers/gpu/drm/scheduler/sched_main.c:377: warning: Excess function parameter 'bad' description in 'drm_sched_stop'
   drivers/gpu/drm/scheduler/sched_main.c:420: warning: Function parameter or member 'full_recovery' not described in 'drm_sched_start'
   Error: Cannot open file drivers/gpu/drm/i915/intel_workarounds.c
   WARNING: kernel-doc 'scripts/kernel-doc -rst -enable-lineno -function Hardware workarounds drivers/gpu/drm/i915/intel_workarounds.c' failed with return code 1
   drivers/gpu/drm/i915/i915_vma.h:50: warning: cannot understand function prototype: 'struct i915_vma '
   drivers/gpu/drm/i915/i915_vma.h:1: warning: no structured comments found
   Error: Cannot open file drivers/gpu/drm/i915/intel_lrc.c
   WARNING: kernel-doc 'scripts/kernel-doc -rst -enable-lineno -function Logical Rings, Logical Ring Contexts and Execlists drivers/gpu/drm/i915/intel_lrc.c' failed with return code 1
   Error: Cannot open file drivers/gpu/drm/i915/intel_lrc.c
   Error: Cannot open file drivers/gpu/drm/i915/intel_lrc.c
   WARNING: kernel-doc 'scripts/kernel-doc -rst -enable-lineno -internal drivers/gpu/drm/i915/intel_lrc.c' failed with return code 2
   drivers/gpu/drm/i915/intel_guc_fwif.h:536: warning: cannot understand function prototype: 'struct guc_log_buffer_state '
   drivers/gpu/drm/i915/i915_trace.h:1: warning: no structured comments found
>> drivers/gpu/drm/i915/i915_perf.c:367: warning: Function parameter or member 'context_disable_preemption' not described in 'perf_open_properties'
   drivers/gpu/drm/i915/i915_reg.h:156: warning: bad line:
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:126: warning: Function parameter or member 'hw_id' not described in 'komeda_component'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:126: warning: Function parameter or member 'max_active_outputs' not described in 'komeda_component'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:126: warning: Function parameter or member 'supported_outputs' not described in 'komeda_component'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:142: warning: Function parameter or member 'output_port' not described in 'komeda_component_output'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:196: warning: Function parameter or member 'component' not described in 'komeda_component_state'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:196: warning: Function parameter or member 'crtc' not described in 'komeda_component_state'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:196: warning: Function parameter or member 'plane' not described in 'komeda_component_state'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:196: warning: Function parameter or member 'wb_conn' not described in 'komeda_component_state'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:196: warning: Function parameter or member 'changed_active_inputs' not described in 'komeda_component_state'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:196: warning: Function parameter or member 'affected_inputs' not described in 'komeda_component_state'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'n_layers' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'layers' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'n_scalers' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'scalers' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'compiz' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'wb_layer' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'improc' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'ctrlr' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:323: warning: Function parameter or member 'funcs' not described in 'komeda_pipeline'
   drivers/gpu/drm/arm/display/komeda/komeda_pipeline.h:344: warning: Function parameter or member 'pipe' not described in 'komeda_pipeline_state'
   drivers/gpu/drm/arm/display/komeda/komeda_dev.h:148: warning: Function parameter or member 'dev' not described in 'komeda_dev'
   drivers/gpu/drm/arm/display/komeda/komeda_dev.h:148: warning: Function parameter or member 'reg_base' not described in 'komeda_dev'
   drivers/gpu/drm/arm/display/komeda/komeda_dev.h:148: warning: Function parameter or member 'chip' not described in 'komeda_dev'
   drivers/gpu/drm/arm/display/komeda/komeda_dev.h:148: warning: Function parameter or member 'mclk' not described in 'komeda_dev'
   drivers/gpu/drm/arm/display/komeda/komeda_dev.h:148: warning: Function parameter or member 'n_pipelines' not described in 'komeda_dev'
   drivers/gpu/drm/arm/display/komeda/komeda_dev.h:148: warning: Function parameter or member 'pipelines' not described in 'komeda_dev'
   drivers/gpu/drm/arm/display/komeda/komeda_dev.h:148: warning: Function parameter or member 'debugfs_root' not described in 'komeda_dev'
   drivers/gpu/drm/arm/display/komeda/komeda_framebuffer.h:1: warning: no structured comments found
   drivers/gpu/drm/arm/display/komeda/komeda_crtc.c:1: warning: no structured comments found
   drivers/gpu/drm/arm/display/komeda/komeda_plane.c:1: warning: no structured comments found
   include/linux/interconnect.h:1: warning: no structured comments found
   include/linux/skbuff.h:899: warning: Function parameter or member 'dev_scratch' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'list' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'ip_defrag_offset' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'skb_mstamp_ns' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member '__cloned_offset' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'head_frag' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member '__pkt_type_offset' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'encapsulation' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'encap_hdr_csum' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'csum_valid' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member '__pkt_vlan_present_offset' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'vlan_present' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'csum_complete_sw' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'csum_level' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'inner_protocol_type' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'remcsum_offload' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'sender_cpu' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'reserved_tailroom' not described in 'sk_buff'
   include/linux/skbuff.h:899: warning: Function parameter or member 'inner_ipproto' not described in 'sk_buff'
   include/net/sock.h:238: warning: Function parameter or member 'skc_addrpair' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_portpair' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_ipv6only' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_net_refcnt' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_v6_daddr' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_v6_rcv_saddr' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_cookie' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_listener' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_tw_dr' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_rcv_wnd' not described in 'sock_common'
   include/net/sock.h:238: warning: Function parameter or member 'skc_tw_rcv_nxt' not described in 'sock_common'
   include/net/sock.h:513: warning: Function parameter or member 'sk_wq_raw' not described in 'sock'
   include/net/sock.h:513: warning: Function parameter or member 'tcp_rtx_queue' not described in 'sock'
   include/net/sock.h:513: warning: Function parameter or member 'sk_route_forced_caps' not described in 'sock'
   include/net/sock.h:513: warning: Function parameter or member 'sk_txtime_report_errors' not described in 'sock'
   include/net/sock.h:513: warning: Function parameter or member 'sk_validate_xmit_skb' not described in 'sock'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'gso_partial_features' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'l3mdev_ops' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'xfrmdev_ops' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'tlsdev_ops' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'name_assign_type' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'ieee802154_ptr' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'mpls_ptr' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'xdp_prog' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'gro_flush_timeout' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'nf_hooks_ingress' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member '____cacheline_aligned_in_smp' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'qdisc_hash' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'xps_cpus_map' not described in 'net_device'
   include/linux/netdevice.h:2064: warning: Function parameter or member 'xps_rxqs_map' not described in 'net_device'
   include/linux/phylink.h:56: warning: Function parameter or member '__ETHTOOL_DECLARE_LINK_MODE_MASK(advertising' not described in 'phylink_link_state'
   include/linux/phylink.h:56: warning: Function parameter or member '__ETHTOOL_DECLARE_LINK_MODE_MASK(lp_advertising' not described in 'phylink_link_state'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'quotactl' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'quota_on' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'sb_free_mnt_opts' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'sb_eat_lsm_opts' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'sb_kern_mount' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'sb_show_options' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'sb_add_mnt_opt' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'task_setioprio' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'task_getioprio' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'task_movememory' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'd_instantiate' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'getprocattr' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'setprocattr' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'secmark_refcount_inc' not described in 'security_list_options'
   include/linux/lsm_hooks.h:1802: warning: Function parameter or member 'secmark_refcount_dec' not described in 'security_list_options'
   Documentation/bpf/btf.rst:152: ERROR: Unexpected indentation.
   Documentation/bpf/btf.rst:161: ERROR: Unexpected indentation.

vim +367 drivers/gpu/drm/i915/i915_perf.c

eec688e1 Robert Bragg 2016-11-07 @367  

:::::: The code at line 367 was first introduced by commit
:::::: eec688e1420da584afb36ffa5f0cad75f53cf286 drm/i915: Add i915 perf infrastructure

:::::: TO: Robert Bragg <robert@sixbynine.org>
:::::: CC: Daniel Vetter <daniel.vetter@ffwll.ch>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 6690 bytes --]

[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-21 17:48       ` Chris Wilson
  2019-05-21 17:59         ` Lionel Landwerlin
@ 2019-05-22  9:19         ` Lionel Landwerlin
  2019-05-22  9:25           ` Chris Wilson
  1 sibling, 1 reply; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-22  9:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 21/05/2019 18:48, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-21 18:19:50)
>> On 21/05/2019 18:07, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2019-05-21 15:08:54)
>>>> +       if (eb->oa_config &&
>>>> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {
>>> But the oa_config is not ordered with respect to requests, and the
>>> registers changed here are not context saved and so may be changed by a
>>> third party before execution. The first party must presumably dropped
>>> the perf_fd before then, so maybe you don't care? Hmm, doesn't even take
>>> a third party as the perf_fd owner may change their mind for different
>>> contexts and be surprised when the wrong set is used.
>>
>> The OA config batch should be ordered with regard to the MI_BBS we're
>> doing just below right?
> But you only emit if the oa_config has changed. Ergo, it may have
> changed between submission and execution.
>
>> It's written before in the ring buffer.
>>
>>
>> That essentially all we need so that as the perf queries run in the
>> batch supplied by the application runs with the configuration needed.
>>
>> If the application shares the perf FD and someone else runs another
>> configuration, it's the application fault.
>>
>> It needs to hold the perf FD for as long as it's doing perf queries and
>> not allow anybody else to interact with the OA configuration.
> If one app is trying to use different configs on different contexts
> (which seems reasonable if it is trying to sample different stats?) then
> it can be caught out. The order of execution is not the same as the
> order of submission (unless we enforce it by e.g. defining the
> perf.oa_config as a barrier).


Thinking about this a bit more, the use case here was always that the 
app is the single user of the OA unit.

In this scenario, the app is doing queries and should not share the 
configuration of the OA HW with anybody else.


So all the sampling should be ordered with regard to the context's timeline.


-Lionel


>
>
> Another way would be to unconditionally emit the BB_START for the
> oa_vma, and instead do the early exit with a MI_CONDITIONAL_BB_END by
> comparing against a value stashed in the engine hwsp. You could do a
> predicated BB_START instead, but that looks to be more of a nuisance.
> -Chris
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-22  9:19         ` Lionel Landwerlin
@ 2019-05-22  9:25           ` Chris Wilson
  2019-05-22  9:30             ` Lionel Landwerlin
  0 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2019-05-22  9:25 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-22 10:19:46)
> On 21/05/2019 18:48, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2019-05-21 18:19:50)
> >> On 21/05/2019 18:07, Chris Wilson wrote:
> >>> Quoting Lionel Landwerlin (2019-05-21 15:08:54)
> >>>> +       if (eb->oa_config &&
> >>>> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {
> >>> But the oa_config is not ordered with respect to requests, and the
> >>> registers changed here are not context saved and so may be changed by a
> >>> third party before execution. The first party must presumably dropped
> >>> the perf_fd before then, so maybe you don't care? Hmm, doesn't even take
> >>> a third party as the perf_fd owner may change their mind for different
> >>> contexts and be surprised when the wrong set is used.
> >>
> >> The OA config batch should be ordered with regard to the MI_BBS we're
> >> doing just below right?
> > But you only emit if the oa_config has changed. Ergo, it may have
> > changed between submission and execution.
> >
> >> It's written before in the ring buffer.
> >>
> >>
> >> That essentially all we need so that as the perf queries run in the
> >> batch supplied by the application runs with the configuration needed.
> >>
> >> If the application shares the perf FD and someone else runs another
> >> configuration, it's the application fault.
> >>
> >> It needs to hold the perf FD for as long as it's doing perf queries and
> >> not allow anybody else to interact with the OA configuration.
> > If one app is trying to use different configs on different contexts
> > (which seems reasonable if it is trying to sample different stats?) then
> > it can be caught out. The order of execution is not the same as the
> > order of submission (unless we enforce it by e.g. defining the
> > perf.oa_config as a barrier).
> 
> 
> Thinking about this a bit more, the use case here was always that the 
> app is the single user of the OA unit.
> 
> In this scenario, the app is doing queries and should not share the 
> configuration of the OA HW with anybody else.

What about with itself? And does that excuse the kernel carrying a
TOCTOU bug?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-22  9:25           ` Chris Wilson
@ 2019-05-22  9:30             ` Lionel Landwerlin
  0 siblings, 0 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-22  9:30 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 22/05/2019 10:25, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-22 10:19:46)
>> On 21/05/2019 18:48, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2019-05-21 18:19:50)
>>>> On 21/05/2019 18:07, Chris Wilson wrote:
>>>>> Quoting Lionel Landwerlin (2019-05-21 15:08:54)
>>>>>> +       if (eb->oa_config &&
>>>>>> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {
>>>>> But the oa_config is not ordered with respect to requests, and the
>>>>> registers changed here are not context saved and so may be changed by a
>>>>> third party before execution. The first party must presumably dropped
>>>>> the perf_fd before then, so maybe you don't care? Hmm, doesn't even take
>>>>> a third party as the perf_fd owner may change their mind for different
>>>>> contexts and be surprised when the wrong set is used.
>>>> The OA config batch should be ordered with regard to the MI_BBS we're
>>>> doing just below right?
>>> But you only emit if the oa_config has changed. Ergo, it may have
>>> changed between submission and execution.
>>>
>>>> It's written before in the ring buffer.
>>>>
>>>>
>>>> That essentially all we need so that as the perf queries run in the
>>>> batch supplied by the application runs with the configuration needed.
>>>>
>>>> If the application shares the perf FD and someone else runs another
>>>> configuration, it's the application fault.
>>>>
>>>> It needs to hold the perf FD for as long as it's doing perf queries and
>>>> not allow anybody else to interact with the OA configuration.
>>> If one app is trying to use different configs on different contexts
>>> (which seems reasonable if it is trying to sample different stats?) then
>>> it can be caught out. The order of execution is not the same as the
>>> order of submission (unless we enforce it by e.g. defining the
>>> perf.oa_config as a barrier).
>>
>> Thinking about this a bit more, the use case here was always that the
>> app is the single user of the OA unit.
>>
>> In this scenario, the app is doing queries and should not share the
>> configuration of the OA HW with anybody else.
> What about with itself? And does that excuse the kernel carrying a
> TOCTOU bug?
> -Chris
>
You mean with something like Iris that uses 2 contexts?

I'm assuming things are properly synchronized.


There is also another problem with the 2 contexts which is that we only 
allow filtering a single ID at the moment...


Sorry, I'm not familiar with the TOCTOU bug :(


-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 5/5] drm/i915: add support for perf configuration queries
  2019-05-21 14:08 ` [PATCH 5/5] drm/i915: add support for perf configuration queries Lionel Landwerlin
@ 2019-05-23 10:32   ` Dan Carpenter
  2019-05-23 11:25     ` Lionel Landwerlin
  0 siblings, 1 reply; 33+ messages in thread
From: Dan Carpenter @ 2019-05-23 10:32 UTC (permalink / raw)
  To: kbuild, Lionel Landwerlin; +Cc: intel-gfx, kbuild-all

Hi Lionel,

Thank you for the patch! Perhaps something to improve:

url:    https://github.com/0day-ci/linux/commits/Lionel-Landwerlin/drm-i915-Vulkan-performance-query-support/20190522-083115
base:   git://anongit.freedesktop.org/drm-intel for-linux-next

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>

smatch warnings:
drivers/gpu/drm/i915/i915_query.c:290 query_perf_config_data() warn: inconsistent returns 'mutex:&i915->perf.metrics_lock'.
  Locked on:   line 220
  Unlocked on: line 170

# https://github.com/0day-ci/linux/commit/2df5c78741c44ada4e0b3b7b016923cbbb30ab76
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 2df5c78741c44ada4e0b3b7b016923cbbb30ab76
vim +290 drivers/gpu/drm/i915/i915_query.c

2df5c787 Lionel Landwerlin 2019-05-21  187  	if (__get_user(flags, &user_query_config_ptr->flags))
2df5c787 Lionel Landwerlin 2019-05-21  188  		return -EFAULT;
2df5c787 Lionel Landwerlin 2019-05-21  189  
2df5c787 Lionel Landwerlin 2019-05-21  190  	if (flags != 0)
2df5c787 Lionel Landwerlin 2019-05-21  191  		return -EINVAL;
2df5c787 Lionel Landwerlin 2019-05-21  192  
2df5c787 Lionel Landwerlin 2019-05-21  193  	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Lock

2df5c787 Lionel Landwerlin 2019-05-21  194  	if (ret)
2df5c787 Lionel Landwerlin 2019-05-21  195  		return ret;
2df5c787 Lionel Landwerlin 2019-05-21  196  
2df5c787 Lionel Landwerlin 2019-05-21  197  	if (use_uuid) {
2df5c787 Lionel Landwerlin 2019-05-21  198  		char uuid[UUID_STRING_LEN + 1] = { 0, };
2df5c787 Lionel Landwerlin 2019-05-21  199  		struct i915_oa_config *tmp;
2df5c787 Lionel Landwerlin 2019-05-21  200  		int id;
2df5c787 Lionel Landwerlin 2019-05-21  201  
2df5c787 Lionel Landwerlin 2019-05-21  202  		BUILD_BUG_ON(sizeof(user_query_config_ptr->uuid) >= sizeof(uuid));
2df5c787 Lionel Landwerlin 2019-05-21  203  
2df5c787 Lionel Landwerlin 2019-05-21  204  		if (__copy_from_user(uuid, user_query_config_ptr->uuid,
2df5c787 Lionel Landwerlin 2019-05-21  205  				     sizeof(user_query_config_ptr->uuid))) {
2df5c787 Lionel Landwerlin 2019-05-21  206  			ret = -EFAULT;
2df5c787 Lionel Landwerlin 2019-05-21  207  			goto out;
2df5c787 Lionel Landwerlin 2019-05-21  208  		}
2df5c787 Lionel Landwerlin 2019-05-21  209  
2df5c787 Lionel Landwerlin 2019-05-21  210  		idr_for_each_entry(&i915->perf.metrics_idr, tmp, id) {
2df5c787 Lionel Landwerlin 2019-05-21  211  			if (!strcmp(tmp->uuid, uuid)) {
2df5c787 Lionel Landwerlin 2019-05-21  212  				oa_config = tmp;
2df5c787 Lionel Landwerlin 2019-05-21  213  				break;
2df5c787 Lionel Landwerlin 2019-05-21  214  			}
2df5c787 Lionel Landwerlin 2019-05-21  215  		}
2df5c787 Lionel Landwerlin 2019-05-21  216  	} else {
2df5c787 Lionel Landwerlin 2019-05-21  217  		u64 config_id;
2df5c787 Lionel Landwerlin 2019-05-21  218  
2df5c787 Lionel Landwerlin 2019-05-21  219  		if (__get_user(config_id, &user_query_config_ptr->config))
2df5c787 Lionel Landwerlin 2019-05-21  220  			return -EFAULT;
                                                                ^^^^^^^^^^^^^^
This needs to be "goto out;"

2df5c787 Lionel Landwerlin 2019-05-21  221  
2df5c787 Lionel Landwerlin 2019-05-21  222  		if (config_id == 1)
2df5c787 Lionel Landwerlin 2019-05-21  223  			oa_config = &i915->perf.oa.test_config;
2df5c787 Lionel Landwerlin 2019-05-21  224  		else
2df5c787 Lionel Landwerlin 2019-05-21  225  			oa_config = idr_find(&i915->perf.metrics_idr, config_id);
2df5c787 Lionel Landwerlin 2019-05-21  226  	}
2df5c787 Lionel Landwerlin 2019-05-21  227  
2df5c787 Lionel Landwerlin 2019-05-21  228  	if (!oa_config) {
2df5c787 Lionel Landwerlin 2019-05-21  229  		ret = -ENOENT;
2df5c787 Lionel Landwerlin 2019-05-21  230  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  231  	}
2df5c787 Lionel Landwerlin 2019-05-21  232  
2df5c787 Lionel Landwerlin 2019-05-21  233  	if (__copy_from_user(&user_config, user_config_ptr,
2df5c787 Lionel Landwerlin 2019-05-21  234  			     sizeof(user_config))) {
2df5c787 Lionel Landwerlin 2019-05-21  235  		ret = -EFAULT;
2df5c787 Lionel Landwerlin 2019-05-21  236  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  237  	}
2df5c787 Lionel Landwerlin 2019-05-21  238  
2df5c787 Lionel Landwerlin 2019-05-21  239  	ret = can_copy_perf_config_registers_or_number(user_config.n_boolean_regs,
2df5c787 Lionel Landwerlin 2019-05-21  240  						       user_config.boolean_regs_ptr,
2df5c787 Lionel Landwerlin 2019-05-21  241  						       oa_config->b_counter_regs_len);
2df5c787 Lionel Landwerlin 2019-05-21  242  	if (ret)
2df5c787 Lionel Landwerlin 2019-05-21  243  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  244  
2df5c787 Lionel Landwerlin 2019-05-21  245  	ret = can_copy_perf_config_registers_or_number(user_config.n_flex_regs,
2df5c787 Lionel Landwerlin 2019-05-21  246  						       user_config.flex_regs_ptr,
2df5c787 Lionel Landwerlin 2019-05-21  247  						       oa_config->flex_regs_len);
2df5c787 Lionel Landwerlin 2019-05-21  248  	if (ret)
2df5c787 Lionel Landwerlin 2019-05-21  249  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  250  
2df5c787 Lionel Landwerlin 2019-05-21  251  	ret = can_copy_perf_config_registers_or_number(user_config.n_mux_regs,
2df5c787 Lionel Landwerlin 2019-05-21  252  						       user_config.mux_regs_ptr,
2df5c787 Lionel Landwerlin 2019-05-21  253  						       oa_config->mux_regs_len);
2df5c787 Lionel Landwerlin 2019-05-21  254  	if (ret)
2df5c787 Lionel Landwerlin 2019-05-21  255  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  256  
2df5c787 Lionel Landwerlin 2019-05-21  257  	ret = copy_perf_config_registers_or_number(oa_config->b_counter_regs,
2df5c787 Lionel Landwerlin 2019-05-21  258  						   oa_config->b_counter_regs_len,
2df5c787 Lionel Landwerlin 2019-05-21  259  						   user_config.boolean_regs_ptr,
2df5c787 Lionel Landwerlin 2019-05-21  260  						   &user_config.n_boolean_regs);
2df5c787 Lionel Landwerlin 2019-05-21  261  	if (ret)
2df5c787 Lionel Landwerlin 2019-05-21  262  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  263  
2df5c787 Lionel Landwerlin 2019-05-21  264  	ret = copy_perf_config_registers_or_number(oa_config->flex_regs,
2df5c787 Lionel Landwerlin 2019-05-21  265  						   oa_config->flex_regs_len,
2df5c787 Lionel Landwerlin 2019-05-21  266  						   user_config.flex_regs_ptr,
2df5c787 Lionel Landwerlin 2019-05-21  267  						   &user_config.n_flex_regs);
2df5c787 Lionel Landwerlin 2019-05-21  268  	if (ret)
2df5c787 Lionel Landwerlin 2019-05-21  269  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  270  
2df5c787 Lionel Landwerlin 2019-05-21  271  	ret = copy_perf_config_registers_or_number(oa_config->mux_regs,
2df5c787 Lionel Landwerlin 2019-05-21  272  						   oa_config->mux_regs_len,
2df5c787 Lionel Landwerlin 2019-05-21  273  						   user_config.mux_regs_ptr,
2df5c787 Lionel Landwerlin 2019-05-21  274  						   &user_config.n_mux_regs);
2df5c787 Lionel Landwerlin 2019-05-21  275  	if (ret)
2df5c787 Lionel Landwerlin 2019-05-21  276  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  277  
2df5c787 Lionel Landwerlin 2019-05-21  278  	memcpy(user_config.uuid, oa_config->uuid, sizeof(user_config.uuid));
2df5c787 Lionel Landwerlin 2019-05-21  279  
2df5c787 Lionel Landwerlin 2019-05-21  280  	if (__copy_to_user(user_config_ptr, &user_config,
2df5c787 Lionel Landwerlin 2019-05-21  281  			   sizeof(user_config))) {
2df5c787 Lionel Landwerlin 2019-05-21  282  		ret = -EFAULT;
2df5c787 Lionel Landwerlin 2019-05-21  283  		goto out;
2df5c787 Lionel Landwerlin 2019-05-21  284  	}
2df5c787 Lionel Landwerlin 2019-05-21  285  
2df5c787 Lionel Landwerlin 2019-05-21  286  	ret = total_size;
2df5c787 Lionel Landwerlin 2019-05-21  287  
2df5c787 Lionel Landwerlin 2019-05-21  288  out:
2df5c787 Lionel Landwerlin 2019-05-21  289  	mutex_unlock(&i915->perf.metrics_lock);
2df5c787 Lionel Landwerlin 2019-05-21 @290  	return ret;
2df5c787 Lionel Landwerlin 2019-05-21  291  }
2df5c787 Lionel Landwerlin 2019-05-21  292  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 5/5] drm/i915: add support for perf configuration queries
  2019-05-23 10:32   ` Dan Carpenter
@ 2019-05-23 11:25     ` Lionel Landwerlin
  2019-05-23 11:38       ` Dan Carpenter
  0 siblings, 1 reply; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-23 11:25 UTC (permalink / raw)
  To: Dan Carpenter, kbuild; +Cc: intel-gfx, kbuild-all

Hi Dan,

Not quite sure if you read responses to what seems like an automated 
message.
I have a question below.

Thanks,

-Lionel

On 23/05/2019 11:32, Dan Carpenter wrote:
> Hi Lionel,
>
> Thank you for the patch! Perhaps something to improve:
>
> url:    https://github.com/0day-ci/linux/commits/Lionel-Landwerlin/drm-i915-Vulkan-performance-query-support/20190522-083115
> base:   git://anongit.freedesktop.org/drm-intel for-linux-next
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <lkp@intel.com>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>
> smatch warnings:
> drivers/gpu/drm/i915/i915_query.c:290 query_perf_config_data() warn: inconsistent returns 'mutex:&i915->perf.metrics_lock'.
>    Locked on:   line 220
>    Unlocked on: line 170
>
> # https://github.com/0day-ci/linux/commit/2df5c78741c44ada4e0b3b7b016923cbbb30ab76
> git remote add linux-review https://github.com/0day-ci/linux
> git remote update linux-review
> git checkout 2df5c78741c44ada4e0b3b7b016923cbbb30ab76
> vim +290 drivers/gpu/drm/i915/i915_query.c
>
> 2df5c787 Lionel Landwerlin 2019-05-21  187  	if (__get_user(flags, &user_query_config_ptr->flags))
> 2df5c787 Lionel Landwerlin 2019-05-21  188  		return -EFAULT;
> 2df5c787 Lionel Landwerlin 2019-05-21  189
> 2df5c787 Lionel Landwerlin 2019-05-21  190  	if (flags != 0)
> 2df5c787 Lionel Landwerlin 2019-05-21  191  		return -EINVAL;
> 2df5c787 Lionel Landwerlin 2019-05-21  192
> 2df5c787 Lionel Landwerlin 2019-05-21  193  	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
>                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Lock


What do you mean by that?


>
> 2df5c787 Lionel Landwerlin 2019-05-21  194  	if (ret)
> 2df5c787 Lionel Landwerlin 2019-05-21  195  		return ret;
> 2df5c787 Lionel Landwerlin 2019-05-21  196
> 2df5c787 Lionel Landwerlin 2019-05-21  197  	if (use_uuid) {
> 2df5c787 Lionel Landwerlin 2019-05-21  198  		char uuid[UUID_STRING_LEN + 1] = { 0, };
> 2df5c787 Lionel Landwerlin 2019-05-21  199  		struct i915_oa_config *tmp;
> 2df5c787 Lionel Landwerlin 2019-05-21  200  		int id;
> 2df5c787 Lionel Landwerlin 2019-05-21  201
> 2df5c787 Lionel Landwerlin 2019-05-21  202  		BUILD_BUG_ON(sizeof(user_query_config_ptr->uuid) >= sizeof(uuid));
> 2df5c787 Lionel Landwerlin 2019-05-21  203
> 2df5c787 Lionel Landwerlin 2019-05-21  204  		if (__copy_from_user(uuid, user_query_config_ptr->uuid,
> 2df5c787 Lionel Landwerlin 2019-05-21  205  				     sizeof(user_query_config_ptr->uuid))) {
> 2df5c787 Lionel Landwerlin 2019-05-21  206  			ret = -EFAULT;
> 2df5c787 Lionel Landwerlin 2019-05-21  207  			goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  208  		}
> 2df5c787 Lionel Landwerlin 2019-05-21  209
> 2df5c787 Lionel Landwerlin 2019-05-21  210  		idr_for_each_entry(&i915->perf.metrics_idr, tmp, id) {
> 2df5c787 Lionel Landwerlin 2019-05-21  211  			if (!strcmp(tmp->uuid, uuid)) {
> 2df5c787 Lionel Landwerlin 2019-05-21  212  				oa_config = tmp;
> 2df5c787 Lionel Landwerlin 2019-05-21  213  				break;
> 2df5c787 Lionel Landwerlin 2019-05-21  214  			}
> 2df5c787 Lionel Landwerlin 2019-05-21  215  		}
> 2df5c787 Lionel Landwerlin 2019-05-21  216  	} else {
> 2df5c787 Lionel Landwerlin 2019-05-21  217  		u64 config_id;
> 2df5c787 Lionel Landwerlin 2019-05-21  218
> 2df5c787 Lionel Landwerlin 2019-05-21  219  		if (__get_user(config_id, &user_query_config_ptr->config))
> 2df5c787 Lionel Landwerlin 2019-05-21  220  			return -EFAULT;
>                                                                  ^^^^^^^^^^^^^^
> This needs to be "goto out;"


Nice catch thanks!


>
> 2df5c787 Lionel Landwerlin 2019-05-21  221
> 2df5c787 Lionel Landwerlin 2019-05-21  222  		if (config_id == 1)
> 2df5c787 Lionel Landwerlin 2019-05-21  223  			oa_config = &i915->perf.oa.test_config;
> 2df5c787 Lionel Landwerlin 2019-05-21  224  		else
> 2df5c787 Lionel Landwerlin 2019-05-21  225  			oa_config = idr_find(&i915->perf.metrics_idr, config_id);
> 2df5c787 Lionel Landwerlin 2019-05-21  226  	}
> 2df5c787 Lionel Landwerlin 2019-05-21  227
> 2df5c787 Lionel Landwerlin 2019-05-21  228  	if (!oa_config) {
> 2df5c787 Lionel Landwerlin 2019-05-21  229  		ret = -ENOENT;
> 2df5c787 Lionel Landwerlin 2019-05-21  230  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  231  	}
> 2df5c787 Lionel Landwerlin 2019-05-21  232
> 2df5c787 Lionel Landwerlin 2019-05-21  233  	if (__copy_from_user(&user_config, user_config_ptr,
> 2df5c787 Lionel Landwerlin 2019-05-21  234  			     sizeof(user_config))) {
> 2df5c787 Lionel Landwerlin 2019-05-21  235  		ret = -EFAULT;
> 2df5c787 Lionel Landwerlin 2019-05-21  236  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  237  	}
> 2df5c787 Lionel Landwerlin 2019-05-21  238
> 2df5c787 Lionel Landwerlin 2019-05-21  239  	ret = can_copy_perf_config_registers_or_number(user_config.n_boolean_regs,
> 2df5c787 Lionel Landwerlin 2019-05-21  240  						       user_config.boolean_regs_ptr,
> 2df5c787 Lionel Landwerlin 2019-05-21  241  						       oa_config->b_counter_regs_len);
> 2df5c787 Lionel Landwerlin 2019-05-21  242  	if (ret)
> 2df5c787 Lionel Landwerlin 2019-05-21  243  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  244
> 2df5c787 Lionel Landwerlin 2019-05-21  245  	ret = can_copy_perf_config_registers_or_number(user_config.n_flex_regs,
> 2df5c787 Lionel Landwerlin 2019-05-21  246  						       user_config.flex_regs_ptr,
> 2df5c787 Lionel Landwerlin 2019-05-21  247  						       oa_config->flex_regs_len);
> 2df5c787 Lionel Landwerlin 2019-05-21  248  	if (ret)
> 2df5c787 Lionel Landwerlin 2019-05-21  249  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  250
> 2df5c787 Lionel Landwerlin 2019-05-21  251  	ret = can_copy_perf_config_registers_or_number(user_config.n_mux_regs,
> 2df5c787 Lionel Landwerlin 2019-05-21  252  						       user_config.mux_regs_ptr,
> 2df5c787 Lionel Landwerlin 2019-05-21  253  						       oa_config->mux_regs_len);
> 2df5c787 Lionel Landwerlin 2019-05-21  254  	if (ret)
> 2df5c787 Lionel Landwerlin 2019-05-21  255  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  256
> 2df5c787 Lionel Landwerlin 2019-05-21  257  	ret = copy_perf_config_registers_or_number(oa_config->b_counter_regs,
> 2df5c787 Lionel Landwerlin 2019-05-21  258  						   oa_config->b_counter_regs_len,
> 2df5c787 Lionel Landwerlin 2019-05-21  259  						   user_config.boolean_regs_ptr,
> 2df5c787 Lionel Landwerlin 2019-05-21  260  						   &user_config.n_boolean_regs);
> 2df5c787 Lionel Landwerlin 2019-05-21  261  	if (ret)
> 2df5c787 Lionel Landwerlin 2019-05-21  262  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  263
> 2df5c787 Lionel Landwerlin 2019-05-21  264  	ret = copy_perf_config_registers_or_number(oa_config->flex_regs,
> 2df5c787 Lionel Landwerlin 2019-05-21  265  						   oa_config->flex_regs_len,
> 2df5c787 Lionel Landwerlin 2019-05-21  266  						   user_config.flex_regs_ptr,
> 2df5c787 Lionel Landwerlin 2019-05-21  267  						   &user_config.n_flex_regs);
> 2df5c787 Lionel Landwerlin 2019-05-21  268  	if (ret)
> 2df5c787 Lionel Landwerlin 2019-05-21  269  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  270
> 2df5c787 Lionel Landwerlin 2019-05-21  271  	ret = copy_perf_config_registers_or_number(oa_config->mux_regs,
> 2df5c787 Lionel Landwerlin 2019-05-21  272  						   oa_config->mux_regs_len,
> 2df5c787 Lionel Landwerlin 2019-05-21  273  						   user_config.mux_regs_ptr,
> 2df5c787 Lionel Landwerlin 2019-05-21  274  						   &user_config.n_mux_regs);
> 2df5c787 Lionel Landwerlin 2019-05-21  275  	if (ret)
> 2df5c787 Lionel Landwerlin 2019-05-21  276  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  277
> 2df5c787 Lionel Landwerlin 2019-05-21  278  	memcpy(user_config.uuid, oa_config->uuid, sizeof(user_config.uuid));
> 2df5c787 Lionel Landwerlin 2019-05-21  279
> 2df5c787 Lionel Landwerlin 2019-05-21  280  	if (__copy_to_user(user_config_ptr, &user_config,
> 2df5c787 Lionel Landwerlin 2019-05-21  281  			   sizeof(user_config))) {
> 2df5c787 Lionel Landwerlin 2019-05-21  282  		ret = -EFAULT;
> 2df5c787 Lionel Landwerlin 2019-05-21  283  		goto out;
> 2df5c787 Lionel Landwerlin 2019-05-21  284  	}
> 2df5c787 Lionel Landwerlin 2019-05-21  285
> 2df5c787 Lionel Landwerlin 2019-05-21  286  	ret = total_size;
> 2df5c787 Lionel Landwerlin 2019-05-21  287
> 2df5c787 Lionel Landwerlin 2019-05-21  288  out:
> 2df5c787 Lionel Landwerlin 2019-05-21  289  	mutex_unlock(&i915->perf.metrics_lock);
> 2df5c787 Lionel Landwerlin 2019-05-21 @290  	return ret;
> 2df5c787 Lionel Landwerlin 2019-05-21  291  }
> 2df5c787 Lionel Landwerlin 2019-05-21  292
>
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 5/5] drm/i915: add support for perf configuration queries
  2019-05-23 11:25     ` Lionel Landwerlin
@ 2019-05-23 11:38       ` Dan Carpenter
  0 siblings, 0 replies; 33+ messages in thread
From: Dan Carpenter @ 2019-05-23 11:38 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx, kbuild, kbuild-all

On Thu, May 23, 2019 at 12:25:44PM +0100, Lionel Landwerlin wrote:
> Hi Dan,
> 
> Not quite sure if you read responses to what seems like an automated
> message.
> I have a question below.
> 
> Thanks,
> 
> -Lionel
> 
> On 23/05/2019 11:32, Dan Carpenter wrote:
> > Hi Lionel,
> > 
> > Thank you for the patch! Perhaps something to improve:
> > 
> > url:    https://github.com/0day-ci/linux/commits/Lionel-Landwerlin/drm-i915-Vulkan-performance-query-support/20190522-083115
> > base:   git://anongit.freedesktop.org/drm-intel for-linux-next
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kbuild test robot <lkp@intel.com>
> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > 
> > smatch warnings:
> > drivers/gpu/drm/i915/i915_query.c:290 query_perf_config_data() warn: inconsistent returns 'mutex:&i915->perf.metrics_lock'.
> >    Locked on:   line 220
> >    Unlocked on: line 170
> > 
> > # https://github.com/0day-ci/linux/commit/2df5c78741c44ada4e0b3b7b016923cbbb30ab76
> > git remote add linux-review https://github.com/0day-ci/linux
> > git remote update linux-review
> > git checkout 2df5c78741c44ada4e0b3b7b016923cbbb30ab76
> > vim +290 drivers/gpu/drm/i915/i915_query.c
> > 
> > 2df5c787 Lionel Landwerlin 2019-05-21  187  	if (__get_user(flags, &user_query_config_ptr->flags))
> > 2df5c787 Lionel Landwerlin 2019-05-21  188  		return -EFAULT;
> > 2df5c787 Lionel Landwerlin 2019-05-21  189
> > 2df5c787 Lionel Landwerlin 2019-05-21  190  	if (flags != 0)
> > 2df5c787 Lionel Landwerlin 2019-05-21  191  		return -EINVAL;
> > 2df5c787 Lionel Landwerlin 2019-05-21  192
> > 2df5c787 Lionel Landwerlin 2019-05-21  193  	ret = mutex_lock_interruptible(&i915->perf.metrics_lock);
> >                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 
> > Lock
> 
> 
> What do you mean by that?
> 

Just observing that we take the lock on this line.

:P

regards,
dan carpenter
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-21 16:36   ` Chris Wilson
  2019-05-21 16:50     ` Lionel Landwerlin
@ 2019-05-24  9:28     ` Lionel Landwerlin
  2019-05-24  9:42       ` Chris Wilson
  1 sibling, 1 reply; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-24  9:28 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 21/05/2019 17:36, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-21 15:08:52)
>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> index f263a8374273..2ad95977f7a8 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
>>          if (IS_ERR(cs))
>>                  return PTR_ERR(cs);
>>   
>> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
>> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
> My prediction is that this will result in this context being reset due
> to preemption timeouts and the context under profile being banned. Note
> that preemption timeouts will be the primary means for hang detection
> for endless batches.
> -Chris
>

Another thought :

What if we ran with the max priority?
It would be fine to have the hangcheck preempt the workload (it's pretty 
short and shouldn't affect perf counters from 3d/compute pipeline much) 
as long as ensure nothing else runs.

-Lionel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-24  9:28     ` Lionel Landwerlin
@ 2019-05-24  9:42       ` Chris Wilson
  2019-05-24  9:51         ` Lionel Landwerlin
  0 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2019-05-24  9:42 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-24 10:28:16)
> On 21/05/2019 17:36, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2019-05-21 15:08:52)
> >> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> index f263a8374273..2ad95977f7a8 100644
> >> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
> >>          if (IS_ERR(cs))
> >>                  return PTR_ERR(cs);
> >>   
> >> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> >> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
> > My prediction is that this will result in this context being reset due
> > to preemption timeouts and the context under profile being banned. Note
> > that preemption timeouts will be the primary means for hang detection
> > for endless batches.
> > -Chris
> >
> 
> Another thought :
> 
> What if we ran with the max priority?
> It would be fine to have the hangcheck preempt the workload (it's pretty 
> short and shouldn't affect perf counters from 3d/compute pipeline much) 
> as long as ensure nothing else runs.

It's certainly safer from the pov that we don't block preemption and so
don't incur forced resets. Not keen on the system being perturbed by the
act of observing it, and I still dislike the notion of permitting one
client to hog the GPU so easily. Makes me think of RT throttling, and
generally throwing out the absolute priority system (in exchange for
computed deadlines or something).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-24  9:42       ` Chris Wilson
@ 2019-05-24  9:51         ` Lionel Landwerlin
  2019-05-24 10:07           ` Chris Wilson
  0 siblings, 1 reply; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-24  9:51 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 24/05/2019 10:42, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-24 10:28:16)
>> On 21/05/2019 17:36, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2019-05-21 15:08:52)
>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>> index f263a8374273..2ad95977f7a8 100644
>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
>>>>           if (IS_ERR(cs))
>>>>                   return PTR_ERR(cs);
>>>>    
>>>> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
>>>> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
>>> My prediction is that this will result in this context being reset due
>>> to preemption timeouts and the context under profile being banned. Note
>>> that preemption timeouts will be the primary means for hang detection
>>> for endless batches.
>>> -Chris
>>>
>> Another thought :
>>
>> What if we ran with the max priority?
>> It would be fine to have the hangcheck preempt the workload (it's pretty
>> short and shouldn't affect perf counters from 3d/compute pipeline much)
>> as long as ensure nothing else runs.
> It's certainly safer from the pov that we don't block preemption and so
> don't incur forced resets. Not keen on the system being perturbed by the
> act of observing it, and I still dislike the notion of permitting one
> client to hog the GPU so easily. Makes me think of RT throttling, and
> generally throwing out the absolute priority system (in exchange for
> computed deadlines or something).
> -Chris
>
I don't like it much either but I can't see how to do otherwise with the 
hardware we currently have.

I'm thinking of 2 priorities values one of scheduling, one once running.

Most contexts would have both values equal.

Could mitigate the issue a bit?


-Lionel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-24  9:51         ` Lionel Landwerlin
@ 2019-05-24 10:07           ` Chris Wilson
  2019-05-27 22:11             ` Lionel Landwerlin
  0 siblings, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2019-05-24 10:07 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-24 10:51:49)
> On 24/05/2019 10:42, Chris Wilson wrote:
> > Quoting Lionel Landwerlin (2019-05-24 10:28:16)
> >> On 21/05/2019 17:36, Chris Wilson wrote:
> >>> Quoting Lionel Landwerlin (2019-05-21 15:08:52)
> >>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>>> index f263a8374273..2ad95977f7a8 100644
> >>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> >>>> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
> >>>>           if (IS_ERR(cs))
> >>>>                   return PTR_ERR(cs);
> >>>>    
> >>>> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> >>>> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
> >>> My prediction is that this will result in this context being reset due
> >>> to preemption timeouts and the context under profile being banned. Note
> >>> that preemption timeouts will be the primary means for hang detection
> >>> for endless batches.
> >>> -Chris
> >>>
> >> Another thought :
> >>
> >> What if we ran with the max priority?
> >> It would be fine to have the hangcheck preempt the workload (it's pretty
> >> short and shouldn't affect perf counters from 3d/compute pipeline much)
> >> as long as ensure nothing else runs.
> > It's certainly safer from the pov that we don't block preemption and so
> > don't incur forced resets. Not keen on the system being perturbed by the
> > act of observing it, and I still dislike the notion of permitting one
> > client to hog the GPU so easily. Makes me think of RT throttling, and
> > generally throwing out the absolute priority system (in exchange for
> > computed deadlines or something).
> > -Chris
> >
> I don't like it much either but I can't see how to do otherwise with the 
> hardware we currently have.
> 
> I'm thinking of 2 priorities values one of scheduling, one once running.

It's not quite that easy as you may start running concurrently with one
of your dependencies and must therefore manage the priority inversion if
you boost yourself. And I've just gone through and thrown out the
current complexity of manipulating priority as they run because it made
timeslicing much harder (where the priority was changing between
evaluating the need for the context switch and the context switch
occurring -- such mistakes can be noticed in throughput sensitive
transcode workloads).

> Most contexts would have both values equal.
> 
> Could mitigate the issue a bit?

A bit, it gives you a soft notion of a no-preempt flag without queue
jumping. rq_prio(rq) | intel_context->effective_priority or somesuch.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx
  2019-05-24 10:07           ` Chris Wilson
@ 2019-05-27 22:11             ` Lionel Landwerlin
  0 siblings, 0 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-27 22:11 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 24/05/2019 11:07, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-24 10:51:49)
>> On 24/05/2019 10:42, Chris Wilson wrote:
>>> Quoting Lionel Landwerlin (2019-05-24 10:28:16)
>>>> On 21/05/2019 17:36, Chris Wilson wrote:
>>>>> Quoting Lionel Landwerlin (2019-05-21 15:08:52)
>>>>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> index f263a8374273..2ad95977f7a8 100644
>>>>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>>>>>> @@ -2085,7 +2085,7 @@ static int gen9_emit_bb_start(struct i915_request *rq,
>>>>>>            if (IS_ERR(cs))
>>>>>>                    return PTR_ERR(cs);
>>>>>>     
>>>>>> -       *cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
>>>>>> +       *cs++ = MI_ARB_ON_OFF | rq->hw_context->arb_enable;
>>>>> My prediction is that this will result in this context being reset due
>>>>> to preemption timeouts and the context under profile being banned. Note
>>>>> that preemption timeouts will be the primary means for hang detection
>>>>> for endless batches.
>>>>> -Chris
>>>>>
>>>> Another thought :
>>>>
>>>> What if we ran with the max priority?
>>>> It would be fine to have the hangcheck preempt the workload (it's pretty
>>>> short and shouldn't affect perf counters from 3d/compute pipeline much)
>>>> as long as ensure nothing else runs.
>>> It's certainly safer from the pov that we don't block preemption and so
>>> don't incur forced resets. Not keen on the system being perturbed by the
>>> act of observing it, and I still dislike the notion of permitting one
>>> client to hog the GPU so easily. Makes me think of RT throttling, and
>>> generally throwing out the absolute priority system (in exchange for
>>> computed deadlines or something).
>>> -Chris
>>>
>> I don't like it much either but I can't see how to do otherwise with the
>> hardware we currently have.
>>
>> I'm thinking of 2 priorities values one of scheduling, one once running.
> It's not quite that easy as you may start running concurrently with one
> of your dependencies and must therefore manage the priority inversion if
> you boost yourself. And I've just gone through and thrown out the
> current complexity of manipulating priority as they run because it made
> timeslicing much harder (where the priority was changing between
> evaluating the need for the context switch and the context switch
> occurring -- such mistakes can be noticed in throughput sensitive
> transcode workloads).


It's like you wrote a scheduler before!


Here is how I could see this work.

I can see the 3 different stages of a request :

   - waiting on dependencies

   - in the engine queue

   - in the HW


The request would maintain is normal/default priority until it hits the HW.

When hitting the HW for the first time, its priority is upgraded to perf 
priority so that it sticks to the HW until completition (or some other 
timeout kicks it off the HW).


Does that still sound broken?


Thanks a lot,


-Lionel


>
>> Most contexts would have both values equal.
>>
>> Could mitigate the issue a bit?
> A bit, it gives you a soft notion of a no-preempt flag without queue
> jumping. rq_prio(rq) | intel_context->effective_priority or somesuch.
> -Chris
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-21 14:08 ` [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
  2019-05-21 17:07   ` Chris Wilson
@ 2019-05-28 10:52   ` Chris Wilson
  2019-05-30 18:41     ` Lionel Landwerlin
  1 sibling, 1 reply; 33+ messages in thread
From: Chris Wilson @ 2019-05-28 10:52 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx

Quoting Lionel Landwerlin (2019-05-21 15:08:54)
> @@ -2048,6 +2081,42 @@ static int eb_submit(struct i915_execbuffer *eb)
>                         return err;
>         }

if (eb->oa_config) {
	err = i915_active_request_set(&eb->i915->perf.oa.oa_config_active,
				      eb->request);
	if (err)
		return err;
}

with the addition of
	struct i915_active_request oa_config_active;
to i915->perf.oa, and i915_active_init; That will ensure that the
oa_config can't be changed before execution (and the ordering restriction
is essentially a no-op if only one context has a specified oa_config).

> +       if (eb->oa_config &&
> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {

Fwiw, I would move these to eb_oa_config().

if (eb->oa_config) {
	err = eb_oa_config(eb);
	if (err)
		return err;
}

How does eb_oa_config mix with the global gen8_configure_all_contexts()?
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter
  2019-05-28 10:52   ` Chris Wilson
@ 2019-05-30 18:41     ` Lionel Landwerlin
  0 siblings, 0 replies; 33+ messages in thread
From: Lionel Landwerlin @ 2019-05-30 18:41 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 28/05/2019 11:52, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2019-05-21 15:08:54)
>> @@ -2048,6 +2081,42 @@ static int eb_submit(struct i915_execbuffer *eb)
>>                          return err;
>>          }
> if (eb->oa_config) {
> 	err = i915_active_request_set(&eb->i915->perf.oa.oa_config_active,
> 				      eb->request);
> 	if (err)
> 		return err;
> }
>
> with the addition of
> 	struct i915_active_request oa_config_active;
> to i915->perf.oa, and i915_active_init; That will ensure that the
> oa_config can't be changed before execution (and the ordering restriction
> is essentially a no-op if only one context has a specified oa_config).
>
>> +       if (eb->oa_config &&
>> +           eb->oa_config != eb->i915->perf.oa.exclusive_stream->oa_config) {
> Fwiw, I would move these to eb_oa_config().
>
> if (eb->oa_config) {
> 	err = eb_oa_config(eb);
> 	if (err)
> 		return err;
> }
>
> How does eb_oa_config mix with the global gen8_configure_all_contexts()?


Excellent point, I should document this.


HW configurations have roughly 3 parts :

    - NOA configuration, network configuration to source specific data 
from anywhere (global, non power saved/restored)

    - Boolean counters, filters signals brought by NOA (global, can't 
remember whether saved/restored)

    - Flex counters, filters on events happening within the EUs (per 
context, saved/restored)


First two will affect all running contexts (because global), last one 
will only be applied to the context that triggered the execbuf.

That should be fine because of the other requirement we need (don't 
preempt the context running a performance query).


-Lionel


> -Chris
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2019-05-30 18:41 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-21 14:08 [PATCH 0/5] drm/i915: Vulkan performance query support Lionel Landwerlin
2019-05-21 14:08 ` [PATCH 1/5] drm/i915/perf: introduce a versioning of the i915-perf uapi Lionel Landwerlin
2019-05-21 14:08 ` [PATCH 2/5] drm/i915/perf: allow holding preemption on filtered ctx Lionel Landwerlin
2019-05-21 16:36   ` Chris Wilson
2019-05-21 16:50     ` Lionel Landwerlin
2019-05-21 17:17       ` Chris Wilson
2019-05-21 17:52         ` Lionel Landwerlin
2019-05-24  9:28     ` Lionel Landwerlin
2019-05-24  9:42       ` Chris Wilson
2019-05-24  9:51         ` Lionel Landwerlin
2019-05-24 10:07           ` Chris Wilson
2019-05-27 22:11             ` Lionel Landwerlin
2019-05-22  4:33   ` kbuild test robot
2019-05-21 14:08 ` [PATCH 3/5] drm/i915/perf: allow for CS OA configs to be created lazily Lionel Landwerlin
2019-05-21 16:43   ` Chris Wilson
2019-05-21 16:55     ` Lionel Landwerlin
2019-05-21 14:08 ` [PATCH 4/5] drm/i915: add a new perf configuration execbuf parameter Lionel Landwerlin
2019-05-21 17:07   ` Chris Wilson
2019-05-21 17:19     ` Lionel Landwerlin
2019-05-21 17:48       ` Chris Wilson
2019-05-21 17:59         ` Lionel Landwerlin
2019-05-22  9:19         ` Lionel Landwerlin
2019-05-22  9:25           ` Chris Wilson
2019-05-22  9:30             ` Lionel Landwerlin
2019-05-28 10:52   ` Chris Wilson
2019-05-30 18:41     ` Lionel Landwerlin
2019-05-21 14:08 ` [PATCH 5/5] drm/i915: add support for perf configuration queries Lionel Landwerlin
2019-05-23 10:32   ` Dan Carpenter
2019-05-23 11:25     ` Lionel Landwerlin
2019-05-23 11:38       ` Dan Carpenter
2019-05-21 16:37 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Vulkan performance query support Patchwork
2019-05-21 16:40 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-05-21 17:26 ` ✗ Fi.CI.BAT: failure " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.