All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v9 0/7] drm/i915: per context slice/subslice powergating
@ 2018-05-30 14:33 Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 1/7] drm/i915: Program RPCS for Broadwell Lionel Landwerlin
                   ` (10 more replies)
  0 siblings, 11 replies; 29+ messages in thread
From: Lionel Landwerlin @ 2018-05-30 14:33 UTC (permalink / raw)
  To: intel-gfx

Hi again,

Here are some updates following Chris & Michel's comments.

I folded the patch 6 added in v8 into patch 6 of this series.

Many thanks,

Chris Wilson (3):
  drm/i915: Program RPCS for Broadwell
  drm/i915: Record the sseu configuration per-context & engine
  drm/i915: Expose RPCS (SSEU) configuration to userspace

Lionel Landwerlin (4):
  drm/i915/perf: simplify configure all context function
  drm/i915/perf: reuse intel_lrc ctx regs macro
  drm/i915/perf: lock powergating configuration to default when active
  drm/i915: add a sysfs entry to let users set sseu configs

 drivers/gpu/drm/i915/i915_drv.h         |  50 ++++++
 drivers/gpu/drm/i915/i915_gem.c         |   2 +
 drivers/gpu/drm/i915/i915_gem_context.c | 216 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_context.h |   4 +
 drivers/gpu/drm/i915/i915_perf.c        |  68 ++++----
 drivers/gpu/drm/i915/i915_request.c     |  20 +++
 drivers/gpu/drm/i915/i915_request.h     |  10 ++
 drivers/gpu/drm/i915/i915_sysfs.c       |  40 +++++
 drivers/gpu/drm/i915/intel_lrc.c        | 137 ++++++++++-----
 drivers/gpu/drm/i915/intel_lrc.h        |   3 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |   2 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   4 +
 include/uapi/drm/i915_drm.h             |  43 +++++
 13 files changed, 525 insertions(+), 74 deletions(-)

--
2.17.0
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v9 1/7] drm/i915: Program RPCS for Broadwell
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
@ 2018-05-30 14:33 ` Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 2/7] drm/i915: Record the sseu configuration per-context & engine Lionel Landwerlin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Lionel Landwerlin @ 2018-05-30 14:33 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

Currently we only configure the power gating for Skylake and above, but
the configuration should equally apply to Broadwell and Braswell. Even
though, there is not as much variation as for later generations, we want
to expose control over the configuration to userspace and may want to
opt out of the "always-enabled" setting.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 8a6058b4102f..05e069a5c836 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2517,13 +2517,6 @@ make_rpcs(struct drm_i915_private *dev_priv)
 {
 	u32 rpcs = 0;
 
-	/*
-	 * No explicit RPCS request is needed to ensure full
-	 * slice/subslice/EU enablement prior to Gen9.
-	*/
-	if (INTEL_GEN(dev_priv) < 9)
-		return 0;
-
 	/*
 	 * Starting in Gen9, render power gating can leave
 	 * slice/subslice/EU in a partially enabled state. We
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v9 2/7] drm/i915: Record the sseu configuration per-context & engine
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 1/7] drm/i915: Program RPCS for Broadwell Lionel Landwerlin
@ 2018-05-30 14:33 ` Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 3/7] drm/i915/perf: simplify configure all context function Lionel Landwerlin
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Lionel Landwerlin @ 2018-05-30 14:33 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

We want to expose the ability to reconfigure the slices, subslice and
eu per context and per engine. To facilitate that, store the current
configuration on the context for each engine, which is initially set
to the device default upon creation.

v2: record sseu configuration per context & engine (Chris)

v3: introduce the i915_gem_context_sseu to store powergating
    programming, sseu_dev_info has grown quite a bit (Lionel)

v4: rename i915_gem_sseu into intel_sseu (Chris)
    use to_intel_context() (Chris)

v5: More to_intel_context() (Tvrtko)
    Switch intel_sseu from union to struct (Tvrtko)
    Move context default sseu in existing loop (Chris)

v6: s/intel_sseu_from_device_sseu/intel_device_default_sseu/ (Tvrtko)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 14 ++++++++++++++
 drivers/gpu/drm/i915/i915_gem_context.c |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.h |  4 ++++
 drivers/gpu/drm/i915/i915_request.h     | 10 ++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 22 +++++++++++-----------
 5 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 487922f88b76..62ababfd39aa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3427,6 +3427,20 @@ mkwrite_device_info(struct drm_i915_private *dev_priv)
 	return (struct intel_device_info *)&dev_priv->info;
 }
 
+static inline struct intel_sseu
+intel_device_default_sseu(struct drm_i915_private *i915)
+{
+	const struct sseu_dev_info *sseu = &INTEL_INFO(i915)->sseu;
+	struct intel_sseu value = {
+		.slice_mask = sseu->slice_mask,
+		.subslice_mask = sseu->subslice_mask[0],
+		.min_eus_per_subslice = sseu->max_eus_per_subslice,
+		.max_eus_per_subslice = sseu->max_eus_per_subslice,
+	};
+
+	return value;
+}
+
 /* modesetting */
 extern void intel_modeset_init_hw(struct drm_device *dev);
 extern int intel_modeset_init(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 45393f6e0208..ee2ee6b4e5b0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -282,6 +282,8 @@ __create_hw_context(struct drm_i915_private *dev_priv,
 		struct intel_context *ce = &ctx->__engine[n];
 
 		ce->gem_context = ctx;
+		/* Use the whole device by default */
+		ce->sseu = intel_device_default_sseu(dev_priv);
 	}
 
 	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index b116e4942c10..f40d85448a28 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -31,6 +31,7 @@
 
 #include "i915_gem.h"
 #include "i915_scheduler.h"
+#include "intel_device_info.h"
 
 struct pid;
 
@@ -160,6 +161,9 @@ struct i915_gem_context {
 		int pin_count;
 
 		const struct intel_context_ops *ops;
+
+		/** sseu: Control eu/slice partitioning */
+		struct intel_sseu sseu;
 	} __engine[I915_NUM_ENGINES];
 
 	/** ring_size: size for allocating the per-engine ring buffer */
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h
index 1bbbb7a9fa03..82c5dd153bfd 100644
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -39,6 +39,16 @@ struct drm_i915_gem_object;
 struct i915_request;
 struct i915_timeline;
 
+/*
+ * Powergating configuration for a particular (context,engine).
+ */
+struct intel_sseu {
+	u8 slice_mask;
+	u8 subslice_mask;
+	u8 min_eus_per_subslice;
+	u8 max_eus_per_subslice;
+};
+
 struct intel_wait {
 	struct rb_node node;
 	struct task_struct *tsk;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 05e069a5c836..8bac0483e784 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2512,8 +2512,8 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine)
 	return logical_ring_init(engine);
 }
 
-static u32
-make_rpcs(struct drm_i915_private *dev_priv)
+static u32 make_rpcs(const struct sseu_dev_info *sseu,
+		     struct intel_sseu ctx_sseu)
 {
 	u32 rpcs = 0;
 
@@ -2523,24 +2523,23 @@ make_rpcs(struct drm_i915_private *dev_priv)
 	 * must make an explicit request through RPCS for full
 	 * enablement.
 	*/
-	if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
+	if (sseu->has_slice_pg) {
 		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
-		rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask) <<
-			GEN8_RPCS_S_CNT_SHIFT;
+		rpcs |= hweight8(ctx_sseu.slice_mask) << GEN8_RPCS_S_CNT_SHIFT;
 		rpcs |= GEN8_RPCS_ENABLE;
 	}
 
-	if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
+	if (sseu->has_subslice_pg) {
 		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
-		rpcs |= hweight8(INTEL_INFO(dev_priv)->sseu.subslice_mask[0]) <<
+		rpcs |= hweight8(ctx_sseu.subslice_mask) <<
 			GEN8_RPCS_SS_CNT_SHIFT;
 		rpcs |= GEN8_RPCS_ENABLE;
 	}
 
-	if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) {
-		rpcs |= INTEL_INFO(dev_priv)->sseu.eu_per_subslice <<
+	if (sseu->has_eu_pg) {
+		rpcs |= ctx_sseu.min_eus_per_subslice <<
 			GEN8_RPCS_EU_MIN_SHIFT;
-		rpcs |= INTEL_INFO(dev_priv)->sseu.eu_per_subslice <<
+		rpcs |= ctx_sseu.max_eus_per_subslice <<
 			GEN8_RPCS_EU_MAX_SHIFT;
 		rpcs |= GEN8_RPCS_ENABLE;
 	}
@@ -2664,7 +2663,8 @@ static void execlists_init_reg_state(u32 *regs,
 	if (rcs) {
 		regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
 		CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
-			make_rpcs(dev_priv));
+			make_rpcs(&INTEL_INFO(dev_priv)->sseu,
+				  to_intel_context(ctx, engine)->sseu));
 
 		i915_oa_init_reg_state(engine, ctx, regs);
 	}
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v9 3/7] drm/i915/perf: simplify configure all context function
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 1/7] drm/i915: Program RPCS for Broadwell Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 2/7] drm/i915: Record the sseu configuration per-context & engine Lionel Landwerlin
@ 2018-05-30 14:33 ` Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 4/7] drm/i915/perf: reuse intel_lrc ctx regs macro Lionel Landwerlin
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Lionel Landwerlin @ 2018-05-30 14:33 UTC (permalink / raw)
  To: intel-gfx

We don't need any special treatment on error so just return as soon as
possible.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 4f0eb84b3c00..805dfc732bba 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1762,7 +1762,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	/* Switch away from any user context. */
 	ret = gen8_switch_to_updated_kernel_context(dev_priv, oa_config);
 	if (ret)
-		goto out;
+		return ret;
 
 	/*
 	 * The OA register config is setup through the context image. This image
@@ -1779,7 +1779,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	 */
 	ret = i915_gem_wait_for_idle(dev_priv, wait_flags);
 	if (ret)
-		goto out;
+		return ret;
 
 	/* Update all contexts now that we've stalled the submission. */
 	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
@@ -1791,10 +1791,8 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 			continue;
 
 		regs = i915_gem_object_pin_map(ce->state->obj, I915_MAP_WB);
-		if (IS_ERR(regs)) {
-			ret = PTR_ERR(regs);
-			goto out;
-		}
+		if (IS_ERR(regs))
+			return PTR_ERR(regs);
 
 		ce->state->obj->mm.dirty = true;
 		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
@@ -1804,7 +1802,6 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 		i915_gem_object_unpin_map(ce->state->obj);
 	}
 
- out:
 	return ret;
 }
 
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v9 4/7] drm/i915/perf: reuse intel_lrc ctx regs macro
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
                   ` (2 preceding siblings ...)
  2018-05-30 14:33 ` [PATCH v9 3/7] drm/i915/perf: simplify configure all context function Lionel Landwerlin
@ 2018-05-30 14:33 ` Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 5/7] drm/i915/perf: lock powergating configuration to default when active Lionel Landwerlin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Lionel Landwerlin @ 2018-05-30 14:33 UTC (permalink / raw)
  To: intel-gfx

Abstract the context image access a bit.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_perf.c | 34 +++++++++++++++-----------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 805dfc732bba..a5d98bda5c2e 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -210,6 +210,7 @@
 #include "i915_oa_cflgt3.h"
 #include "i915_oa_cnl.h"
 #include "i915_oa_icl.h"
+#include "intel_lrc_reg.h"
 
 /* HW requires this to be a power of two, between 128k and 16M, though driver
  * is currently generally designed assuming the largest 16M size is used such
@@ -1579,27 +1580,25 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
 	u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset;
 	u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset;
 	/* The MMIO offsets for Flex EU registers aren't contiguous */
-	u32 flex_mmio[] = {
-		i915_mmio_reg_offset(EU_PERF_CNTL0),
-		i915_mmio_reg_offset(EU_PERF_CNTL1),
-		i915_mmio_reg_offset(EU_PERF_CNTL2),
-		i915_mmio_reg_offset(EU_PERF_CNTL3),
-		i915_mmio_reg_offset(EU_PERF_CNTL4),
-		i915_mmio_reg_offset(EU_PERF_CNTL5),
-		i915_mmio_reg_offset(EU_PERF_CNTL6),
+	i915_reg_t flex_regs[] = {
+		EU_PERF_CNTL0,
+		EU_PERF_CNTL1,
+		EU_PERF_CNTL2,
+		EU_PERF_CNTL3,
+		EU_PERF_CNTL4,
+		EU_PERF_CNTL5,
+		EU_PERF_CNTL6,
 	};
 	int i;
 
-	reg_state[ctx_oactxctrl] = i915_mmio_reg_offset(GEN8_OACTXCONTROL);
-	reg_state[ctx_oactxctrl+1] = (dev_priv->perf.oa.period_exponent <<
-				      GEN8_OA_TIMER_PERIOD_SHIFT) |
-				     (dev_priv->perf.oa.periodic ?
-				      GEN8_OA_TIMER_ENABLE : 0) |
-				     GEN8_OA_COUNTER_RESUME;
+	CTX_REG(reg_state, ctx_oactxctrl, GEN8_OACTXCONTROL,
+		(dev_priv->perf.oa.period_exponent << GEN8_OA_TIMER_PERIOD_SHIFT) |
+		(dev_priv->perf.oa.periodic ? GEN8_OA_TIMER_ENABLE : 0) |
+		GEN8_OA_COUNTER_RESUME);
 
-	for (i = 0; i < ARRAY_SIZE(flex_mmio); i++) {
+	for (i = 0; i < ARRAY_SIZE(flex_regs); i++) {
 		u32 state_offset = ctx_flexeu0 + i * 2;
-		u32 mmio = flex_mmio[i];
+		u32 mmio = i915_mmio_reg_offset(flex_regs[i]);
 
 		/*
 		 * This arbitrary default will select the 'EU FPU0 Pipeline
@@ -1619,8 +1618,7 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
 			}
 		}
 
-		reg_state[state_offset] = mmio;
-		reg_state[state_offset+1] = value;
+		CTX_REG(reg_state, state_offset, flex_regs[i], value);
 	}
 }
 
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v9 5/7] drm/i915/perf: lock powergating configuration to default when active
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
                   ` (3 preceding siblings ...)
  2018-05-30 14:33 ` [PATCH v9 4/7] drm/i915/perf: reuse intel_lrc ctx regs macro Lionel Landwerlin
@ 2018-05-30 14:33 ` Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 6/7] drm/i915: Expose RPCS (SSEU) configuration to userspace Lionel Landwerlin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Lionel Landwerlin @ 2018-05-30 14:33 UTC (permalink / raw)
  To: intel-gfx

If some of the contexts submitting workloads to the GPU have been
configured to shutdown slices/subslices, we might loose the NOA
configurations written in the NOA muxes.

One possible solution to this problem is to reprogram the NOA muxes
when we switch to a new context. We initially tried this in the
workaround batchbuffer but some concerns where raised about the cost
of reprogramming at every context switch. This solution is also not
without consequences from the userspace point of view. Reprogramming
of the muxes can only happen once the powergating configuration has
changed (which happens after context switch). This means for a window
of time during the recording, counters recorded by the OA unit might
be invalid. This requires userspace dealing with OA reports to discard
the invalid values.

Minimizing the reprogramming could be implemented by tracking of the
last programmed configuration somewhere in GGTT and use MI_PREDICATE
to discard some of the programming commands, but the command streamer
would still have to parse all the MI_LRI instructions in the
workaround batchbuffer.

Another solution, which this change implements, is to simply disregard
the user requested configuration for the period of time when i915/perf
is active. There is no known issue with this apart from a performance
penality for some media workloads that benefit from running on a
partially powergated GPU. We already prevent RC6 from affecting the
programming so it doesn't sound completely unreasonable to hold on
powergating for the same reason.

v2: Leave RPCS programming in intel_lrc.c (Lionel)

v3: Update for s/union intel_sseu/struct intel_sseu/ (Lionel)
    More to_intel_context() (Tvrtko)
    s/dev_priv/i915/ (Tvrtko)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  | 15 +++++++++++++++
 drivers/gpu/drm/i915/i915_perf.c | 23 ++++++++++++++++++-----
 drivers/gpu/drm/i915/intel_lrc.c | 11 +++++++----
 drivers/gpu/drm/i915/intel_lrc.h |  3 +++
 4 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 62ababfd39aa..25ffadfcd04b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3839,4 +3839,19 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
 		return I915_HWS_CSB_WRITE_INDEX;
 }
 
+static inline struct intel_sseu
+intel_engine_prepare_sseu(struct intel_engine_cs *engine,
+			  struct intel_sseu sseu)
+{
+	struct drm_i915_private *i915 = engine->i915;
+
+	/*
+	 * If i915/perf is active, we want a stable powergating configuration
+	 * on the system. The most natural configuration to take in that case
+	 * is the default (i.e maximum the hardware can do).
+	 */
+	return i915->perf.oa.exclusive_stream ?
+		intel_device_default_sseu(i915) : sseu;
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index a5d98bda5c2e..4a62024cbf85 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1574,7 +1574,8 @@ static void hsw_disable_metric_set(struct drm_i915_private *dev_priv)
  */
 static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
 					   u32 *reg_state,
-					   const struct i915_oa_config *oa_config)
+					   const struct i915_oa_config *oa_config,
+					   struct intel_sseu sseu)
 {
 	struct drm_i915_private *dev_priv = ctx->i915;
 	u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset;
@@ -1620,6 +1621,9 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
 
 		CTX_REG(reg_state, state_offset, flex_regs[i], value);
 	}
+
+	CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
+		gen8_make_rpcs(&INTEL_INFO(dev_priv)->sseu, sseu));
 }
 
 /*
@@ -1750,6 +1754,7 @@ static int gen8_switch_to_updated_kernel_context(struct drm_i915_private *dev_pr
 static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 				       const struct i915_oa_config *oa_config)
 {
+	struct intel_sseu default_sseu = intel_device_default_sseu(dev_priv);
 	struct intel_engine_cs *engine = dev_priv->engine[RCS];
 	struct i915_gem_context *ctx;
 	int ret;
@@ -1795,7 +1800,8 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 		ce->state->obj->mm.dirty = true;
 		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
 
-		gen8_update_reg_state_unlocked(ctx, regs, oa_config);
+		gen8_update_reg_state_unlocked(ctx, regs, oa_config,
+					       oa_config ? default_sseu : ce->sseu);
 
 		i915_gem_object_unpin_map(ce->state->obj);
 	}
@@ -2167,14 +2173,21 @@ void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 			    struct i915_gem_context *ctx,
 			    u32 *reg_state)
 {
+	struct drm_i915_private *i915 = engine->i915;
 	struct i915_perf_stream *stream;
 
 	if (engine->id != RCS)
 		return;
 
-	stream = engine->i915->perf.oa.exclusive_stream;
-	if (stream)
-		gen8_update_reg_state_unlocked(ctx, reg_state, stream->oa_config);
+	stream = i915->perf.oa.exclusive_stream;
+	if (stream) {
+		struct intel_sseu default_sseu =
+			intel_device_default_sseu(i915);
+
+		gen8_update_reg_state_unlocked(ctx, reg_state,
+					       stream->oa_config,
+					       default_sseu);
+	}
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 8bac0483e784..7314e548fb4e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2512,8 +2512,8 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine)
 	return logical_ring_init(engine);
 }
 
-static u32 make_rpcs(const struct sseu_dev_info *sseu,
-		     struct intel_sseu ctx_sseu)
+u32 gen8_make_rpcs(const struct sseu_dev_info *sseu,
+		   struct intel_sseu ctx_sseu)
 {
 	u32 rpcs = 0;
 
@@ -2661,10 +2661,13 @@ static void execlists_init_reg_state(u32 *regs,
 	}
 
 	if (rcs) {
+		struct intel_sseu sseu = to_intel_context(ctx, engine)->sseu;
+
 		regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
 		CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
-			make_rpcs(&INTEL_INFO(dev_priv)->sseu,
-				  to_intel_context(ctx, engine)->sseu));
+			gen8_make_rpcs(&INTEL_INFO(dev_priv)->sseu,
+				       intel_engine_prepare_sseu(engine,
+								 sseu)));
 
 		i915_oa_init_reg_state(engine, ctx, regs);
 	}
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 1593194e930c..6e6b2df95780 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -104,4 +104,7 @@ struct i915_gem_context;
 
 void intel_lr_context_resume(struct drm_i915_private *dev_priv);
 
+u32 gen8_make_rpcs(const struct sseu_dev_info *sseu,
+		   struct intel_sseu ctx_sseu);
+
 #endif /* _INTEL_LRC_H_ */
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v9 6/7] drm/i915: Expose RPCS (SSEU) configuration to userspace
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
                   ` (4 preceding siblings ...)
  2018-05-30 14:33 ` [PATCH v9 5/7] drm/i915/perf: lock powergating configuration to default when active Lionel Landwerlin
@ 2018-05-30 14:33 ` Lionel Landwerlin
  2018-05-30 14:33 ` [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs Lionel Landwerlin
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Lionel Landwerlin @ 2018-05-30 14:33 UTC (permalink / raw)
  To: intel-gfx

From: Chris Wilson <chris@chris-wilson.co.uk>

We want to allow userspace to reconfigure the subslice configuration for
its own use case. To do so, we expose a context parameter to allow
adjustment of the RPCS register stored within the context image (and
currently not accessible via LRI). If the context is adjusted before
first use, the adjustment is for "free"; otherwise if the context is
active we flush the context off the GPU (stalling all users) and forcing
the GPU to save the context to memory where we can modify it and so
ensure that the register is reloaded on next execution.

The overhead of managing additional EU subslices can be significant,
especially in multi-context workloads. Non-GPGPU contexts should
preferably disable the subslices it is not using, and others should
fine-tune the number to match their workload.

We expose complete control over the RPCS register, allowing
configuration of slice/subslice, via masks packed into a u64 for
simplicity. For example,

	struct drm_i915_gem_context_param arg;
	struct drm_i915_gem_context_param_sseu sseu = { .class = 0,
	                                                .instance = 0, };

	memset(&arg, 0, sizeof(arg));
	arg.ctx_id = ctx;
	arg.param = I915_CONTEXT_PARAM_SSEU;
	arg.value = (uintptr_t) &sseu;
	if (drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &arg) == 0) {
		sseu.packed.subslice_mask = 0;

		drmIoctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &arg);
	}

could be used to disable all subslices where supported.

v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu() (Lionel)

v3: Add ability to program this per engine (Chris)

v4: Move most get_sseu() into i915_gem_context.c (Lionel)

v5: Validate sseu configuration against the device's capabilities (Lionel)

v6: Change context powergating settings through MI_SDM on kernel context (Chris)

v7: Synchronize the requests following a powergating setting change using a global
    dependency (Chris)
    Iterate timelines through dev_priv.gt.active_rings (Tvrtko)
    Disable RPCS configuration setting for non capable users (Lionel/Tvrtko)

v8: s/union intel_sseu/struct intel_sseu/ (Lionel)
    s/dev_priv/i915/ (Tvrtko)
    Change uapi class/instance fields to u16 (Tvrtko)
    Bump mask fields to 64bits (Lionel)
    Don't return EPERM when dynamic sseu is disabled (Tvrtko)

v9: Import context image into kernel context's ppgtt only when
    reconfiguring powergated slice/subslices (Chris)
    Use aliasing ppgtt when needed (Michel)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100899
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
c: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
CC: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
CC: Zhipeng Gong <zhipeng.gong@intel.com>
CC: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  13 ++
 drivers/gpu/drm/i915/i915_gem.c         |   2 +
 drivers/gpu/drm/i915/i915_gem_context.c | 176 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_request.c     |  20 +++
 drivers/gpu/drm/i915/intel_lrc.c        | 123 ++++++++++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.c |   2 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   4 +
 include/uapi/drm/i915_drm.h             |  43 ++++++
 8 files changed, 348 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 25ffadfcd04b..b2386c37437d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2066,6 +2066,12 @@ struct drm_i915_private {
 		u32 active_requests;
 		u32 request_serial;
 
+		/**
+		 * Global barrier to ensuring ordering of sseu transitions
+		 * requests.
+		 */
+		struct i915_gem_active global_barrier;
+
 		/**
 		 * Is the GPU currently considered idle, or busy executing
 		 * userspace requests? Whilst idle, we allow runtime power
@@ -3212,6 +3218,13 @@ i915_vm_to_ppgtt(struct i915_address_space *vm)
 	return container_of(vm, struct i915_hw_ppgtt, base);
 }
 
+static inline void i915_gem_set_global_barrier(struct drm_i915_private *i915,
+					       struct i915_request *rq)
+{
+	lockdep_assert_held(&i915->drm.struct_mutex);
+	i915_gem_active_set(&i915->gt.global_barrier, rq);
+}
+
 /* i915_gem_fence_reg.c */
 struct drm_i915_fence_reg *
 i915_reserve_fence(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 530d6d0109b4..bbc141337b14 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5554,6 +5554,8 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 	if (!dev_priv->priorities)
 		goto err_dependencies;
 
+	init_request_active(&dev_priv->gt.global_barrier, NULL);
+
 	INIT_LIST_HEAD(&dev_priv->gt.timelines);
 	INIT_LIST_HEAD(&dev_priv->gt.active_rings);
 	INIT_LIST_HEAD(&dev_priv->gt.closed_vma);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index ee2ee6b4e5b0..79029815c21f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -781,6 +781,79 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int
+intel_sseu_from_user_sseu(const struct sseu_dev_info *sseu,
+			  const struct drm_i915_gem_context_param_sseu *user_sseu,
+			  struct intel_sseu *ctx_sseu)
+{
+	if ((user_sseu->slice_mask & ~sseu->slice_mask) != 0 ||
+	    user_sseu->slice_mask == 0)
+		return -EINVAL;
+
+	if ((user_sseu->subslice_mask & ~sseu->subslice_mask[0]) != 0 ||
+	    user_sseu->subslice_mask == 0)
+		return -EINVAL;
+
+	if (user_sseu->min_eus_per_subslice > sseu->max_eus_per_subslice)
+		return -EINVAL;
+
+	if (user_sseu->max_eus_per_subslice > sseu->max_eus_per_subslice ||
+	    user_sseu->max_eus_per_subslice < user_sseu->min_eus_per_subslice ||
+	    user_sseu->max_eus_per_subslice == 0)
+		return -EINVAL;
+
+	ctx_sseu->slice_mask = user_sseu->slice_mask;
+	ctx_sseu->subslice_mask = user_sseu->subslice_mask;
+	ctx_sseu->min_eus_per_subslice = user_sseu->min_eus_per_subslice;
+	ctx_sseu->max_eus_per_subslice = user_sseu->max_eus_per_subslice;
+
+	return 0;
+}
+
+static int
+i915_gem_context_reconfigure_sseu(struct i915_gem_context *ctx,
+				  struct intel_engine_cs *engine,
+				  struct intel_sseu sseu)
+{
+	struct drm_i915_private *i915 = ctx->i915;
+	struct i915_request *rq;
+	struct intel_ring *ring;
+	int ret;
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	i915_retire_requests(i915);
+
+	/* Now use the RCS to actually reconfigure. */
+	engine = i915->engine[RCS];
+
+	rq = i915_request_alloc(engine, i915->kernel_context);
+	if (IS_ERR(rq))
+		return PTR_ERR(rq);
+
+	ret = engine->emit_rpcs_config(rq, ctx, sseu);
+	if (ret) {
+		__i915_request_add(rq, true);
+		return ret;
+	}
+
+	/* Queue this switch after all other activity */
+	list_for_each_entry(ring, &i915->gt.active_rings, active_link) {
+		struct i915_request *prev;
+
+		prev = last_request_on_engine(ring->timeline, engine);
+		if (prev)
+			i915_sw_fence_await_sw_fence_gfp(&rq->submit,
+							 &prev->submit,
+							 I915_FENCE_GFP);
+	}
+
+	i915_gem_set_global_barrier(i915, rq);
+	__i915_request_add(rq, true);
+
+	return 0;
+}
+
 int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file)
 {
@@ -818,6 +891,46 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	case I915_CONTEXT_PARAM_PRIORITY:
 		args->value = ctx->sched.priority;
 		break;
+	case I915_CONTEXT_PARAM_SSEU:
+		{
+			struct drm_i915_gem_context_param_sseu user_sseu;
+			struct intel_engine_cs *engine;
+			struct intel_context *ce;
+
+			if (copy_from_user(&user_sseu,
+					   u64_to_user_ptr(args->value),
+					   sizeof(user_sseu))) {
+				ret = -EFAULT;
+				break;
+			}
+
+			if (user_sseu.rsvd1 != 0 || user_sseu.rsvd2 != 0) {
+				ret = -EINVAL;
+				break;
+			}
+
+			engine = intel_engine_lookup_user(to_i915(dev),
+							  user_sseu.class,
+							  user_sseu.instance);
+			if (!engine) {
+				ret = -EINVAL;
+				break;
+			}
+
+			ce = to_intel_context(ctx, engine);
+
+			user_sseu.slice_mask = ce->sseu.slice_mask;
+			user_sseu.subslice_mask = ce->sseu.subslice_mask;
+			user_sseu.min_eus_per_subslice =
+				ce->sseu.min_eus_per_subslice;
+			user_sseu.max_eus_per_subslice =
+				ce->sseu.max_eus_per_subslice;
+
+			if (copy_to_user(u64_to_user_ptr(args->value),
+					 &user_sseu, sizeof(user_sseu)))
+				ret = -EFAULT;
+			break;
+		}
 	default:
 		ret = -EINVAL;
 		break;
@@ -892,7 +1005,70 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				ctx->sched.priority = priority;
 		}
 		break;
+	case I915_CONTEXT_PARAM_SSEU:
+		{
+			struct drm_i915_private *i915 = to_i915(dev);
+			struct drm_i915_gem_context_param_sseu user_sseu;
+			struct intel_engine_cs *engine;
+			struct intel_sseu ctx_sseu;
+			enum intel_engine_id id;
+
+			if (args->size) {
+				ret = -EINVAL;
+				break;
+			}
+
+			if (copy_from_user(&user_sseu, u64_to_user_ptr(args->value),
+					   sizeof(user_sseu))) {
+				ret = -EFAULT;
+				break;
+			}
 
+			if (user_sseu.rsvd1 != 0 || user_sseu.rsvd2 != 0) {
+				ret = -EINVAL;
+				break;
+			}
+
+			engine = intel_engine_lookup_user(i915,
+							  user_sseu.class,
+							  user_sseu.instance);
+			if (!engine) {
+				ret = -EINVAL;
+				break;
+			}
+
+			if (!engine->emit_rpcs_config) {
+				ret = -ENODEV;
+				break;
+			}
+
+			ret = intel_sseu_from_user_sseu(&INTEL_INFO(i915)->sseu,
+							&user_sseu, &ctx_sseu);
+			if (ret)
+				break;
+
+			if (memcmp(&to_intel_context(ctx, engine)->sseu,
+				   &ctx_sseu, sizeof(ctx_sseu)) != 0) {
+				DRM_ERROR("reconfiguring ctx=%p\n", ctx);
+				ret = i915_gem_context_reconfigure_sseu(ctx,
+									engine,
+									ctx_sseu);
+				if (ret)
+					break;
+			}
+
+			/*
+			 * Apply the configuration to all engine. Our hardware
+			 * doesn't currently support different configurations
+			 * for each engine.
+			 */
+			for_each_engine(engine, i915, id) {
+				struct intel_context *ce = to_intel_context(ctx, engine);
+
+				ce->sseu = ctx_sseu;
+			}
+		}
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index f187250e60c6..aa2ec0de1cb1 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -644,6 +644,22 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state)
 	return NOTIFY_DONE;
 }
 
+static int
+i915_request_await_request(struct i915_request *to, struct i915_request *from);
+
+static int add_global_barrier(struct i915_request *rq)
+{
+	struct i915_request *barrier;
+	int ret = 0;
+
+	barrier = i915_gem_active_raw(&rq->i915->gt.global_barrier,
+				      &rq->i915->drm.struct_mutex);
+	if (barrier)
+		ret = i915_request_await_request(rq, barrier);
+
+	return ret;
+}
+
 /**
  * i915_request_alloc - allocate a request structure
  *
@@ -805,6 +821,10 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 */
 	rq->head = rq->ring->emit;
 
+	ret = add_global_barrier(rq);
+	if (ret)
+		goto err_unwind;
+
 	/* Unconditionally invalidate GPU caches and TLBs. */
 	ret = engine->emit_flush(rq, EMIT_INVALIDATE);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7314e548fb4e..8605a69ded0a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2271,6 +2271,92 @@ static void gen8_emit_breadcrumb_rcs(struct i915_request *request, u32 *cs)
 }
 static const int gen8_emit_breadcrumb_rcs_sz = 8 + WA_TAIL_DWORDS;
 
+u32 gen8_make_rpcs(const struct sseu_dev_info *sseu,
+		   struct intel_sseu ctx_sseu)
+{
+	u32 rpcs = 0;
+
+	/*
+	 * Starting in Gen9, render power gating can leave
+	 * slice/subslice/EU in a partially enabled state. We
+	 * must make an explicit request through RPCS for full
+	 * enablement.
+	 */
+	if (sseu->has_slice_pg) {
+		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
+		rpcs |= hweight8(ctx_sseu.slice_mask) << GEN8_RPCS_S_CNT_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	if (sseu->has_subslice_pg) {
+		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
+		rpcs |= hweight8(ctx_sseu.subslice_mask) <<
+			GEN8_RPCS_SS_CNT_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	if (sseu->has_eu_pg) {
+		rpcs |= ctx_sseu.min_eus_per_subslice <<
+			GEN8_RPCS_EU_MIN_SHIFT;
+		rpcs |= ctx_sseu.max_eus_per_subslice <<
+			GEN8_RPCS_EU_MAX_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	return rpcs;
+}
+
+static int gen8_emit_rpcs_config(struct i915_request *rq,
+				 struct i915_gem_context *ctx,
+				 struct intel_sseu sseu)
+{
+	struct drm_i915_private *i915 = rq->i915;
+	struct intel_context *ce = to_intel_context(ctx, i915->engine[RCS]);
+	struct i915_hw_ppgtt *ppgtt;
+	struct i915_vma *vma;
+	u64 offset;
+	u32 *cs;
+	int err;
+
+	/* Let the deferred state allocation take care of this. */
+	if (!ce->state)
+		return 0;
+
+	ppgtt = ctx->i915->kernel_context->ppgtt ?:
+		ctx->i915->mm.aliasing_ppgtt;
+
+	vma = i915_vma_instance(ce->state->obj, &ppgtt->base, NULL);
+	if (IS_ERR(vma))
+		return PTR_ERR(vma);
+
+	err = i915_vma_pin(vma, 0, GEN8_LR_CONTEXT_ALIGN, PIN_USER);
+	if (err) {
+		i915_vma_close(vma);
+		return err;
+	}
+
+	i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
+	i915_vma_unpin(vma);
+
+	cs = intel_ring_begin(rq, 4);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	offset = vma->node.start +
+		LRC_STATE_PN * PAGE_SIZE +
+		(CTX_R_PWR_CLK_STATE + 1) * 4;
+
+	*cs++ = MI_STORE_DWORD_IMM_GEN4;
+	*cs++ = lower_32_bits(offset);
+	*cs++ = upper_32_bits(offset);
+	*cs++ = gen8_make_rpcs(&INTEL_INFO(i915)->sseu,
+			       intel_engine_prepare_sseu(rq->engine, sseu));
+
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
+
 static int gen8_init_rcs_context(struct i915_request *rq)
 {
 	int ret;
@@ -2364,6 +2450,8 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
 	engine->emit_breadcrumb = gen8_emit_breadcrumb;
 	engine->emit_breadcrumb_sz = gen8_emit_breadcrumb_sz;
 
+	engine->emit_rpcs_config = gen8_emit_rpcs_config;
+
 	engine->set_default_submission = execlists_set_default_submission;
 
 	if (INTEL_GEN(engine->i915) < 11) {
@@ -2512,41 +2600,6 @@ int logical_xcs_ring_init(struct intel_engine_cs *engine)
 	return logical_ring_init(engine);
 }
 
-u32 gen8_make_rpcs(const struct sseu_dev_info *sseu,
-		   struct intel_sseu ctx_sseu)
-{
-	u32 rpcs = 0;
-
-	/*
-	 * Starting in Gen9, render power gating can leave
-	 * slice/subslice/EU in a partially enabled state. We
-	 * must make an explicit request through RPCS for full
-	 * enablement.
-	*/
-	if (sseu->has_slice_pg) {
-		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
-		rpcs |= hweight8(ctx_sseu.slice_mask) << GEN8_RPCS_S_CNT_SHIFT;
-		rpcs |= GEN8_RPCS_ENABLE;
-	}
-
-	if (sseu->has_subslice_pg) {
-		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
-		rpcs |= hweight8(ctx_sseu.subslice_mask) <<
-			GEN8_RPCS_SS_CNT_SHIFT;
-		rpcs |= GEN8_RPCS_ENABLE;
-	}
-
-	if (sseu->has_eu_pg) {
-		rpcs |= ctx_sseu.min_eus_per_subslice <<
-			GEN8_RPCS_EU_MIN_SHIFT;
-		rpcs |= ctx_sseu.max_eus_per_subslice <<
-			GEN8_RPCS_EU_MAX_SHIFT;
-		rpcs |= GEN8_RPCS_ENABLE;
-	}
-
-	return rpcs;
-}
-
 static u32 intel_lr_indirect_ctx_offset(struct intel_engine_cs *engine)
 {
 	u32 indirect_ctx_offset;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 97b38bbb7ce2..50616d0190e9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2063,6 +2063,8 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 			engine->emit_breadcrumb_sz++;
 	}
 
+	engine->emit_rpcs_config = NULL; /* Only supported on Gen8+ */
+
 	engine->set_default_submission = i9xx_set_default_submission;
 
 	if (INTEL_GEN(dev_priv) >= 6)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index acef385c4c80..ab75afdc6fd6 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -456,6 +456,10 @@ struct intel_engine_cs {
 	void		(*emit_breadcrumb)(struct i915_request *rq, u32 *cs);
 	int		emit_breadcrumb_sz;
 
+	int		(*emit_rpcs_config)(struct i915_request *rq,
+					    struct i915_gem_context *ctx,
+					    struct intel_sseu sseu);
+
 	/* Pass the request to the hardware queue (e.g. directly into
 	 * the legacy ringbuffer or to the end of an execlist).
 	 *
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7f5634ce8e88..21df158056a2 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1456,9 +1456,52 @@ struct drm_i915_gem_context_param {
 #define   I915_CONTEXT_MAX_USER_PRIORITY	1023 /* inclusive */
 #define   I915_CONTEXT_DEFAULT_PRIORITY		0
 #define   I915_CONTEXT_MIN_USER_PRIORITY	-1023 /* inclusive */
+	/*
+	 * When using the following param, value should be a pointer to
+	 * drm_i915_gem_context_param_sseu.
+	 */
+#define I915_CONTEXT_PARAM_SSEU		0x7
 	__u64 value;
 };
 
+struct drm_i915_gem_context_param_sseu {
+	/*
+	 * Engine class & instance to be configured or queried.
+	 */
+	__u16 class;
+	__u16 instance;
+
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 rsvd1;
+
+	/*
+	 * Mask of slices to enable for the context. Valid values are a subset
+	 * of the bitmask value returned for I915_PARAM_SLICE_MASK.
+	 */
+	__u64 slice_mask;
+
+	/*
+	 * Mask of subslices to enable for the context. Valid values are a
+	 * subset of the bitmask value return by I915_PARAM_SUBSLICE_MASK.
+	 */
+	__u64 subslice_mask;
+
+	/*
+	 * Minimum/Maximum number of EUs to enable per subslice for the
+	 * context. min_eus_per_subslice must be inferior or equal to
+	 * max_eus_per_subslice.
+	 */
+	__u16 min_eus_per_subslice;
+	__u16 max_eus_per_subslice;
+
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 rsvd2;
+};
+
 enum drm_i915_oa_format {
 	I915_OA_FORMAT_A13 = 1,	    /* HSW only */
 	I915_OA_FORMAT_A29,	    /* HSW only */
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
                   ` (5 preceding siblings ...)
  2018-05-30 14:33 ` [PATCH v9 6/7] drm/i915: Expose RPCS (SSEU) configuration to userspace Lionel Landwerlin
@ 2018-05-30 14:33 ` Lionel Landwerlin
  2018-06-11 12:10   ` Tvrtko Ursulin
  2018-05-30 16:24 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: per context slice/subslice powergating (rev8) Patchwork
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: Lionel Landwerlin @ 2018-05-30 14:33 UTC (permalink / raw)
  To: intel-gfx

There are concerns about denial of service around the per context sseu
configuration capability. In a previous commit introducing the
capability we allowed it only for capable users. This changes adds a
new debugfs entry to let any user configure its own context
powergating setup.

v2: Rename sysfs entry (Tvrtko)
    Lock interruptible the device in sysfs (Tvrtko)
    Fix dropped error code in getting dynamic sseu value (Tvrtko)
    s/dev_priv/i915/ (Tvrtko)

v3: Drop locked function suffix (Tvrtko)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 16 +++++++---
 drivers/gpu/drm/i915/i915_gem_context.c | 38 +++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_sysfs.c       | 40 +++++++++++++++++++++++++
 3 files changed, 90 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b2386c37437d..5aa96a6650b4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1842,6 +1842,8 @@ struct drm_i915_private {
 		struct ida hw_ida;
 #define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
 #define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
+
+		bool dynamic_sseu;
 	} contexts;
 
 	u32 fdi_rx_config;
@@ -3259,6 +3261,10 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 	return ctx;
 }
 
+int i915_gem_contexts_set_dynamic_sseu(struct drm_i915_private *i915,
+				       bool allowed);
+bool i915_gem_contexts_get_dynamic_sseu(struct drm_i915_private *i915);
+
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file);
 int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
@@ -3859,11 +3865,13 @@ intel_engine_prepare_sseu(struct intel_engine_cs *engine,
 	struct drm_i915_private *i915 = engine->i915;
 
 	/*
-	 * If i915/perf is active, we want a stable powergating configuration
-	 * on the system. The most natural configuration to take in that case
-	 * is the default (i.e maximum the hardware can do).
+	 * If i915/perf is active or dynamic sseu configuration is not allowed
+	 * (through sysfs), we want a stable powergating configuration on the
+	 * system. The most natural configuration to take in that case is the
+	 * default (i.e maximum the hardware can do).
 	 */
-	return i915->perf.oa.exclusive_stream ?
+	return (i915->perf.oa.exclusive_stream ||
+		!i915->contexts.dynamic_sseu) ?
 		intel_device_default_sseu(i915) : sseu;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 79029815c21f..453bdc976be3 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -1118,6 +1118,44 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 	return ret;
 }
 
+int i915_gem_contexts_set_dynamic_sseu(struct drm_i915_private *i915,
+				       bool allowed)
+{
+	struct intel_engine_cs *engine = i915->engine[RCS];
+	struct i915_gem_context *ctx;
+	int ret = 0;
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	if (!engine->emit_rpcs_config)
+		return -ENODEV;
+
+	if (allowed == i915->contexts.dynamic_sseu)
+		return 0;
+
+	i915->contexts.dynamic_sseu = allowed;
+
+	list_for_each_entry(ctx, &i915->contexts.list, link) {
+		struct intel_context *ce = to_intel_context(ctx, engine);
+
+		ret = i915_gem_context_reconfigure_sseu(ctx, engine, ce->sseu);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+bool i915_gem_contexts_get_dynamic_sseu(struct drm_i915_private *i915)
+{
+	struct intel_engine_cs *engine = i915->engine[RCS];
+
+	if (!engine->emit_rpcs_config)
+		return false;
+
+	return i915->contexts.dynamic_sseu;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/mock_context.c"
 #include "selftests/i915_gem_context.c"
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index e5e6f6bb2b05..d895054d245e 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -483,6 +483,44 @@ static ssize_t gt_rp_mhz_show(struct device *kdev, struct device_attribute *attr
 	return snprintf(buf, PAGE_SIZE, "%d\n", val);
 }
 
+static ssize_t allow_dynamic_sseu_show(struct device *kdev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct drm_i915_private *i915 = kdev_minor_to_i915(kdev);
+	bool value = i915_gem_contexts_get_dynamic_sseu(i915);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", value);
+}
+
+static ssize_t allow_dynamic_sseu_store(struct device *kdev,
+					struct device_attribute *attr,
+					const char *buf, size_t count)
+{
+	struct drm_i915_private *i915 = kdev_minor_to_i915(kdev);
+	ssize_t ret;
+	u32 val;
+
+	ret = kstrtou32(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	if (val != 0 && val != 1)
+		return -EINVAL;
+
+	ret = mutex_lock_interruptible(&i915->drm.struct_mutex);
+	if (ret)
+		return ret;
+
+	ret = i915_gem_contexts_set_dynamic_sseu(i915, val != 0);
+
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	return ret ?: count;
+}
+
+static DEVICE_ATTR_RW(allow_dynamic_sseu);
+
 static const struct attribute *gen6_attrs[] = {
 	&dev_attr_gt_act_freq_mhz.attr,
 	&dev_attr_gt_cur_freq_mhz.attr,
@@ -492,6 +530,7 @@ static const struct attribute *gen6_attrs[] = {
 	&dev_attr_gt_RP0_freq_mhz.attr,
 	&dev_attr_gt_RP1_freq_mhz.attr,
 	&dev_attr_gt_RPn_freq_mhz.attr,
+	&dev_attr_allow_dynamic_sseu.attr,
 	NULL,
 };
 
@@ -505,6 +544,7 @@ static const struct attribute *vlv_attrs[] = {
 	&dev_attr_gt_RP1_freq_mhz.attr,
 	&dev_attr_gt_RPn_freq_mhz.attr,
 	&dev_attr_vlv_rpe_freq_mhz.attr,
+	&dev_attr_allow_dynamic_sseu.attr,
 	NULL,
 };
 
-- 
2.17.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* ✗ Fi.CI.CHECKPATCH: warning for drm/i915: per context slice/subslice powergating (rev8)
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
                   ` (6 preceding siblings ...)
  2018-05-30 14:33 ` [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs Lionel Landwerlin
@ 2018-05-30 16:24 ` Patchwork
  2018-05-30 16:26 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2018-05-30 16:24 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: per context slice/subslice powergating (rev8)
URL   : https://patchwork.freedesktop.org/series/42285/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
11a6079c0062 drm/i915: Program RPCS for Broadwell
bc248ff938fc drm/i915: Record the sseu configuration per-context & engine
76d7453744fb drm/i915/perf: simplify configure all context function
4c85932001c4 drm/i915/perf: reuse intel_lrc ctx regs macro
211bc25308d6 drm/i915/perf: lock powergating configuration to default when active
26456fb5413e drm/i915: Expose RPCS (SSEU) configuration to userspace
-:40: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#40: 
v2: Fix offset of CTX_R_PWR_CLK_STATE in intel_lr_context_set_sseu() (Lionel)

total: 0 errors, 1 warnings, 0 checks, 471 lines checked
ab350a0f9e28 drm/i915: add a sysfs entry to let users set sseu configs

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* ✗ Fi.CI.SPARSE: warning for drm/i915: per context slice/subslice powergating (rev8)
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
                   ` (7 preceding siblings ...)
  2018-05-30 16:24 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: per context slice/subslice powergating (rev8) Patchwork
@ 2018-05-30 16:26 ` Patchwork
  2018-05-30 16:40 ` ✓ Fi.CI.BAT: success " Patchwork
  2018-05-30 17:30 ` ✗ Fi.CI.IGT: failure " Patchwork
  10 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2018-05-30 16:26 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: per context slice/subslice powergating (rev8)
URL   : https://patchwork.freedesktop.org/series/42285/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Commit: drm/i915: Program RPCS for Broadwell
Okay!

Commit: drm/i915: Record the sseu configuration per-context & engine
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3664:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3678:16: warning: expression using sizeof(void)

Commit: drm/i915/perf: simplify configure all context function
Okay!

Commit: drm/i915/perf: reuse intel_lrc ctx regs macro
Okay!

Commit: drm/i915/perf: lock powergating configuration to default when active
Okay!

Commit: drm/i915: Expose RPCS (SSEU) configuration to userspace
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3678:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3691:16: warning: expression using sizeof(void)

Commit: drm/i915: add a sysfs entry to let users set sseu configs
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3691:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3697:16: warning: expression using sizeof(void)

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* ✓ Fi.CI.BAT: success for drm/i915: per context slice/subslice powergating (rev8)
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
                   ` (8 preceding siblings ...)
  2018-05-30 16:26 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2018-05-30 16:40 ` Patchwork
  2018-05-30 17:30 ` ✗ Fi.CI.IGT: failure " Patchwork
  10 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2018-05-30 16:40 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: per context slice/subslice powergating (rev8)
URL   : https://patchwork.freedesktop.org/series/42285/
State : success

== Summary ==

= CI Bug Log - changes from CI_DRM_4257 -> Patchwork_9151 =

== Summary - SUCCESS ==

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/42285/revisions/8/mbox/

== Known issues ==

  Here are the changes found in Patchwork_9151 that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@kms_flip@basic-flip-vs-dpms:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#106000)

    igt@kms_flip@basic-flip-vs-modeset:
      fi-glk-j4005:       PASS -> DMESG-WARN (fdo#106097, fdo#106000)

    igt@kms_flip@basic-flip-vs-wf_vblank:
      fi-glk-j4005:       PASS -> FAIL (fdo#100368)

    
    ==== Possible fixes ====

    igt@kms_flip@basic-flip-vs-wf_vblank:
      fi-cnl-psr:         FAIL (fdo#100368) -> PASS

    igt@kms_frontbuffer_tracking@basic:
      fi-bsw-n3050:       INCOMPLETE (fdo#106729) -> PASS

    igt@kms_pipe_crc_basic@nonblocking-crc-pipe-c:
      fi-skl-guc:         FAIL (fdo#104724, fdo#103191) -> PASS

    igt@kms_pipe_crc_basic@read-crc-pipe-c:
      fi-glk-j4005:       DMESG-WARN (fdo#106097, fdo#106000) -> PASS

    igt@prime_vgem@basic-fence-flip:
      fi-ilk-650:         FAIL (fdo#104008) -> PASS

    
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#103191 https://bugs.freedesktop.org/show_bug.cgi?id=103191
  fdo#104008 https://bugs.freedesktop.org/show_bug.cgi?id=104008
  fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
  fdo#106000 https://bugs.freedesktop.org/show_bug.cgi?id=106000
  fdo#106097 https://bugs.freedesktop.org/show_bug.cgi?id=106097
  fdo#106729 https://bugs.freedesktop.org/show_bug.cgi?id=106729


== Participating hosts (45 -> 39) ==

  Missing    (6): fi-ilk-m540 fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-cfl-u2 fi-skl-6700hq 


== Build changes ==

    * Linux: CI_DRM_4257 -> Patchwork_9151

  CI_DRM_4257: 8aac35d26057479982a346c0e9cd57c2e930b7e1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4501: 6796a604bab6df9c84af149e799902360afdd157 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9151: ab350a0f9e28cf98c5c305fc7dd273bb35e68c5b @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ab350a0f9e28 drm/i915: add a sysfs entry to let users set sseu configs
26456fb5413e drm/i915: Expose RPCS (SSEU) configuration to userspace
211bc25308d6 drm/i915/perf: lock powergating configuration to default when active
4c85932001c4 drm/i915/perf: reuse intel_lrc ctx regs macro
76d7453744fb drm/i915/perf: simplify configure all context function
bc248ff938fc drm/i915: Record the sseu configuration per-context & engine
11a6079c0062 drm/i915: Program RPCS for Broadwell

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9151/issues.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* ✗ Fi.CI.IGT: failure for drm/i915: per context slice/subslice powergating (rev8)
  2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
                   ` (9 preceding siblings ...)
  2018-05-30 16:40 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-05-30 17:30 ` Patchwork
  10 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2018-05-30 17:30 UTC (permalink / raw)
  To: Lionel Landwerlin; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: per context slice/subslice powergating (rev8)
URL   : https://patchwork.freedesktop.org/series/42285/
State : failure

== Summary ==

= CI Bug Log - changes from CI_DRM_4257_full -> Patchwork_9151_full =

== Summary - FAILURE ==

  Serious unknown changes coming with Patchwork_9151_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_9151_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/42285/revisions/8/mbox/

== Possible new issues ==

  Here are the unknown changes that may have been introduced in Patchwork_9151_full:

  === IGT changes ===

    ==== Possible regressions ====

    igt@gem_ctx_param@invalid-param-get:
      shard-apl:          PASS -> FAIL
      shard-glk:          PASS -> FAIL
      shard-snb:          PASS -> FAIL
      shard-hsw:          PASS -> FAIL
      shard-kbl:          PASS -> FAIL

    igt@gem_ctx_param@invalid-param-set:
      shard-kbl:          PASS -> DMESG-FAIL
      shard-hsw:          PASS -> DMESG-FAIL
      shard-snb:          PASS -> DMESG-FAIL +1
      shard-glk:          PASS -> DMESG-FAIL
      shard-apl:          PASS -> DMESG-FAIL

    
== Known issues ==

  Here are the changes found in Patchwork_9151_full that come from known issues:

  === IGT changes ===

    ==== Issues hit ====

    igt@drv_selftest@live_gtt:
      shard-kbl:          PASS -> INCOMPLETE (fdo#103665)

    igt@kms_flip@2x-flip-vs-expired-vblank-interruptible:
      shard-glk:          PASS -> FAIL (fdo#105363)

    igt@kms_flip@flip-vs-panning-vs-hang-interruptible:
      shard-snb:          PASS -> DMESG-WARN (fdo#103821)

    igt@kms_flip@flip-vs-wf_vblank-interruptible:
      shard-hsw:          PASS -> FAIL (fdo#100368)

    igt@kms_flip@modeset-vs-vblank-race:
      shard-glk:          PASS -> FAIL (fdo#103060)

    igt@kms_flip_tiling@flip-x-tiled:
      shard-glk:          PASS -> FAIL (fdo#103822, fdo#104724)

    igt@perf_pmu@render-node-busy-idle-vcs0:
      shard-snb:          PASS -> FAIL (fdo#105286)

    
    ==== Possible fixes ====

    igt@kms_flip@flip-vs-expired-vblank-interruptible:
      shard-glk:          FAIL (fdo#105363) -> PASS

    igt@kms_flip@plain-flip-ts-check:
      shard-glk:          FAIL (fdo#100368) -> PASS

    igt@perf@blocking:
      shard-hsw:          FAIL (fdo#102252) -> PASS

    igt@perf_pmu@rc6-runtime-pm:
      shard-kbl:          FAIL (fdo#105010) -> PASS

    
  fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
  fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
  fdo#103060 https://bugs.freedesktop.org/show_bug.cgi?id=103060
  fdo#103665 https://bugs.freedesktop.org/show_bug.cgi?id=103665
  fdo#103821 https://bugs.freedesktop.org/show_bug.cgi?id=103821
  fdo#103822 https://bugs.freedesktop.org/show_bug.cgi?id=103822
  fdo#104724 https://bugs.freedesktop.org/show_bug.cgi?id=104724
  fdo#105010 https://bugs.freedesktop.org/show_bug.cgi?id=105010
  fdo#105286 https://bugs.freedesktop.org/show_bug.cgi?id=105286
  fdo#105363 https://bugs.freedesktop.org/show_bug.cgi?id=105363


== Participating hosts (5 -> 5) ==

  No changes in participating hosts


== Build changes ==

    * Linux: CI_DRM_4257 -> Patchwork_9151

  CI_DRM_4257: 8aac35d26057479982a346c0e9cd57c2e930b7e1 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4501: 6796a604bab6df9c84af149e799902360afdd157 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_9151: ab350a0f9e28cf98c5c305fc7dd273bb35e68c5b @ git://anongit.freedesktop.org/gfx-ci/linux

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9151/shards.html
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-05-30 14:33 ` [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs Lionel Landwerlin
@ 2018-06-11 12:10   ` Tvrtko Ursulin
  2018-06-11 13:46     ` Lionel Landwerlin
  0 siblings, 1 reply; 29+ messages in thread
From: Tvrtko Ursulin @ 2018-06-11 12:10 UTC (permalink / raw)
  To: Lionel Landwerlin, intel-gfx


On 30/05/2018 15:33, Lionel Landwerlin wrote:
> There are concerns about denial of service around the per context sseu
> configuration capability. In a previous commit introducing the
> capability we allowed it only for capable users. This changes adds a
> new debugfs entry to let any user configure its own context
> powergating setup.

As far as I understood it, Joonas' concerns here are:

1) That in the containers use case individual containers wouldn't be 
able to turn on the sysfs toggle for them.

2) That also in the containers use case if box admin turns on the 
feature, some containers would potentially start negatively affecting 
the others (via the accumulated cost of slice re-configuration on 
context switching).

I am not familiar with typical container setups to be authoritative 
here, but intuitively I find it reasonable that a low-level hardware 
switch like this would be under the control of a master domain 
administrator. ("If you are installing our product in the container 
environment, make sure your system administrator enables this hardware 
feature.", "Note to system administrators: Enabling this features may 
negatively affect the performance of other containers.")

Alternative proposal is for the i915 to apply an "or" filter on all 
requested masks and in that way ensure dynamic re-configuration doesn't 
happen on context switches, but driven from userspace via ioctls.

In other words, should _all_ userspace agree between themselves that 
they want to turn off a slice, they would then need to send out a 
concerted ioctl storm, where number of needed ioctls equals the number 
of currently active contexts. (This may have its own performance 
consequences caused by the barriers needed to modify all context images.)

This was deemed acceptable the the media use case, but my concern is the 
approach is not elegant and will tie us with the "or" policy in the ABI. 
(Performance concerns I haven't evaluated yet, but they also may be 
significant.)

If we go back thinking about the containers use case, then it transpires 
that even though the "or" policy does prevent one container from 
affecting the other from one angle, it also prevents one container from 
exercising the feature unless all containers co-operate.

As such, we can view the original problem statement where we have an 
issue if not everyone co-operates, as conceptually the same just from an 
opposite angle. (Rather than one container incurring the increased cost 
of context switches to the rest, we would have one container preventing 
the optimized slice configuration to the other.)

 From this follows that both proposals require complete co-operation 
from all running userspace to avoid complete control of the feature.

Since the balance between the benefit of optimized slice configuration 
(or penalty of suboptimal one), versus the penalty of increased context 
switch times, cannot be know by the driver (barring venturing into the 
heuristics territory), that is another reason why I find the "or" policy 
in the driver questionable.

We can also ask a question of - If we go with the "or" policy, why 
require N per-context ioctls to modify the global GPU configuration and 
not instead add a global driver ioctl to modify the state?

If a future hardware requires, or enables, the per-context behaviour in 
a more efficient way, we could then revisit the problem space.

In the mean time I see the "or" policy solution as adding some ABI which 
doesn't do anything for many use cases without any way for the sysadmin 
to enable it. At the same time master sysfs knob at least enables the 
sysadmin to make a decision. Here I am thinking about a random client 
environment where not all userspace co-operates, but for instance user 
is running the feature aware media stack, and non-feature aware 
OpenCL/3d stack.

I guess the complete story boils down to - is the master sysfs knob 
really a problem in container use cases.

Regards,

Tvrtko

> 
> v2: Rename sysfs entry (Tvrtko)
>      Lock interruptible the device in sysfs (Tvrtko)
>      Fix dropped error code in getting dynamic sseu value (Tvrtko)
>      s/dev_priv/i915/ (Tvrtko)
> 
> v3: Drop locked function suffix (Tvrtko)
> 
> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         | 16 +++++++---
>   drivers/gpu/drm/i915/i915_gem_context.c | 38 +++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_sysfs.c       | 40 +++++++++++++++++++++++++
>   3 files changed, 90 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b2386c37437d..5aa96a6650b4 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1842,6 +1842,8 @@ struct drm_i915_private {
>   		struct ida hw_ida;
>   #define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
>   #define GEN11_MAX_CONTEXT_HW_ID (1<<11) /* exclusive */
> +
> +		bool dynamic_sseu;
>   	} contexts;
>   
>   	u32 fdi_rx_config;
> @@ -3259,6 +3261,10 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
>   	return ctx;
>   }
>   
> +int i915_gem_contexts_set_dynamic_sseu(struct drm_i915_private *i915,
> +				       bool allowed);
> +bool i915_gem_contexts_get_dynamic_sseu(struct drm_i915_private *i915);
> +
>   int i915_perf_open_ioctl(struct drm_device *dev, void *data,
>   			 struct drm_file *file);
>   int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
> @@ -3859,11 +3865,13 @@ intel_engine_prepare_sseu(struct intel_engine_cs *engine,
>   	struct drm_i915_private *i915 = engine->i915;
>   
>   	/*
> -	 * If i915/perf is active, we want a stable powergating configuration
> -	 * on the system. The most natural configuration to take in that case
> -	 * is the default (i.e maximum the hardware can do).
> +	 * If i915/perf is active or dynamic sseu configuration is not allowed
> +	 * (through sysfs), we want a stable powergating configuration on the
> +	 * system. The most natural configuration to take in that case is the
> +	 * default (i.e maximum the hardware can do).
>   	 */
> -	return i915->perf.oa.exclusive_stream ?
> +	return (i915->perf.oa.exclusive_stream ||
> +		!i915->contexts.dynamic_sseu) ?
>   		intel_device_default_sseu(i915) : sseu;
>   }
>   
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 79029815c21f..453bdc976be3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -1118,6 +1118,44 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
>   	return ret;
>   }
>   
> +int i915_gem_contexts_set_dynamic_sseu(struct drm_i915_private *i915,
> +				       bool allowed)
> +{
> +	struct intel_engine_cs *engine = i915->engine[RCS];
> +	struct i915_gem_context *ctx;
> +	int ret = 0;
> +
> +	lockdep_assert_held(&i915->drm.struct_mutex);
> +
> +	if (!engine->emit_rpcs_config)
> +		return -ENODEV;
> +
> +	if (allowed == i915->contexts.dynamic_sseu)
> +		return 0;
> +
> +	i915->contexts.dynamic_sseu = allowed;
> +
> +	list_for_each_entry(ctx, &i915->contexts.list, link) {
> +		struct intel_context *ce = to_intel_context(ctx, engine);
> +
> +		ret = i915_gem_context_reconfigure_sseu(ctx, engine, ce->sseu);
> +		if (ret)
> +			break;
> +	}
> +
> +	return ret;
> +}
> +
> +bool i915_gem_contexts_get_dynamic_sseu(struct drm_i915_private *i915)
> +{
> +	struct intel_engine_cs *engine = i915->engine[RCS];
> +
> +	if (!engine->emit_rpcs_config)
> +		return false;
> +
> +	return i915->contexts.dynamic_sseu;
> +}
> +
>   #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
>   #include "selftests/mock_context.c"
>   #include "selftests/i915_gem_context.c"
> diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
> index e5e6f6bb2b05..d895054d245e 100644
> --- a/drivers/gpu/drm/i915/i915_sysfs.c
> +++ b/drivers/gpu/drm/i915/i915_sysfs.c
> @@ -483,6 +483,44 @@ static ssize_t gt_rp_mhz_show(struct device *kdev, struct device_attribute *attr
>   	return snprintf(buf, PAGE_SIZE, "%d\n", val);
>   }
>   
> +static ssize_t allow_dynamic_sseu_show(struct device *kdev,
> +				       struct device_attribute *attr,
> +				       char *buf)
> +{
> +	struct drm_i915_private *i915 = kdev_minor_to_i915(kdev);
> +	bool value = i915_gem_contexts_get_dynamic_sseu(i915);
> +
> +	return snprintf(buf, PAGE_SIZE, "%d\n", value);
> +}
> +
> +static ssize_t allow_dynamic_sseu_store(struct device *kdev,
> +					struct device_attribute *attr,
> +					const char *buf, size_t count)
> +{
> +	struct drm_i915_private *i915 = kdev_minor_to_i915(kdev);
> +	ssize_t ret;
> +	u32 val;
> +
> +	ret = kstrtou32(buf, 0, &val);
> +	if (ret)
> +		return ret;
> +
> +	if (val != 0 && val != 1)
> +		return -EINVAL;
> +
> +	ret = mutex_lock_interruptible(&i915->drm.struct_mutex);
> +	if (ret)
> +		return ret;
> +
> +	ret = i915_gem_contexts_set_dynamic_sseu(i915, val != 0);
> +
> +	mutex_unlock(&i915->drm.struct_mutex);
> +
> +	return ret ?: count;
> +}
> +
> +static DEVICE_ATTR_RW(allow_dynamic_sseu);
> +
>   static const struct attribute *gen6_attrs[] = {
>   	&dev_attr_gt_act_freq_mhz.attr,
>   	&dev_attr_gt_cur_freq_mhz.attr,
> @@ -492,6 +530,7 @@ static const struct attribute *gen6_attrs[] = {
>   	&dev_attr_gt_RP0_freq_mhz.attr,
>   	&dev_attr_gt_RP1_freq_mhz.attr,
>   	&dev_attr_gt_RPn_freq_mhz.attr,
> +	&dev_attr_allow_dynamic_sseu.attr,
>   	NULL,
>   };
>   
> @@ -505,6 +544,7 @@ static const struct attribute *vlv_attrs[] = {
>   	&dev_attr_gt_RP1_freq_mhz.attr,
>   	&dev_attr_gt_RPn_freq_mhz.attr,
>   	&dev_attr_vlv_rpe_freq_mhz.attr,
> +	&dev_attr_allow_dynamic_sseu.attr,
>   	NULL,
>   };
>   
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-11 12:10   ` Tvrtko Ursulin
@ 2018-06-11 13:46     ` Lionel Landwerlin
  2018-06-11 15:02       ` Chris Wilson
  0 siblings, 1 reply; 29+ messages in thread
From: Lionel Landwerlin @ 2018-06-11 13:46 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

On 11/06/18 13:10, Tvrtko Ursulin wrote:
>
> On 30/05/2018 15:33, Lionel Landwerlin wrote:
>> There are concerns about denial of service around the per context sseu
>> configuration capability. In a previous commit introducing the
>> capability we allowed it only for capable users. This changes adds a
>> new debugfs entry to let any user configure its own context
>> powergating setup.
>
> As far as I understood it, Joonas' concerns here are:
>
> 1) That in the containers use case individual containers wouldn't be 
> able to turn on the sysfs toggle for them.
>
> 2) That also in the containers use case if box admin turns on the 
> feature, some containers would potentially start negatively affecting 
> the others (via the accumulated cost of slice re-configuration on 
> context switching).
>
> I am not familiar with typical container setups to be authoritative 
> here, but intuitively I find it reasonable that a low-level hardware 
> switch like this would be under the control of a master domain 
> administrator. ("If you are installing our product in the container 
> environment, make sure your system administrator enables this hardware 
> feature.", "Note to system administrators: Enabling this features may 
> negatively affect the performance of other containers.")
>
> Alternative proposal is for the i915 to apply an "or" filter on all 
> requested masks and in that way ensure dynamic re-configuration 
> doesn't happen on context switches, but driven from userspace via ioctls.
>
> In other words, should _all_ userspace agree between themselves that 
> they want to turn off a slice, they would then need to send out a 
> concerted ioctl storm, where number of needed ioctls equals the number 
> of currently active contexts. (This may have its own performance 
> consequences caused by the barriers needed to modify all context images.)
>
> This was deemed acceptable the the media use case, but my concern is 
> the approach is not elegant and will tie us with the "or" policy in 
> the ABI. (Performance concerns I haven't evaluated yet, but they also 
> may be significant.)
>
> If we go back thinking about the containers use case, then it 
> transpires that even though the "or" policy does prevent one container 
> from affecting the other from one angle, it also prevents one 
> container from exercising the feature unless all containers co-operate.
>
> As such, we can view the original problem statement where we have an 
> issue if not everyone co-operates, as conceptually the same just from 
> an opposite angle. (Rather than one container incurring the increased 
> cost of context switches to the rest, we would have one container 
> preventing the optimized slice configuration to the other.)
>
> From this follows that both proposals require complete co-operation 
> from all running userspace to avoid complete control of the feature.
>
> Since the balance between the benefit of optimized slice configuration 
> (or penalty of suboptimal one), versus the penalty of increased 
> context switch times, cannot be know by the driver (barring venturing 
> into the heuristics territory), that is another reason why I find the 
> "or" policy in the driver questionable.
>
> We can also ask a question of - If we go with the "or" policy, why 
> require N per-context ioctls to modify the global GPU configuration 
> and not instead add a global driver ioctl to modify the state?
>
> If a future hardware requires, or enables, the per-context behaviour 
> in a more efficient way, we could then revisit the problem space.
>
> In the mean time I see the "or" policy solution as adding some ABI 
> which doesn't do anything for many use cases without any way for the 
> sysadmin to enable it. At the same time master sysfs knob at least 
> enables the sysadmin to make a decision. Here I am thinking about a 
> random client environment where not all userspace co-operates, but for 
> instance user is running the feature aware media stack, and 
> non-feature aware OpenCL/3d stack.
>
> I guess the complete story boils down to - is the master sysfs knob 
> really a problem in container use cases.
>
> Regards,
>
> Tvrtko 

Hey Tvrtko,

Thanks for summarizing a bunch of discussions.
Essentially I agree with every you wrote above.

If we have a global setting (determined by the OR policy), what's the 
point of per context settings?

In Dmitry's scenario, all userspace applications will work together to 
reach the consensus so it sounds like we're reimplementing the policy 
that is already existing in userspace.

Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else 
than me pick one or the other :)

Cheers,

-
Lionel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-11 13:46     ` Lionel Landwerlin
@ 2018-06-11 15:02       ` Chris Wilson
  2018-06-12  9:20         ` Joonas Lahtinen
  0 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2018-06-11 15:02 UTC (permalink / raw)
  To: Lionel Landwerlin, Tvrtko Ursulin, intel-gfx

Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> On 11/06/18 13:10, Tvrtko Ursulin wrote:
> >
> > On 30/05/2018 15:33, Lionel Landwerlin wrote:
> >> There are concerns about denial of service around the per context sseu
> >> configuration capability. In a previous commit introducing the
> >> capability we allowed it only for capable users. This changes adds a
> >> new debugfs entry to let any user configure its own context
> >> powergating setup.
> >
> > As far as I understood it, Joonas' concerns here are:
> >
> > 1) That in the containers use case individual containers wouldn't be 
> > able to turn on the sysfs toggle for them.
> >
> > 2) That also in the containers use case if box admin turns on the 
> > feature, some containers would potentially start negatively affecting 
> > the others (via the accumulated cost of slice re-configuration on 
> > context switching).
> >
> > I am not familiar with typical container setups to be authoritative 
> > here, but intuitively I find it reasonable that a low-level hardware 
> > switch like this would be under the control of a master domain 
> > administrator. ("If you are installing our product in the container 
> > environment, make sure your system administrator enables this hardware 
> > feature.", "Note to system administrators: Enabling this features may 
> > negatively affect the performance of other containers.")
> >
> > Alternative proposal is for the i915 to apply an "or" filter on all 
> > requested masks and in that way ensure dynamic re-configuration 
> > doesn't happen on context switches, but driven from userspace via ioctls.
> >
> > In other words, should _all_ userspace agree between themselves that 
> > they want to turn off a slice, they would then need to send out a 
> > concerted ioctl storm, where number of needed ioctls equals the number 
> > of currently active contexts. (This may have its own performance 
> > consequences caused by the barriers needed to modify all context images.)
> >
> > This was deemed acceptable the the media use case, but my concern is 
> > the approach is not elegant and will tie us with the "or" policy in 
> > the ABI. (Performance concerns I haven't evaluated yet, but they also 
> > may be significant.)
> >
> > If we go back thinking about the containers use case, then it 
> > transpires that even though the "or" policy does prevent one container 
> > from affecting the other from one angle, it also prevents one 
> > container from exercising the feature unless all containers co-operate.
> >
> > As such, we can view the original problem statement where we have an 
> > issue if not everyone co-operates, as conceptually the same just from 
> > an opposite angle. (Rather than one container incurring the increased 
> > cost of context switches to the rest, we would have one container 
> > preventing the optimized slice configuration to the other.)
> >
> > From this follows that both proposals require complete co-operation 
> > from all running userspace to avoid complete control of the feature.
> >
> > Since the balance between the benefit of optimized slice configuration 
> > (or penalty of suboptimal one), versus the penalty of increased 
> > context switch times, cannot be know by the driver (barring venturing 
> > into the heuristics territory), that is another reason why I find the 
> > "or" policy in the driver questionable.
> >
> > We can also ask a question of - If we go with the "or" policy, why 
> > require N per-context ioctls to modify the global GPU configuration 
> > and not instead add a global driver ioctl to modify the state?
> >
> > If a future hardware requires, or enables, the per-context behaviour 
> > in a more efficient way, we could then revisit the problem space.
> >
> > In the mean time I see the "or" policy solution as adding some ABI 
> > which doesn't do anything for many use cases without any way for the 
> > sysadmin to enable it. At the same time master sysfs knob at least 
> > enables the sysadmin to make a decision. Here I am thinking about a 
> > random client environment where not all userspace co-operates, but for 
> > instance user is running the feature aware media stack, and 
> > non-feature aware OpenCL/3d stack.
> >
> > I guess the complete story boils down to - is the master sysfs knob 
> > really a problem in container use cases.
> >
> > Regards,
> >
> > Tvrtko 
> 
> Hey Tvrtko,
> 
> Thanks for summarizing a bunch of discussions.
> Essentially I agree with every you wrote above.
> 
> If we have a global setting (determined by the OR policy), what's the 
> point of per context settings?
> 
> In Dmitry's scenario, all userspace applications will work together to 
> reach the consensus so it sounds like we're reimplementing the policy 
> that is already existing in userspace.
> 
> Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else 
> than me pick one or the other :)

I'll just mention the voting/consensus approach to see if anyone else
likes it.

Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
(or some other abstract names).

Then whenever the host cares, they can evaluate the set of hints
provided and make a choice on sseu config. One presumes a simple greater
good method (but you could extends that to include batch
frequency/duration to try and determine system impact on one setting or
another). Keeping it a hint helps reduce the effect of policy, though it
may still be policy and merit a switch for different implementations (or
BPF!).
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-11 15:02       ` Chris Wilson
@ 2018-06-12  9:20         ` Joonas Lahtinen
  2018-06-12 10:33           ` Lionel Landwerlin
  0 siblings, 1 reply; 29+ messages in thread
From: Joonas Lahtinen @ 2018-06-12  9:20 UTC (permalink / raw)
  To: Chris Wilson, Lionel Landwerlin, Tvrtko Ursulin, intel-gfx

Quoting Chris Wilson (2018-06-11 18:02:37)
> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> > On 11/06/18 13:10, Tvrtko Ursulin wrote:
> > >
> > > On 30/05/2018 15:33, Lionel Landwerlin wrote:
> > >> There are concerns about denial of service around the per context sseu
> > >> configuration capability. In a previous commit introducing the
> > >> capability we allowed it only for capable users. This changes adds a
> > >> new debugfs entry to let any user configure its own context
> > >> powergating setup.
> > >
> > > As far as I understood it, Joonas' concerns here are:
> > >
> > > 1) That in the containers use case individual containers wouldn't be 
> > > able to turn on the sysfs toggle for them.
> > >
> > > 2) That also in the containers use case if box admin turns on the 
> > > feature, some containers would potentially start negatively affecting 
> > > the others (via the accumulated cost of slice re-configuration on 
> > > context switching).
> > >
> > > I am not familiar with typical container setups to be authoritative 
> > > here, but intuitively I find it reasonable that a low-level hardware 
> > > switch like this would be under the control of a master domain 
> > > administrator. ("If you are installing our product in the container 
> > > environment, make sure your system administrator enables this hardware 
> > > feature.", "Note to system administrators: Enabling this features may 
> > > negatively affect the performance of other containers.")
> > >
> > > Alternative proposal is for the i915 to apply an "or" filter on all 
> > > requested masks and in that way ensure dynamic re-configuration 
> > > doesn't happen on context switches, but driven from userspace via ioctls.
> > >
> > > In other words, should _all_ userspace agree between themselves that 
> > > they want to turn off a slice, they would then need to send out a 
> > > concerted ioctl storm, where number of needed ioctls equals the number 
> > > of currently active contexts. (This may have its own performance 
> > > consequences caused by the barriers needed to modify all context images.)
> > >
> > > This was deemed acceptable the the media use case, but my concern is 
> > > the approach is not elegant and will tie us with the "or" policy in 
> > > the ABI. (Performance concerns I haven't evaluated yet, but they also 
> > > may be significant.)
> > >
> > > If we go back thinking about the containers use case, then it 
> > > transpires that even though the "or" policy does prevent one container 
> > > from affecting the other from one angle, it also prevents one 
> > > container from exercising the feature unless all containers co-operate.
> > >
> > > As such, we can view the original problem statement where we have an 
> > > issue if not everyone co-operates, as conceptually the same just from 
> > > an opposite angle. (Rather than one container incurring the increased 
> > > cost of context switches to the rest, we would have one container 
> > > preventing the optimized slice configuration to the other.)
> > >
> > > From this follows that both proposals require complete co-operation 
> > > from all running userspace to avoid complete control of the feature.
> > >
> > > Since the balance between the benefit of optimized slice configuration 
> > > (or penalty of suboptimal one), versus the penalty of increased 
> > > context switch times, cannot be know by the driver (barring venturing 
> > > into the heuristics territory), that is another reason why I find the 
> > > "or" policy in the driver questionable.
> > >
> > > We can also ask a question of - If we go with the "or" policy, why 
> > > require N per-context ioctls to modify the global GPU configuration 
> > > and not instead add a global driver ioctl to modify the state?
> > >
> > > If a future hardware requires, or enables, the per-context behaviour 
> > > in a more efficient way, we could then revisit the problem space.
> > >
> > > In the mean time I see the "or" policy solution as adding some ABI 
> > > which doesn't do anything for many use cases without any way for the 
> > > sysadmin to enable it. At the same time master sysfs knob at least 
> > > enables the sysadmin to make a decision. Here I am thinking about a 
> > > random client environment where not all userspace co-operates, but for 
> > > instance user is running the feature aware media stack, and 
> > > non-feature aware OpenCL/3d stack.
> > >
> > > I guess the complete story boils down to - is the master sysfs knob 
> > > really a problem in container use cases.
> > >
> > > Regards,
> > >
> > > Tvrtko 
> > 
> > Hey Tvrtko,
> > 
> > Thanks for summarizing a bunch of discussions.
> > Essentially I agree with every you wrote above.
> > 
> > If we have a global setting (determined by the OR policy), what's the 
> > point of per context settings?
> > 
> > In Dmitry's scenario, all userspace applications will work together to 
> > reach the consensus so it sounds like we're reimplementing the policy 
> > that is already existing in userspace.
> > 
> > Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else 
> > than me pick one or the other :)
> 
> I'll just mention the voting/consensus approach to see if anyone else
> likes it.
> 
> Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
> (or some other abstract names).

Yeah, the param name should have the word _HINT_ in it when it's not a
definitive set.

There's no global setter across containers, only a scenario when
everyone agrees or not. Tallying up the votes and going with a majority
vote might be an option, too.

Regards, Joonas

> 
> Then whenever the host cares, they can evaluate the set of hints
> provided and make a choice on sseu config. One presumes a simple greater
> good method (but you could extends that to include batch
> frequency/duration to try and determine system impact on one setting or
> another). Keeping it a hint helps reduce the effect of policy, though it
> may still be policy and merit a switch for different implementations (or
> BPF!).
> -Chris
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-12  9:20         ` Joonas Lahtinen
@ 2018-06-12 10:33           ` Lionel Landwerlin
  2018-06-12 10:37             ` Chris Wilson
  0 siblings, 1 reply; 29+ messages in thread
From: Lionel Landwerlin @ 2018-06-12 10:33 UTC (permalink / raw)
  To: Joonas Lahtinen, Chris Wilson, Tvrtko Ursulin, intel-gfx

On 12/06/18 10:20, Joonas Lahtinen wrote:
> Quoting Chris Wilson (2018-06-11 18:02:37)
>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
>>>>> There are concerns about denial of service around the per context sseu
>>>>> configuration capability. In a previous commit introducing the
>>>>> capability we allowed it only for capable users. This changes adds a
>>>>> new debugfs entry to let any user configure its own context
>>>>> powergating setup.
>>>> As far as I understood it, Joonas' concerns here are:
>>>>
>>>> 1) That in the containers use case individual containers wouldn't be
>>>> able to turn on the sysfs toggle for them.
>>>>
>>>> 2) That also in the containers use case if box admin turns on the
>>>> feature, some containers would potentially start negatively affecting
>>>> the others (via the accumulated cost of slice re-configuration on
>>>> context switching).
>>>>
>>>> I am not familiar with typical container setups to be authoritative
>>>> here, but intuitively I find it reasonable that a low-level hardware
>>>> switch like this would be under the control of a master domain
>>>> administrator. ("If you are installing our product in the container
>>>> environment, make sure your system administrator enables this hardware
>>>> feature.", "Note to system administrators: Enabling this features may
>>>> negatively affect the performance of other containers.")
>>>>
>>>> Alternative proposal is for the i915 to apply an "or" filter on all
>>>> requested masks and in that way ensure dynamic re-configuration
>>>> doesn't happen on context switches, but driven from userspace via ioctls.
>>>>
>>>> In other words, should _all_ userspace agree between themselves that
>>>> they want to turn off a slice, they would then need to send out a
>>>> concerted ioctl storm, where number of needed ioctls equals the number
>>>> of currently active contexts. (This may have its own performance
>>>> consequences caused by the barriers needed to modify all context images.)
>>>>
>>>> This was deemed acceptable the the media use case, but my concern is
>>>> the approach is not elegant and will tie us with the "or" policy in
>>>> the ABI. (Performance concerns I haven't evaluated yet, but they also
>>>> may be significant.)
>>>>
>>>> If we go back thinking about the containers use case, then it
>>>> transpires that even though the "or" policy does prevent one container
>>>> from affecting the other from one angle, it also prevents one
>>>> container from exercising the feature unless all containers co-operate.
>>>>
>>>> As such, we can view the original problem statement where we have an
>>>> issue if not everyone co-operates, as conceptually the same just from
>>>> an opposite angle. (Rather than one container incurring the increased
>>>> cost of context switches to the rest, we would have one container
>>>> preventing the optimized slice configuration to the other.)
>>>>
>>>>  From this follows that both proposals require complete co-operation
>>>> from all running userspace to avoid complete control of the feature.
>>>>
>>>> Since the balance between the benefit of optimized slice configuration
>>>> (or penalty of suboptimal one), versus the penalty of increased
>>>> context switch times, cannot be know by the driver (barring venturing
>>>> into the heuristics territory), that is another reason why I find the
>>>> "or" policy in the driver questionable.
>>>>
>>>> We can also ask a question of - If we go with the "or" policy, why
>>>> require N per-context ioctls to modify the global GPU configuration
>>>> and not instead add a global driver ioctl to modify the state?
>>>>
>>>> If a future hardware requires, or enables, the per-context behaviour
>>>> in a more efficient way, we could then revisit the problem space.
>>>>
>>>> In the mean time I see the "or" policy solution as adding some ABI
>>>> which doesn't do anything for many use cases without any way for the
>>>> sysadmin to enable it. At the same time master sysfs knob at least
>>>> enables the sysadmin to make a decision. Here I am thinking about a
>>>> random client environment where not all userspace co-operates, but for
>>>> instance user is running the feature aware media stack, and
>>>> non-feature aware OpenCL/3d stack.
>>>>
>>>> I guess the complete story boils down to - is the master sysfs knob
>>>> really a problem in container use cases.
>>>>
>>>> Regards,
>>>>
>>>> Tvrtko
>>> Hey Tvrtko,
>>>
>>> Thanks for summarizing a bunch of discussions.
>>> Essentially I agree with every you wrote above.
>>>
>>> If we have a global setting (determined by the OR policy), what's the
>>> point of per context settings?
>>>
>>> In Dmitry's scenario, all userspace applications will work together to
>>> reach the consensus so it sounds like we're reimplementing the policy
>>> that is already existing in userspace.
>>>
>>> Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else
>>> than me pick one or the other :)
>> I'll just mention the voting/consensus approach to see if anyone else
>> likes it.
>>
>> Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
>> (or some other abstract names).
> Yeah, the param name should have the word _HINT_ in it when it's not a
> definitive set.
>
> There's no global setter across containers, only a scenario when
> everyone agrees or not. Tallying up the votes and going with a majority
> vote might be an option, too.
>
> Regards, Joonas

Trying to test the "everyone agrees" approach here.
There are a number of processes that can hold onto a gem context and 
therefore prevent agreement.
On my system plymouthd & systemd-login have a number of contexts opened.

A process simply opening&closing a render node could flip the system 
back and forth between 2 configurations.

Does that change people's mind about how we should go about this?

-
Lionel

>
>> Then whenever the host cares, they can evaluate the set of hints
>> provided and make a choice on sseu config. One presumes a simple greater
>> good method (but you could extends that to include batch
>> frequency/duration to try and determine system impact on one setting or
>> another). Keeping it a hint helps reduce the effect of policy, though it
>> may still be policy and merit a switch for different implementations (or
>> BPF!).
>> -Chris
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-12 10:33           ` Lionel Landwerlin
@ 2018-06-12 10:37             ` Chris Wilson
  2018-06-12 10:52               ` Lionel Landwerlin
  0 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2018-06-12 10:37 UTC (permalink / raw)
  To: Lionel Landwerlin, Joonas Lahtinen, Tvrtko Ursulin, intel-gfx

Quoting Lionel Landwerlin (2018-06-12 11:33:34)
> On 12/06/18 10:20, Joonas Lahtinen wrote:
> > Quoting Chris Wilson (2018-06-11 18:02:37)
> >> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> >>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
> >>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
> >>>>> There are concerns about denial of service around the per context sseu
> >>>>> configuration capability. In a previous commit introducing the
> >>>>> capability we allowed it only for capable users. This changes adds a
> >>>>> new debugfs entry to let any user configure its own context
> >>>>> powergating setup.
> >>>> As far as I understood it, Joonas' concerns here are:
> >>>>
> >>>> 1) That in the containers use case individual containers wouldn't be
> >>>> able to turn on the sysfs toggle for them.
> >>>>
> >>>> 2) That also in the containers use case if box admin turns on the
> >>>> feature, some containers would potentially start negatively affecting
> >>>> the others (via the accumulated cost of slice re-configuration on
> >>>> context switching).
> >>>>
> >>>> I am not familiar with typical container setups to be authoritative
> >>>> here, but intuitively I find it reasonable that a low-level hardware
> >>>> switch like this would be under the control of a master domain
> >>>> administrator. ("If you are installing our product in the container
> >>>> environment, make sure your system administrator enables this hardware
> >>>> feature.", "Note to system administrators: Enabling this features may
> >>>> negatively affect the performance of other containers.")
> >>>>
> >>>> Alternative proposal is for the i915 to apply an "or" filter on all
> >>>> requested masks and in that way ensure dynamic re-configuration
> >>>> doesn't happen on context switches, but driven from userspace via ioctls.
> >>>>
> >>>> In other words, should _all_ userspace agree between themselves that
> >>>> they want to turn off a slice, they would then need to send out a
> >>>> concerted ioctl storm, where number of needed ioctls equals the number
> >>>> of currently active contexts. (This may have its own performance
> >>>> consequences caused by the barriers needed to modify all context images.)
> >>>>
> >>>> This was deemed acceptable the the media use case, but my concern is
> >>>> the approach is not elegant and will tie us with the "or" policy in
> >>>> the ABI. (Performance concerns I haven't evaluated yet, but they also
> >>>> may be significant.)
> >>>>
> >>>> If we go back thinking about the containers use case, then it
> >>>> transpires that even though the "or" policy does prevent one container
> >>>> from affecting the other from one angle, it also prevents one
> >>>> container from exercising the feature unless all containers co-operate.
> >>>>
> >>>> As such, we can view the original problem statement where we have an
> >>>> issue if not everyone co-operates, as conceptually the same just from
> >>>> an opposite angle. (Rather than one container incurring the increased
> >>>> cost of context switches to the rest, we would have one container
> >>>> preventing the optimized slice configuration to the other.)
> >>>>
> >>>>  From this follows that both proposals require complete co-operation
> >>>> from all running userspace to avoid complete control of the feature.
> >>>>
> >>>> Since the balance between the benefit of optimized slice configuration
> >>>> (or penalty of suboptimal one), versus the penalty of increased
> >>>> context switch times, cannot be know by the driver (barring venturing
> >>>> into the heuristics territory), that is another reason why I find the
> >>>> "or" policy in the driver questionable.
> >>>>
> >>>> We can also ask a question of - If we go with the "or" policy, why
> >>>> require N per-context ioctls to modify the global GPU configuration
> >>>> and not instead add a global driver ioctl to modify the state?
> >>>>
> >>>> If a future hardware requires, or enables, the per-context behaviour
> >>>> in a more efficient way, we could then revisit the problem space.
> >>>>
> >>>> In the mean time I see the "or" policy solution as adding some ABI
> >>>> which doesn't do anything for many use cases without any way for the
> >>>> sysadmin to enable it. At the same time master sysfs knob at least
> >>>> enables the sysadmin to make a decision. Here I am thinking about a
> >>>> random client environment where not all userspace co-operates, but for
> >>>> instance user is running the feature aware media stack, and
> >>>> non-feature aware OpenCL/3d stack.
> >>>>
> >>>> I guess the complete story boils down to - is the master sysfs knob
> >>>> really a problem in container use cases.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Tvrtko
> >>> Hey Tvrtko,
> >>>
> >>> Thanks for summarizing a bunch of discussions.
> >>> Essentially I agree with every you wrote above.
> >>>
> >>> If we have a global setting (determined by the OR policy), what's the
> >>> point of per context settings?
> >>>
> >>> In Dmitry's scenario, all userspace applications will work together to
> >>> reach the consensus so it sounds like we're reimplementing the policy
> >>> that is already existing in userspace.
> >>>
> >>> Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else
> >>> than me pick one or the other :)
> >> I'll just mention the voting/consensus approach to see if anyone else
> >> likes it.
> >>
> >> Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
> >> (or some other abstract names).
> > Yeah, the param name should have the word _HINT_ in it when it's not a
> > definitive set.
> >
> > There's no global setter across containers, only a scenario when
> > everyone agrees or not. Tallying up the votes and going with a majority
> > vote might be an option, too.
> >
> > Regards, Joonas
> 
> Trying to test the "everyone agrees" approach here.

It's not everyone agrees, but the greater good. 

> There are a number of processes that can hold onto a gem context and 
> therefore prevent agreement.
> On my system plymouthd & systemd-login have a number of contexts opened.
But they should be dontcare?

There should only be a few processes that insist on a particular
configuration, afui.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-12 10:37             ` Chris Wilson
@ 2018-06-12 10:52               ` Lionel Landwerlin
  2018-06-12 12:02                 ` Tvrtko Ursulin
  2018-06-12 12:02                 ` Chris Wilson
  0 siblings, 2 replies; 29+ messages in thread
From: Lionel Landwerlin @ 2018-06-12 10:52 UTC (permalink / raw)
  To: Chris Wilson, Joonas Lahtinen, Tvrtko Ursulin, intel-gfx

On 12/06/18 11:37, Chris Wilson wrote:
> Quoting Lionel Landwerlin (2018-06-12 11:33:34)
>> On 12/06/18 10:20, Joonas Lahtinen wrote:
>>> Quoting Chris Wilson (2018-06-11 18:02:37)
>>>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
>>>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
>>>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
>>>>>>> There are concerns about denial of service around the per context sseu
>>>>>>> configuration capability. In a previous commit introducing the
>>>>>>> capability we allowed it only for capable users. This changes adds a
>>>>>>> new debugfs entry to let any user configure its own context
>>>>>>> powergating setup.
>>>>>> As far as I understood it, Joonas' concerns here are:
>>>>>>
>>>>>> 1) That in the containers use case individual containers wouldn't be
>>>>>> able to turn on the sysfs toggle for them.
>>>>>>
>>>>>> 2) That also in the containers use case if box admin turns on the
>>>>>> feature, some containers would potentially start negatively affecting
>>>>>> the others (via the accumulated cost of slice re-configuration on
>>>>>> context switching).
>>>>>>
>>>>>> I am not familiar with typical container setups to be authoritative
>>>>>> here, but intuitively I find it reasonable that a low-level hardware
>>>>>> switch like this would be under the control of a master domain
>>>>>> administrator. ("If you are installing our product in the container
>>>>>> environment, make sure your system administrator enables this hardware
>>>>>> feature.", "Note to system administrators: Enabling this features may
>>>>>> negatively affect the performance of other containers.")
>>>>>>
>>>>>> Alternative proposal is for the i915 to apply an "or" filter on all
>>>>>> requested masks and in that way ensure dynamic re-configuration
>>>>>> doesn't happen on context switches, but driven from userspace via ioctls.
>>>>>>
>>>>>> In other words, should _all_ userspace agree between themselves that
>>>>>> they want to turn off a slice, they would then need to send out a
>>>>>> concerted ioctl storm, where number of needed ioctls equals the number
>>>>>> of currently active contexts. (This may have its own performance
>>>>>> consequences caused by the barriers needed to modify all context images.)
>>>>>>
>>>>>> This was deemed acceptable the the media use case, but my concern is
>>>>>> the approach is not elegant and will tie us with the "or" policy in
>>>>>> the ABI. (Performance concerns I haven't evaluated yet, but they also
>>>>>> may be significant.)
>>>>>>
>>>>>> If we go back thinking about the containers use case, then it
>>>>>> transpires that even though the "or" policy does prevent one container
>>>>>> from affecting the other from one angle, it also prevents one
>>>>>> container from exercising the feature unless all containers co-operate.
>>>>>>
>>>>>> As such, we can view the original problem statement where we have an
>>>>>> issue if not everyone co-operates, as conceptually the same just from
>>>>>> an opposite angle. (Rather than one container incurring the increased
>>>>>> cost of context switches to the rest, we would have one container
>>>>>> preventing the optimized slice configuration to the other.)
>>>>>>
>>>>>>   From this follows that both proposals require complete co-operation
>>>>>> from all running userspace to avoid complete control of the feature.
>>>>>>
>>>>>> Since the balance between the benefit of optimized slice configuration
>>>>>> (or penalty of suboptimal one), versus the penalty of increased
>>>>>> context switch times, cannot be know by the driver (barring venturing
>>>>>> into the heuristics territory), that is another reason why I find the
>>>>>> "or" policy in the driver questionable.
>>>>>>
>>>>>> We can also ask a question of - If we go with the "or" policy, why
>>>>>> require N per-context ioctls to modify the global GPU configuration
>>>>>> and not instead add a global driver ioctl to modify the state?
>>>>>>
>>>>>> If a future hardware requires, or enables, the per-context behaviour
>>>>>> in a more efficient way, we could then revisit the problem space.
>>>>>>
>>>>>> In the mean time I see the "or" policy solution as adding some ABI
>>>>>> which doesn't do anything for many use cases without any way for the
>>>>>> sysadmin to enable it. At the same time master sysfs knob at least
>>>>>> enables the sysadmin to make a decision. Here I am thinking about a
>>>>>> random client environment where not all userspace co-operates, but for
>>>>>> instance user is running the feature aware media stack, and
>>>>>> non-feature aware OpenCL/3d stack.
>>>>>>
>>>>>> I guess the complete story boils down to - is the master sysfs knob
>>>>>> really a problem in container use cases.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Tvrtko
>>>>> Hey Tvrtko,
>>>>>
>>>>> Thanks for summarizing a bunch of discussions.
>>>>> Essentially I agree with every you wrote above.
>>>>>
>>>>> If we have a global setting (determined by the OR policy), what's the
>>>>> point of per context settings?
>>>>>
>>>>> In Dmitry's scenario, all userspace applications will work together to
>>>>> reach the consensus so it sounds like we're reimplementing the policy
>>>>> that is already existing in userspace.
>>>>>
>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else
>>>>> than me pick one or the other :)
>>>> I'll just mention the voting/consensus approach to see if anyone else
>>>> likes it.
>>>>
>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
>>>> (or some other abstract names).
>>> Yeah, the param name should have the word _HINT_ in it when it's not a
>>> definitive set.
>>>
>>> There's no global setter across containers, only a scenario when
>>> everyone agrees or not. Tallying up the votes and going with a majority
>>> vote might be an option, too.
>>>
>>> Regards, Joonas
>> Trying to test the "everyone agrees" approach here.
> It's not everyone agrees, but the greater good.

I'm looking forward to the definition of the greater good :)
Tvrtko wanted to avoid the heuristic territory, it seems like we're just 
stepping into it.

-
Lionel

>> There are a number of processes that can hold onto a gem context and
>> therefore prevent agreement.
>> On my system plymouthd & systemd-login have a number of contexts opened.
> But they should be dontcare?
>
> There should only be a few processes that insist on a particular
> configuration, afui.
> -Chris
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-12 10:52               ` Lionel Landwerlin
@ 2018-06-12 12:02                 ` Tvrtko Ursulin
  2018-06-13 12:49                   ` Joonas Lahtinen
  2018-06-12 12:02                 ` Chris Wilson
  1 sibling, 1 reply; 29+ messages in thread
From: Tvrtko Ursulin @ 2018-06-12 12:02 UTC (permalink / raw)
  To: Lionel Landwerlin, Chris Wilson, Joonas Lahtinen, intel-gfx


On 12/06/2018 11:52, Lionel Landwerlin wrote:
> On 12/06/18 11:37, Chris Wilson wrote:
>> Quoting Lionel Landwerlin (2018-06-12 11:33:34)
>>> On 12/06/18 10:20, Joonas Lahtinen wrote:
>>>> Quoting Chris Wilson (2018-06-11 18:02:37)
>>>>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
>>>>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
>>>>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
>>>>>>>> There are concerns about denial of service around the per 
>>>>>>>> context sseu
>>>>>>>> configuration capability. In a previous commit introducing the
>>>>>>>> capability we allowed it only for capable users. This changes 
>>>>>>>> adds a
>>>>>>>> new debugfs entry to let any user configure its own context
>>>>>>>> powergating setup.
>>>>>>> As far as I understood it, Joonas' concerns here are:
>>>>>>>
>>>>>>> 1) That in the containers use case individual containers wouldn't be
>>>>>>> able to turn on the sysfs toggle for them.
>>>>>>>
>>>>>>> 2) That also in the containers use case if box admin turns on the
>>>>>>> feature, some containers would potentially start negatively 
>>>>>>> affecting
>>>>>>> the others (via the accumulated cost of slice re-configuration on
>>>>>>> context switching).
>>>>>>>
>>>>>>> I am not familiar with typical container setups to be authoritative
>>>>>>> here, but intuitively I find it reasonable that a low-level hardware
>>>>>>> switch like this would be under the control of a master domain
>>>>>>> administrator. ("If you are installing our product in the container
>>>>>>> environment, make sure your system administrator enables this 
>>>>>>> hardware
>>>>>>> feature.", "Note to system administrators: Enabling this features 
>>>>>>> may
>>>>>>> negatively affect the performance of other containers.")
>>>>>>>
>>>>>>> Alternative proposal is for the i915 to apply an "or" filter on all
>>>>>>> requested masks and in that way ensure dynamic re-configuration
>>>>>>> doesn't happen on context switches, but driven from userspace via 
>>>>>>> ioctls.
>>>>>>>
>>>>>>> In other words, should _all_ userspace agree between themselves that
>>>>>>> they want to turn off a slice, they would then need to send out a
>>>>>>> concerted ioctl storm, where number of needed ioctls equals the 
>>>>>>> number
>>>>>>> of currently active contexts. (This may have its own performance
>>>>>>> consequences caused by the barriers needed to modify all context 
>>>>>>> images.)
>>>>>>>
>>>>>>> This was deemed acceptable the the media use case, but my concern is
>>>>>>> the approach is not elegant and will tie us with the "or" policy in
>>>>>>> the ABI. (Performance concerns I haven't evaluated yet, but they 
>>>>>>> also
>>>>>>> may be significant.)
>>>>>>>
>>>>>>> If we go back thinking about the containers use case, then it
>>>>>>> transpires that even though the "or" policy does prevent one 
>>>>>>> container
>>>>>>> from affecting the other from one angle, it also prevents one
>>>>>>> container from exercising the feature unless all containers 
>>>>>>> co-operate.
>>>>>>>
>>>>>>> As such, we can view the original problem statement where we have an
>>>>>>> issue if not everyone co-operates, as conceptually the same just 
>>>>>>> from
>>>>>>> an opposite angle. (Rather than one container incurring the 
>>>>>>> increased
>>>>>>> cost of context switches to the rest, we would have one container
>>>>>>> preventing the optimized slice configuration to the other.)
>>>>>>>
>>>>>>>   From this follows that both proposals require complete 
>>>>>>> co-operation
>>>>>>> from all running userspace to avoid complete control of the feature.
>>>>>>>
>>>>>>> Since the balance between the benefit of optimized slice 
>>>>>>> configuration
>>>>>>> (or penalty of suboptimal one), versus the penalty of increased
>>>>>>> context switch times, cannot be know by the driver (barring 
>>>>>>> venturing
>>>>>>> into the heuristics territory), that is another reason why I find 
>>>>>>> the
>>>>>>> "or" policy in the driver questionable.
>>>>>>>
>>>>>>> We can also ask a question of - If we go with the "or" policy, why
>>>>>>> require N per-context ioctls to modify the global GPU configuration
>>>>>>> and not instead add a global driver ioctl to modify the state?
>>>>>>>
>>>>>>> If a future hardware requires, or enables, the per-context behaviour
>>>>>>> in a more efficient way, we could then revisit the problem space.
>>>>>>>
>>>>>>> In the mean time I see the "or" policy solution as adding some ABI
>>>>>>> which doesn't do anything for many use cases without any way for the
>>>>>>> sysadmin to enable it. At the same time master sysfs knob at least
>>>>>>> enables the sysadmin to make a decision. Here I am thinking about a
>>>>>>> random client environment where not all userspace co-operates, 
>>>>>>> but for
>>>>>>> instance user is running the feature aware media stack, and
>>>>>>> non-feature aware OpenCL/3d stack.
>>>>>>>
>>>>>>> I guess the complete story boils down to - is the master sysfs knob
>>>>>>> really a problem in container use cases.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Tvrtko
>>>>>> Hey Tvrtko,
>>>>>>
>>>>>> Thanks for summarizing a bunch of discussions.
>>>>>> Essentially I agree with every you wrote above.
>>>>>>
>>>>>> If we have a global setting (determined by the OR policy), what's the
>>>>>> point of per context settings?
>>>>>>
>>>>>> In Dmitry's scenario, all userspace applications will work 
>>>>>> together to
>>>>>> reach the consensus so it sounds like we're reimplementing the policy
>>>>>> that is already existing in userspace.
>>>>>>
>>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else
>>>>>> than me pick one or the other :)
>>>>> I'll just mention the voting/consensus approach to see if anyone else
>>>>> likes it.
>>>>>
>>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
>>>>> (or some other abstract names).
>>>> Yeah, the param name should have the word _HINT_ in it when it's not a
>>>> definitive set.
>>>>
>>>> There's no global setter across containers, only a scenario when
>>>> everyone agrees or not. Tallying up the votes and going with a majority
>>>> vote might be an option, too.
>>>>
>>>> Regards, Joonas
>>> Trying to test the "everyone agrees" approach here.
>> It's not everyone agrees, but the greater good.
> 
> I'm looking forward to the definition of the greater good :)
> Tvrtko wanted to avoid the heuristic territory, it seems like we're just 
> stepping into it.

I am not sure that "small, dontcare, large" models brings much. No one 
would probably set "dontcare" since we have to default new contexts to 
large to be compatible.

Don't know, I still prefer the master knob option. Honestly don't yet 
see the containers use case as a problem. There is always a master 
domain in any system where the knob can be enabled if the customers on 
the system are such to warrant it. On mixed systems enable it or not 
someone always suffers. And with the knob we are free to add heuristics 
later, keep the uapi, and just default the knob to on.

I think the focus should be reaching a consensus whether the containers 
use case is a problem with the master knob or not.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-12 10:52               ` Lionel Landwerlin
  2018-06-12 12:02                 ` Tvrtko Ursulin
@ 2018-06-12 12:02                 ` Chris Wilson
  1 sibling, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2018-06-12 12:02 UTC (permalink / raw)
  To: Lionel Landwerlin, Joonas Lahtinen, Tvrtko Ursulin, intel-gfx

Quoting Lionel Landwerlin (2018-06-12 11:52:10)
> I'm looking forward to the definition of the greater good :)
> Tvrtko wanted to avoid the heuristic territory, it seems like we're just 
> stepping into it.

If we have to make any choice, we have not just stepped, but dived
head first into it. Tbh, I quite like the idea of having BPF policy
hooks. (Devil will definitely be in the details!)
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-12 12:02                 ` Tvrtko Ursulin
@ 2018-06-13 12:49                   ` Joonas Lahtinen
  2018-06-14  8:28                     ` Tvrtko Ursulin
  0 siblings, 1 reply; 29+ messages in thread
From: Joonas Lahtinen @ 2018-06-13 12:49 UTC (permalink / raw)
  To: Chris Wilson, Lionel Landwerlin, Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-06-12 15:02:07)
> 
> On 12/06/2018 11:52, Lionel Landwerlin wrote:
> > On 12/06/18 11:37, Chris Wilson wrote:
> >> Quoting Lionel Landwerlin (2018-06-12 11:33:34)
> >>> On 12/06/18 10:20, Joonas Lahtinen wrote:
> >>>> Quoting Chris Wilson (2018-06-11 18:02:37)
> >>>>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> >>>>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
> >>>>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
> >>>>>>>> There are concerns about denial of service around the per 
> >>>>>>>> context sseu
> >>>>>>>> configuration capability. In a previous commit introducing the
> >>>>>>>> capability we allowed it only for capable users. This changes 
> >>>>>>>> adds a
> >>>>>>>> new debugfs entry to let any user configure its own context
> >>>>>>>> powergating setup.
> >>>>>>> As far as I understood it, Joonas' concerns here are:
> >>>>>>>
> >>>>>>> 1) That in the containers use case individual containers wouldn't be
> >>>>>>> able to turn on the sysfs toggle for them.
> >>>>>>>
> >>>>>>> 2) That also in the containers use case if box admin turns on the
> >>>>>>> feature, some containers would potentially start negatively 
> >>>>>>> affecting
> >>>>>>> the others (via the accumulated cost of slice re-configuration on
> >>>>>>> context switching).
> >>>>>>>
> >>>>>>> I am not familiar with typical container setups to be authoritative
> >>>>>>> here, but intuitively I find it reasonable that a low-level hardware
> >>>>>>> switch like this would be under the control of a master domain
> >>>>>>> administrator. ("If you are installing our product in the container
> >>>>>>> environment, make sure your system administrator enables this 
> >>>>>>> hardware
> >>>>>>> feature.", "Note to system administrators: Enabling this features 
> >>>>>>> may
> >>>>>>> negatively affect the performance of other containers.")
> >>>>>>>
> >>>>>>> Alternative proposal is for the i915 to apply an "or" filter on all
> >>>>>>> requested masks and in that way ensure dynamic re-configuration
> >>>>>>> doesn't happen on context switches, but driven from userspace via 
> >>>>>>> ioctls.
> >>>>>>>
> >>>>>>> In other words, should _all_ userspace agree between themselves that
> >>>>>>> they want to turn off a slice, they would then need to send out a
> >>>>>>> concerted ioctl storm, where number of needed ioctls equals the 
> >>>>>>> number
> >>>>>>> of currently active contexts. (This may have its own performance
> >>>>>>> consequences caused by the barriers needed to modify all context 
> >>>>>>> images.)
> >>>>>>>
> >>>>>>> This was deemed acceptable the the media use case, but my concern is
> >>>>>>> the approach is not elegant and will tie us with the "or" policy in
> >>>>>>> the ABI. (Performance concerns I haven't evaluated yet, but they 
> >>>>>>> also
> >>>>>>> may be significant.)
> >>>>>>>
> >>>>>>> If we go back thinking about the containers use case, then it
> >>>>>>> transpires that even though the "or" policy does prevent one 
> >>>>>>> container
> >>>>>>> from affecting the other from one angle, it also prevents one
> >>>>>>> container from exercising the feature unless all containers 
> >>>>>>> co-operate.
> >>>>>>>
> >>>>>>> As such, we can view the original problem statement where we have an
> >>>>>>> issue if not everyone co-operates, as conceptually the same just 
> >>>>>>> from
> >>>>>>> an opposite angle. (Rather than one container incurring the 
> >>>>>>> increased
> >>>>>>> cost of context switches to the rest, we would have one container
> >>>>>>> preventing the optimized slice configuration to the other.)
> >>>>>>>
> >>>>>>>   From this follows that both proposals require complete 
> >>>>>>> co-operation
> >>>>>>> from all running userspace to avoid complete control of the feature.
> >>>>>>>
> >>>>>>> Since the balance between the benefit of optimized slice 
> >>>>>>> configuration
> >>>>>>> (or penalty of suboptimal one), versus the penalty of increased
> >>>>>>> context switch times, cannot be know by the driver (barring 
> >>>>>>> venturing
> >>>>>>> into the heuristics territory), that is another reason why I find 
> >>>>>>> the
> >>>>>>> "or" policy in the driver questionable.
> >>>>>>>
> >>>>>>> We can also ask a question of - If we go with the "or" policy, why
> >>>>>>> require N per-context ioctls to modify the global GPU configuration
> >>>>>>> and not instead add a global driver ioctl to modify the state?
> >>>>>>>
> >>>>>>> If a future hardware requires, or enables, the per-context behaviour
> >>>>>>> in a more efficient way, we could then revisit the problem space.
> >>>>>>>
> >>>>>>> In the mean time I see the "or" policy solution as adding some ABI
> >>>>>>> which doesn't do anything for many use cases without any way for the
> >>>>>>> sysadmin to enable it. At the same time master sysfs knob at least
> >>>>>>> enables the sysadmin to make a decision. Here I am thinking about a
> >>>>>>> random client environment where not all userspace co-operates, 
> >>>>>>> but for
> >>>>>>> instance user is running the feature aware media stack, and
> >>>>>>> non-feature aware OpenCL/3d stack.
> >>>>>>>
> >>>>>>> I guess the complete story boils down to - is the master sysfs knob
> >>>>>>> really a problem in container use cases.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> Tvrtko
> >>>>>> Hey Tvrtko,
> >>>>>>
> >>>>>> Thanks for summarizing a bunch of discussions.
> >>>>>> Essentially I agree with every you wrote above.
> >>>>>>
> >>>>>> If we have a global setting (determined by the OR policy), what's the
> >>>>>> point of per context settings?
> >>>>>>
> >>>>>> In Dmitry's scenario, all userspace applications will work 
> >>>>>> together to
> >>>>>> reach the consensus so it sounds like we're reimplementing the policy
> >>>>>> that is already existing in userspace.
> >>>>>>
> >>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else
> >>>>>> than me pick one or the other :)
> >>>>> I'll just mention the voting/consensus approach to see if anyone else
> >>>>> likes it.
> >>>>>
> >>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
> >>>>> (or some other abstract names).
> >>>> Yeah, the param name should have the word _HINT_ in it when it's not a
> >>>> definitive set.
> >>>>
> >>>> There's no global setter across containers, only a scenario when
> >>>> everyone agrees or not. Tallying up the votes and going with a majority
> >>>> vote might be an option, too.
> >>>>
> >>>> Regards, Joonas
> >>> Trying to test the "everyone agrees" approach here.
> >> It's not everyone agrees, but the greater good.
> > 
> > I'm looking forward to the definition of the greater good :)
> > Tvrtko wanted to avoid the heuristic territory, it seems like we're just 
> > stepping into it.
> 
> I am not sure that "small, dontcare, large" models brings much. No one 
> would probably set "dontcare" since we have to default new contexts to 
> large to be compatible.
> 
> Don't know, I still prefer the master knob option. Honestly don't yet 
> see the containers use case as a problem. There is always a master 
> domain in any system where the knob can be enabled if the customers on 
> the system are such to warrant it. On mixed systems enable it or not 
> someone always suffers. And with the knob we are free to add heuristics 
> later, keep the uapi, and just default the knob to on.

Master knob effectively means dead code behind a switch, that's not very
upstream friendly.

Regards, Joonas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-13 12:49                   ` Joonas Lahtinen
@ 2018-06-14  8:28                     ` Tvrtko Ursulin
  2018-07-18 16:44                       ` Bloomfield, Jon
  0 siblings, 1 reply; 29+ messages in thread
From: Tvrtko Ursulin @ 2018-06-14  8:28 UTC (permalink / raw)
  To: Joonas Lahtinen, Chris Wilson, Lionel Landwerlin, intel-gfx


On 13/06/2018 13:49, Joonas Lahtinen wrote:
> Quoting Tvrtko Ursulin (2018-06-12 15:02:07)
>>
>> On 12/06/2018 11:52, Lionel Landwerlin wrote:
>>> On 12/06/18 11:37, Chris Wilson wrote:
>>>> Quoting Lionel Landwerlin (2018-06-12 11:33:34)
>>>>> On 12/06/18 10:20, Joonas Lahtinen wrote:
>>>>>> Quoting Chris Wilson (2018-06-11 18:02:37)
>>>>>>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
>>>>>>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
>>>>>>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
>>>>>>>>>> There are concerns about denial of service around the per
>>>>>>>>>> context sseu
>>>>>>>>>> configuration capability. In a previous commit introducing the
>>>>>>>>>> capability we allowed it only for capable users. This changes
>>>>>>>>>> adds a
>>>>>>>>>> new debugfs entry to let any user configure its own context
>>>>>>>>>> powergating setup.
>>>>>>>>> As far as I understood it, Joonas' concerns here are:
>>>>>>>>>
>>>>>>>>> 1) That in the containers use case individual containers wouldn't be
>>>>>>>>> able to turn on the sysfs toggle for them.
>>>>>>>>>
>>>>>>>>> 2) That also in the containers use case if box admin turns on the
>>>>>>>>> feature, some containers would potentially start negatively
>>>>>>>>> affecting
>>>>>>>>> the others (via the accumulated cost of slice re-configuration on
>>>>>>>>> context switching).
>>>>>>>>>
>>>>>>>>> I am not familiar with typical container setups to be authoritative
>>>>>>>>> here, but intuitively I find it reasonable that a low-level hardware
>>>>>>>>> switch like this would be under the control of a master domain
>>>>>>>>> administrator. ("If you are installing our product in the container
>>>>>>>>> environment, make sure your system administrator enables this
>>>>>>>>> hardware
>>>>>>>>> feature.", "Note to system administrators: Enabling this features
>>>>>>>>> may
>>>>>>>>> negatively affect the performance of other containers.")
>>>>>>>>>
>>>>>>>>> Alternative proposal is for the i915 to apply an "or" filter on all
>>>>>>>>> requested masks and in that way ensure dynamic re-configuration
>>>>>>>>> doesn't happen on context switches, but driven from userspace via
>>>>>>>>> ioctls.
>>>>>>>>>
>>>>>>>>> In other words, should _all_ userspace agree between themselves that
>>>>>>>>> they want to turn off a slice, they would then need to send out a
>>>>>>>>> concerted ioctl storm, where number of needed ioctls equals the
>>>>>>>>> number
>>>>>>>>> of currently active contexts. (This may have its own performance
>>>>>>>>> consequences caused by the barriers needed to modify all context
>>>>>>>>> images.)
>>>>>>>>>
>>>>>>>>> This was deemed acceptable the the media use case, but my concern is
>>>>>>>>> the approach is not elegant and will tie us with the "or" policy in
>>>>>>>>> the ABI. (Performance concerns I haven't evaluated yet, but they
>>>>>>>>> also
>>>>>>>>> may be significant.)
>>>>>>>>>
>>>>>>>>> If we go back thinking about the containers use case, then it
>>>>>>>>> transpires that even though the "or" policy does prevent one
>>>>>>>>> container
>>>>>>>>> from affecting the other from one angle, it also prevents one
>>>>>>>>> container from exercising the feature unless all containers
>>>>>>>>> co-operate.
>>>>>>>>>
>>>>>>>>> As such, we can view the original problem statement where we have an
>>>>>>>>> issue if not everyone co-operates, as conceptually the same just
>>>>>>>>> from
>>>>>>>>> an opposite angle. (Rather than one container incurring the
>>>>>>>>> increased
>>>>>>>>> cost of context switches to the rest, we would have one container
>>>>>>>>> preventing the optimized slice configuration to the other.)
>>>>>>>>>
>>>>>>>>>    From this follows that both proposals require complete
>>>>>>>>> co-operation
>>>>>>>>> from all running userspace to avoid complete control of the feature.
>>>>>>>>>
>>>>>>>>> Since the balance between the benefit of optimized slice
>>>>>>>>> configuration
>>>>>>>>> (or penalty of suboptimal one), versus the penalty of increased
>>>>>>>>> context switch times, cannot be know by the driver (barring
>>>>>>>>> venturing
>>>>>>>>> into the heuristics territory), that is another reason why I find
>>>>>>>>> the
>>>>>>>>> "or" policy in the driver questionable.
>>>>>>>>>
>>>>>>>>> We can also ask a question of - If we go with the "or" policy, why
>>>>>>>>> require N per-context ioctls to modify the global GPU configuration
>>>>>>>>> and not instead add a global driver ioctl to modify the state?
>>>>>>>>>
>>>>>>>>> If a future hardware requires, or enables, the per-context behaviour
>>>>>>>>> in a more efficient way, we could then revisit the problem space.
>>>>>>>>>
>>>>>>>>> In the mean time I see the "or" policy solution as adding some ABI
>>>>>>>>> which doesn't do anything for many use cases without any way for the
>>>>>>>>> sysadmin to enable it. At the same time master sysfs knob at least
>>>>>>>>> enables the sysadmin to make a decision. Here I am thinking about a
>>>>>>>>> random client environment where not all userspace co-operates,
>>>>>>>>> but for
>>>>>>>>> instance user is running the feature aware media stack, and
>>>>>>>>> non-feature aware OpenCL/3d stack.
>>>>>>>>>
>>>>>>>>> I guess the complete story boils down to - is the master sysfs knob
>>>>>>>>> really a problem in container use cases.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Tvrtko
>>>>>>>> Hey Tvrtko,
>>>>>>>>
>>>>>>>> Thanks for summarizing a bunch of discussions.
>>>>>>>> Essentially I agree with every you wrote above.
>>>>>>>>
>>>>>>>> If we have a global setting (determined by the OR policy), what's the
>>>>>>>> point of per context settings?
>>>>>>>>
>>>>>>>> In Dmitry's scenario, all userspace applications will work
>>>>>>>> together to
>>>>>>>> reach the consensus so it sounds like we're reimplementing the policy
>>>>>>>> that is already existing in userspace.
>>>>>>>>
>>>>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully somebody else
>>>>>>>> than me pick one or the other :)
>>>>>>> I'll just mention the voting/consensus approach to see if anyone else
>>>>>>> likes it.
>>>>>>>
>>>>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small, dontcare, large }
>>>>>>> (or some other abstract names).
>>>>>> Yeah, the param name should have the word _HINT_ in it when it's not a
>>>>>> definitive set.
>>>>>>
>>>>>> There's no global setter across containers, only a scenario when
>>>>>> everyone agrees or not. Tallying up the votes and going with a majority
>>>>>> vote might be an option, too.
>>>>>>
>>>>>> Regards, Joonas
>>>>> Trying to test the "everyone agrees" approach here.
>>>> It's not everyone agrees, but the greater good.
>>>
>>> I'm looking forward to the definition of the greater good :)
>>> Tvrtko wanted to avoid the heuristic territory, it seems like we're just
>>> stepping into it.
>>
>> I am not sure that "small, dontcare, large" models brings much. No one
>> would probably set "dontcare" since we have to default new contexts to
>> large to be compatible.
>>
>> Don't know, I still prefer the master knob option. Honestly don't yet
>> see the containers use case as a problem. There is always a master
>> domain in any system where the knob can be enabled if the customers on
>> the system are such to warrant it. On mixed systems enable it or not
>> someone always suffers. And with the knob we are free to add heuristics
>> later, keep the uapi, and just default the knob to on.
> 
> Master knob effectively means dead code behind a switch, that's not very
> upstream friendly.

Hey at least I wasn't proposing a modparam! :)))

Yes it is not the best software practice, upstream or not, however I am 
trying to be pragmatical here and choose the simplest, smallest, good 
enough, and least headache inducing in the future solution.

One way of of looking at the master switch could be like tune your 
system for XYZ - change CPU frequency governor, disable SATA link 
saving, allow i915 media optimized mode. Some live code out, some dead 
code in.

But perhaps discussion is moot since we don't have userspace anyway.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-06-14  8:28                     ` Tvrtko Ursulin
@ 2018-07-18 16:44                       ` Bloomfield, Jon
  2018-07-19  7:04                         ` Joonas Lahtinen
  2018-07-24 21:50                         ` Bloomfield, Jon
  0 siblings, 2 replies; 29+ messages in thread
From: Bloomfield, Jon @ 2018-07-18 16:44 UTC (permalink / raw)
  To: Tvrtko Ursulin, Joonas Lahtinen, Chris Wilson, Landwerlin,
	Lionel G, intel-gfx

> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Tvrtko Ursulin
> Sent: Thursday, June 14, 2018 1:29 AM
> To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Chris Wilson
> <chris@chris-wilson.co.uk>; Landwerlin, Lionel G
> <lionel.g.landwerlin@intel.com>; intel-gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let users
> set sseu configs
> 
> 
> On 13/06/2018 13:49, Joonas Lahtinen wrote:
> > Quoting Tvrtko Ursulin (2018-06-12 15:02:07)
> >>
> >> On 12/06/2018 11:52, Lionel Landwerlin wrote:
> >>> On 12/06/18 11:37, Chris Wilson wrote:
> >>>> Quoting Lionel Landwerlin (2018-06-12 11:33:34)
> >>>>> On 12/06/18 10:20, Joonas Lahtinen wrote:
> >>>>>> Quoting Chris Wilson (2018-06-11 18:02:37)
> >>>>>>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> >>>>>>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
> >>>>>>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
> >>>>>>>>>> There are concerns about denial of service around the per
> >>>>>>>>>> context sseu
> >>>>>>>>>> configuration capability. In a previous commit introducing the
> >>>>>>>>>> capability we allowed it only for capable users. This changes
> >>>>>>>>>> adds a
> >>>>>>>>>> new debugfs entry to let any user configure its own context
> >>>>>>>>>> powergating setup.
> >>>>>>>>> As far as I understood it, Joonas' concerns here are:
> >>>>>>>>>
> >>>>>>>>> 1) That in the containers use case individual containers wouldn't
> be
> >>>>>>>>> able to turn on the sysfs toggle for them.
> >>>>>>>>>
> >>>>>>>>> 2) That also in the containers use case if box admin turns on the
> >>>>>>>>> feature, some containers would potentially start negatively
> >>>>>>>>> affecting
> >>>>>>>>> the others (via the accumulated cost of slice re-configuration on
> >>>>>>>>> context switching).
> >>>>>>>>>
> >>>>>>>>> I am not familiar with typical container setups to be authoritative
> >>>>>>>>> here, but intuitively I find it reasonable that a low-level hardware
> >>>>>>>>> switch like this would be under the control of a master domain
> >>>>>>>>> administrator. ("If you are installing our product in the container
> >>>>>>>>> environment, make sure your system administrator enables this
> >>>>>>>>> hardware
> >>>>>>>>> feature.", "Note to system administrators: Enabling this features
> >>>>>>>>> may
> >>>>>>>>> negatively affect the performance of other containers.")
> >>>>>>>>>
> >>>>>>>>> Alternative proposal is for the i915 to apply an "or" filter on all
> >>>>>>>>> requested masks and in that way ensure dynamic re-
> configuration
> >>>>>>>>> doesn't happen on context switches, but driven from userspace
> via
> >>>>>>>>> ioctls.
> >>>>>>>>>
> >>>>>>>>> In other words, should _all_ userspace agree between
> themselves that
> >>>>>>>>> they want to turn off a slice, they would then need to send out a
> >>>>>>>>> concerted ioctl storm, where number of needed ioctls equals
> the
> >>>>>>>>> number
> >>>>>>>>> of currently active contexts. (This may have its own performance
> >>>>>>>>> consequences caused by the barriers needed to modify all
> context
> >>>>>>>>> images.)
> >>>>>>>>>
> >>>>>>>>> This was deemed acceptable the the media use case, but my
> concern is
> >>>>>>>>> the approach is not elegant and will tie us with the "or" policy in
> >>>>>>>>> the ABI. (Performance concerns I haven't evaluated yet, but
> they
> >>>>>>>>> also
> >>>>>>>>> may be significant.)
> >>>>>>>>>
> >>>>>>>>> If we go back thinking about the containers use case, then it
> >>>>>>>>> transpires that even though the "or" policy does prevent one
> >>>>>>>>> container
> >>>>>>>>> from affecting the other from one angle, it also prevents one
> >>>>>>>>> container from exercising the feature unless all containers
> >>>>>>>>> co-operate.
> >>>>>>>>>
> >>>>>>>>> As such, we can view the original problem statement where we
> have an
> >>>>>>>>> issue if not everyone co-operates, as conceptually the same just
> >>>>>>>>> from
> >>>>>>>>> an opposite angle. (Rather than one container incurring the
> >>>>>>>>> increased
> >>>>>>>>> cost of context switches to the rest, we would have one
> container
> >>>>>>>>> preventing the optimized slice configuration to the other.)
> >>>>>>>>>
> >>>>>>>>>    From this follows that both proposals require complete
> >>>>>>>>> co-operation
> >>>>>>>>> from all running userspace to avoid complete control of the
> feature.
> >>>>>>>>>
> >>>>>>>>> Since the balance between the benefit of optimized slice
> >>>>>>>>> configuration
> >>>>>>>>> (or penalty of suboptimal one), versus the penalty of increased
> >>>>>>>>> context switch times, cannot be know by the driver (barring
> >>>>>>>>> venturing
> >>>>>>>>> into the heuristics territory), that is another reason why I find
> >>>>>>>>> the
> >>>>>>>>> "or" policy in the driver questionable.
> >>>>>>>>>
> >>>>>>>>> We can also ask a question of - If we go with the "or" policy, why
> >>>>>>>>> require N per-context ioctls to modify the global GPU
> configuration
> >>>>>>>>> and not instead add a global driver ioctl to modify the state?
> >>>>>>>>>
> >>>>>>>>> If a future hardware requires, or enables, the per-context
> behaviour
> >>>>>>>>> in a more efficient way, we could then revisit the problem space.
> >>>>>>>>>
> >>>>>>>>> In the mean time I see the "or" policy solution as adding some
> ABI
> >>>>>>>>> which doesn't do anything for many use cases without any way
> for the
> >>>>>>>>> sysadmin to enable it. At the same time master sysfs knob at
> least
> >>>>>>>>> enables the sysadmin to make a decision. Here I am thinking
> about a
> >>>>>>>>> random client environment where not all userspace co-
> operates,
> >>>>>>>>> but for
> >>>>>>>>> instance user is running the feature aware media stack, and
> >>>>>>>>> non-feature aware OpenCL/3d stack.
> >>>>>>>>>
> >>>>>>>>> I guess the complete story boils down to - is the master sysfs
> knob
> >>>>>>>>> really a problem in container use cases.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>>
> >>>>>>>>> Tvrtko
> >>>>>>>> Hey Tvrtko,
> >>>>>>>>
> >>>>>>>> Thanks for summarizing a bunch of discussions.
> >>>>>>>> Essentially I agree with every you wrote above.
> >>>>>>>>
> >>>>>>>> If we have a global setting (determined by the OR policy), what's
> the
> >>>>>>>> point of per context settings?
> >>>>>>>>
> >>>>>>>> In Dmitry's scenario, all userspace applications will work
> >>>>>>>> together to
> >>>>>>>> reach the consensus so it sounds like we're reimplementing the
> policy
> >>>>>>>> that is already existing in userspace.
> >>>>>>>>
> >>>>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully
> somebody else
> >>>>>>>> than me pick one or the other :)
> >>>>>>> I'll just mention the voting/consensus approach to see if anyone
> else
> >>>>>>> likes it.
> >>>>>>>
> >>>>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small,
> dontcare, large }
> >>>>>>> (or some other abstract names).
> >>>>>> Yeah, the param name should have the word _HINT_ in it when it's
> not a
> >>>>>> definitive set.
> >>>>>>
> >>>>>> There's no global setter across containers, only a scenario when
> >>>>>> everyone agrees or not. Tallying up the votes and going with a
> majority
> >>>>>> vote might be an option, too.
> >>>>>>
> >>>>>> Regards, Joonas
> >>>>> Trying to test the "everyone agrees" approach here.
> >>>> It's not everyone agrees, but the greater good.
> >>>
> >>> I'm looking forward to the definition of the greater good :)
> >>> Tvrtko wanted to avoid the heuristic territory, it seems like we're just
> >>> stepping into it.
> >>
> >> I am not sure that "small, dontcare, large" models brings much. No one
> >> would probably set "dontcare" since we have to default new contexts to
> >> large to be compatible.
> >>
> >> Don't know, I still prefer the master knob option. Honestly don't yet
> >> see the containers use case as a problem. There is always a master
> >> domain in any system where the knob can be enabled if the customers on
> >> the system are such to warrant it. On mixed systems enable it or not
> >> someone always suffers. And with the knob we are free to add heuristics
> >> later, keep the uapi, and just default the knob to on.
> >
> > Master knob effectively means dead code behind a switch, that's not very
> > upstream friendly.
> 
> Hey at least I wasn't proposing a modparam! :)))
> 
> Yes it is not the best software practice, upstream or not, however I am
> trying to be pragmatical here and choose the simplest, smallest, good
> enough, and least headache inducing in the future solution.
> 
> One way of of looking at the master switch could be like tune your
> system for XYZ - change CPU frequency governor, disable SATA link
> saving, allow i915 media optimized mode. Some live code out, some dead
> code in.
> 
> But perhaps discussion is moot since we don't have userspace anyway.
> 
> Regards,
> 
> Tvrtko

I'm surprised by the "no deadcode behind knobs comment". We do this all
the time - "display=0" anyone? Or "enable_cmdparser=false"?

Allowing user space to reduce EU performance for the system as a whole
is not a great idea imho. Only sysadmin should have the right to let that
happen, and an admin switch (I WOULD go with a modparam personally) is
a good way to ensure that we can get the optimal media configuration for
dedicated media systems, while for general systems we get the best overall
performance (not just favouring media).

Why would we want to over engineer this?

Jon
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-07-18 16:44                       ` Bloomfield, Jon
@ 2018-07-19  7:04                         ` Joonas Lahtinen
  2018-07-24 21:50                         ` Bloomfield, Jon
  1 sibling, 0 replies; 29+ messages in thread
From: Joonas Lahtinen @ 2018-07-19  7:04 UTC (permalink / raw)
  To: Bloomfield, Jon, Landwerlin, Lionel G, intel-gfx, Chris Wilson,
	Tvrtko Ursulin

Quoting Bloomfield, Jon (2018-07-18 19:44:14)
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> > Tvrtko Ursulin
> > Sent: Thursday, June 14, 2018 1:29 AM
> > To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Chris Wilson
> > <chris@chris-wilson.co.uk>; Landwerlin, Lionel G
> > <lionel.g.landwerlin@intel.com>; intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let users
> > set sseu configs
> > 
> > 
> > On 13/06/2018 13:49, Joonas Lahtinen wrote:
> > > Quoting Tvrtko Ursulin (2018-06-12 15:02:07)
> > >>
> > >> On 12/06/2018 11:52, Lionel Landwerlin wrote:
> > >>> On 12/06/18 11:37, Chris Wilson wrote:
> > >>>> Quoting Lionel Landwerlin (2018-06-12 11:33:34)
> > >>>>> On 12/06/18 10:20, Joonas Lahtinen wrote:
> > >>>>>> Quoting Chris Wilson (2018-06-11 18:02:37)
> > >>>>>>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> > >>>>>>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
> > >>>>>>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
> > >>>>>>>>>> There are concerns about denial of service around the per
> > >>>>>>>>>> context sseu
> > >>>>>>>>>> configuration capability. In a previous commit introducing the
> > >>>>>>>>>> capability we allowed it only for capable users. This changes
> > >>>>>>>>>> adds a
> > >>>>>>>>>> new debugfs entry to let any user configure its own context
> > >>>>>>>>>> powergating setup.
> > >>>>>>>>> As far as I understood it, Joonas' concerns here are:
> > >>>>>>>>>
> > >>>>>>>>> 1) That in the containers use case individual containers wouldn't
> > be
> > >>>>>>>>> able to turn on the sysfs toggle for them.
> > >>>>>>>>>
> > >>>>>>>>> 2) That also in the containers use case if box admin turns on the
> > >>>>>>>>> feature, some containers would potentially start negatively
> > >>>>>>>>> affecting
> > >>>>>>>>> the others (via the accumulated cost of slice re-configuration on
> > >>>>>>>>> context switching).
> > >>>>>>>>>
> > >>>>>>>>> I am not familiar with typical container setups to be authoritative
> > >>>>>>>>> here, but intuitively I find it reasonable that a low-level hardware
> > >>>>>>>>> switch like this would be under the control of a master domain
> > >>>>>>>>> administrator. ("If you are installing our product in the container
> > >>>>>>>>> environment, make sure your system administrator enables this
> > >>>>>>>>> hardware
> > >>>>>>>>> feature.", "Note to system administrators: Enabling this features
> > >>>>>>>>> may
> > >>>>>>>>> negatively affect the performance of other containers.")
> > >>>>>>>>>
> > >>>>>>>>> Alternative proposal is for the i915 to apply an "or" filter on all
> > >>>>>>>>> requested masks and in that way ensure dynamic re-
> > configuration
> > >>>>>>>>> doesn't happen on context switches, but driven from userspace
> > via
> > >>>>>>>>> ioctls.
> > >>>>>>>>>
> > >>>>>>>>> In other words, should _all_ userspace agree between
> > themselves that
> > >>>>>>>>> they want to turn off a slice, they would then need to send out a
> > >>>>>>>>> concerted ioctl storm, where number of needed ioctls equals
> > the
> > >>>>>>>>> number
> > >>>>>>>>> of currently active contexts. (This may have its own performance
> > >>>>>>>>> consequences caused by the barriers needed to modify all
> > context
> > >>>>>>>>> images.)
> > >>>>>>>>>
> > >>>>>>>>> This was deemed acceptable the the media use case, but my
> > concern is
> > >>>>>>>>> the approach is not elegant and will tie us with the "or" policy in
> > >>>>>>>>> the ABI. (Performance concerns I haven't evaluated yet, but
> > they
> > >>>>>>>>> also
> > >>>>>>>>> may be significant.)
> > >>>>>>>>>
> > >>>>>>>>> If we go back thinking about the containers use case, then it
> > >>>>>>>>> transpires that even though the "or" policy does prevent one
> > >>>>>>>>> container
> > >>>>>>>>> from affecting the other from one angle, it also prevents one
> > >>>>>>>>> container from exercising the feature unless all containers
> > >>>>>>>>> co-operate.
> > >>>>>>>>>
> > >>>>>>>>> As such, we can view the original problem statement where we
> > have an
> > >>>>>>>>> issue if not everyone co-operates, as conceptually the same just
> > >>>>>>>>> from
> > >>>>>>>>> an opposite angle. (Rather than one container incurring the
> > >>>>>>>>> increased
> > >>>>>>>>> cost of context switches to the rest, we would have one
> > container
> > >>>>>>>>> preventing the optimized slice configuration to the other.)
> > >>>>>>>>>
> > >>>>>>>>>    From this follows that both proposals require complete
> > >>>>>>>>> co-operation
> > >>>>>>>>> from all running userspace to avoid complete control of the
> > feature.
> > >>>>>>>>>
> > >>>>>>>>> Since the balance between the benefit of optimized slice
> > >>>>>>>>> configuration
> > >>>>>>>>> (or penalty of suboptimal one), versus the penalty of increased
> > >>>>>>>>> context switch times, cannot be know by the driver (barring
> > >>>>>>>>> venturing
> > >>>>>>>>> into the heuristics territory), that is another reason why I find
> > >>>>>>>>> the
> > >>>>>>>>> "or" policy in the driver questionable.
> > >>>>>>>>>
> > >>>>>>>>> We can also ask a question of - If we go with the "or" policy, why
> > >>>>>>>>> require N per-context ioctls to modify the global GPU
> > configuration
> > >>>>>>>>> and not instead add a global driver ioctl to modify the state?
> > >>>>>>>>>
> > >>>>>>>>> If a future hardware requires, or enables, the per-context
> > behaviour
> > >>>>>>>>> in a more efficient way, we could then revisit the problem space.
> > >>>>>>>>>
> > >>>>>>>>> In the mean time I see the "or" policy solution as adding some
> > ABI
> > >>>>>>>>> which doesn't do anything for many use cases without any way
> > for the
> > >>>>>>>>> sysadmin to enable it. At the same time master sysfs knob at
> > least
> > >>>>>>>>> enables the sysadmin to make a decision. Here I am thinking
> > about a
> > >>>>>>>>> random client environment where not all userspace co-
> > operates,
> > >>>>>>>>> but for
> > >>>>>>>>> instance user is running the feature aware media stack, and
> > >>>>>>>>> non-feature aware OpenCL/3d stack.
> > >>>>>>>>>
> > >>>>>>>>> I guess the complete story boils down to - is the master sysfs
> > knob
> > >>>>>>>>> really a problem in container use cases.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>>
> > >>>>>>>>> Tvrtko
> > >>>>>>>> Hey Tvrtko,
> > >>>>>>>>
> > >>>>>>>> Thanks for summarizing a bunch of discussions.
> > >>>>>>>> Essentially I agree with every you wrote above.
> > >>>>>>>>
> > >>>>>>>> If we have a global setting (determined by the OR policy), what's
> > the
> > >>>>>>>> point of per context settings?
> > >>>>>>>>
> > >>>>>>>> In Dmitry's scenario, all userspace applications will work
> > >>>>>>>> together to
> > >>>>>>>> reach the consensus so it sounds like we're reimplementing the
> > policy
> > >>>>>>>> that is already existing in userspace.
> > >>>>>>>>
> > >>>>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully
> > somebody else
> > >>>>>>>> than me pick one or the other :)
> > >>>>>>> I'll just mention the voting/consensus approach to see if anyone
> > else
> > >>>>>>> likes it.
> > >>>>>>>
> > >>>>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small,
> > dontcare, large }
> > >>>>>>> (or some other abstract names).
> > >>>>>> Yeah, the param name should have the word _HINT_ in it when it's
> > not a
> > >>>>>> definitive set.
> > >>>>>>
> > >>>>>> There's no global setter across containers, only a scenario when
> > >>>>>> everyone agrees or not. Tallying up the votes and going with a
> > majority
> > >>>>>> vote might be an option, too.
> > >>>>>>
> > >>>>>> Regards, Joonas
> > >>>>> Trying to test the "everyone agrees" approach here.
> > >>>> It's not everyone agrees, but the greater good.
> > >>>
> > >>> I'm looking forward to the definition of the greater good :)
> > >>> Tvrtko wanted to avoid the heuristic territory, it seems like we're just
> > >>> stepping into it.
> > >>
> > >> I am not sure that "small, dontcare, large" models brings much. No one
> > >> would probably set "dontcare" since we have to default new contexts to
> > >> large to be compatible.
> > >>
> > >> Don't know, I still prefer the master knob option. Honestly don't yet
> > >> see the containers use case as a problem. There is always a master
> > >> domain in any system where the knob can be enabled if the customers on
> > >> the system are such to warrant it. On mixed systems enable it or not
> > >> someone always suffers. And with the knob we are free to add heuristics
> > >> later, keep the uapi, and just default the knob to on.
> > >
> > > Master knob effectively means dead code behind a switch, that's not very
> > > upstream friendly.
> > 
> > Hey at least I wasn't proposing a modparam! :)))
> > 
> > Yes it is not the best software practice, upstream or not, however I am
> > trying to be pragmatical here and choose the simplest, smallest, good
> > enough, and least headache inducing in the future solution.
> > 
> > One way of of looking at the master switch could be like tune your
> > system for XYZ - change CPU frequency governor, disable SATA link
> > saving, allow i915 media optimized mode. Some live code out, some dead
> > code in.
> > 
> > But perhaps discussion is moot since we don't have userspace anyway.
> > 
> > Regards,
> > 
> > Tvrtko
> 
> I'm surprised by the "no deadcode behind knobs comment". We do this all
> the time - "display=0" anyone? Or "enable_cmdparser=false"?

Display-less machines are in the progress of being added to CI as we
speak. And what goes for enable_rc6=0 and enable_execlists=0 options, if
you look closely, they have been gotten rid of already. Something
existing in the codebase doesn't mean it's fine to add more of the same.

> Allowing user space to reduce EU performance for the system as a whole
> is not a great idea imho.

When all the clients ask for it, there should be no problem in reducing
the slice count.

> Only sysadmin should have the right to let that
> happen, and an admin switch (I WOULD go with a modparam personally) is
> a good way to ensure that we can get the optimal media configuration for
> dedicated media systems, while for general systems we get the best overall
> performance (not just favouring media).

There definitely won't be a modparam for it.

Regards, Joonas
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-07-18 16:44                       ` Bloomfield, Jon
  2018-07-19  7:04                         ` Joonas Lahtinen
@ 2018-07-24 21:50                         ` Bloomfield, Jon
  2018-07-30 20:40                           ` Rogozhkin, Dmitry V
  2018-08-02 10:00                           ` Tvrtko Ursulin
  1 sibling, 2 replies; 29+ messages in thread
From: Bloomfield, Jon @ 2018-07-24 21:50 UTC (permalink / raw)
  To: Bloomfield, Jon, Tvrtko Ursulin, Joonas Lahtinen, Chris Wilson,
	Landwerlin, Lionel G, intel-gfx

Gratuitous top posting to re-kick the thread.

For Gen11 we can't have an on/off switch anyway (media simply won't run
with an oncompatible slice config), so let's agree on an api to allow userland
to select the slice configuration for its contexts, for Gen11 only. I'd prefer a
simple {MediaConfig/GeneralConfig} type context param but I really don't
care that much.

For gen9/10 it's arguable whether we need this at all. The effect on media
workloads varies, but I'm guessing normal users (outside a transcode
type environment) will never know they're missing anything. Either way,
we can continue discussing while we progress the gen11 solution.

Jon

> -----Original Message-----
> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Bloomfield, Jon
> Sent: Wednesday, July 18, 2018 9:44 AM
> To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>; Joonas Lahtinen
> <joonas.lahtinen@linux.intel.com>; Chris Wilson <chris@chris-wilson.co.uk>;
> Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>; intel-
> gfx@lists.freedesktop.org
> Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let users
> set sseu configs
> 
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
> > Tvrtko Ursulin
> > Sent: Thursday, June 14, 2018 1:29 AM
> > To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Chris Wilson
> > <chris@chris-wilson.co.uk>; Landwerlin, Lionel G
> > <lionel.g.landwerlin@intel.com>; intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let
> users
> > set sseu configs
> >
> >
> > On 13/06/2018 13:49, Joonas Lahtinen wrote:
> > > Quoting Tvrtko Ursulin (2018-06-12 15:02:07)
> > >>
> > >> On 12/06/2018 11:52, Lionel Landwerlin wrote:
> > >>> On 12/06/18 11:37, Chris Wilson wrote:
> > >>>> Quoting Lionel Landwerlin (2018-06-12 11:33:34)
> > >>>>> On 12/06/18 10:20, Joonas Lahtinen wrote:
> > >>>>>> Quoting Chris Wilson (2018-06-11 18:02:37)
> > >>>>>>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> > >>>>>>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
> > >>>>>>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
> > >>>>>>>>>> There are concerns about denial of service around the per
> > >>>>>>>>>> context sseu
> > >>>>>>>>>> configuration capability. In a previous commit introducing the
> > >>>>>>>>>> capability we allowed it only for capable users. This changes
> > >>>>>>>>>> adds a
> > >>>>>>>>>> new debugfs entry to let any user configure its own context
> > >>>>>>>>>> powergating setup.
> > >>>>>>>>> As far as I understood it, Joonas' concerns here are:
> > >>>>>>>>>
> > >>>>>>>>> 1) That in the containers use case individual containers
> wouldn't
> > be
> > >>>>>>>>> able to turn on the sysfs toggle for them.
> > >>>>>>>>>
> > >>>>>>>>> 2) That also in the containers use case if box admin turns on
> the
> > >>>>>>>>> feature, some containers would potentially start negatively
> > >>>>>>>>> affecting
> > >>>>>>>>> the others (via the accumulated cost of slice re-configuration
> on
> > >>>>>>>>> context switching).
> > >>>>>>>>>
> > >>>>>>>>> I am not familiar with typical container setups to be
> authoritative
> > >>>>>>>>> here, but intuitively I find it reasonable that a low-level
> hardware
> > >>>>>>>>> switch like this would be under the control of a master domain
> > >>>>>>>>> administrator. ("If you are installing our product in the
> container
> > >>>>>>>>> environment, make sure your system administrator enables
> this
> > >>>>>>>>> hardware
> > >>>>>>>>> feature.", "Note to system administrators: Enabling this
> features
> > >>>>>>>>> may
> > >>>>>>>>> negatively affect the performance of other containers.")
> > >>>>>>>>>
> > >>>>>>>>> Alternative proposal is for the i915 to apply an "or" filter on all
> > >>>>>>>>> requested masks and in that way ensure dynamic re-
> > configuration
> > >>>>>>>>> doesn't happen on context switches, but driven from
> userspace
> > via
> > >>>>>>>>> ioctls.
> > >>>>>>>>>
> > >>>>>>>>> In other words, should _all_ userspace agree between
> > themselves that
> > >>>>>>>>> they want to turn off a slice, they would then need to send out
> a
> > >>>>>>>>> concerted ioctl storm, where number of needed ioctls equals
> > the
> > >>>>>>>>> number
> > >>>>>>>>> of currently active contexts. (This may have its own
> performance
> > >>>>>>>>> consequences caused by the barriers needed to modify all
> > context
> > >>>>>>>>> images.)
> > >>>>>>>>>
> > >>>>>>>>> This was deemed acceptable the the media use case, but my
> > concern is
> > >>>>>>>>> the approach is not elegant and will tie us with the "or" policy
> in
> > >>>>>>>>> the ABI. (Performance concerns I haven't evaluated yet, but
> > they
> > >>>>>>>>> also
> > >>>>>>>>> may be significant.)
> > >>>>>>>>>
> > >>>>>>>>> If we go back thinking about the containers use case, then it
> > >>>>>>>>> transpires that even though the "or" policy does prevent one
> > >>>>>>>>> container
> > >>>>>>>>> from affecting the other from one angle, it also prevents one
> > >>>>>>>>> container from exercising the feature unless all containers
> > >>>>>>>>> co-operate.
> > >>>>>>>>>
> > >>>>>>>>> As such, we can view the original problem statement where
> we
> > have an
> > >>>>>>>>> issue if not everyone co-operates, as conceptually the same
> just
> > >>>>>>>>> from
> > >>>>>>>>> an opposite angle. (Rather than one container incurring the
> > >>>>>>>>> increased
> > >>>>>>>>> cost of context switches to the rest, we would have one
> > container
> > >>>>>>>>> preventing the optimized slice configuration to the other.)
> > >>>>>>>>>
> > >>>>>>>>>    From this follows that both proposals require complete
> > >>>>>>>>> co-operation
> > >>>>>>>>> from all running userspace to avoid complete control of the
> > feature.
> > >>>>>>>>>
> > >>>>>>>>> Since the balance between the benefit of optimized slice
> > >>>>>>>>> configuration
> > >>>>>>>>> (or penalty of suboptimal one), versus the penalty of increased
> > >>>>>>>>> context switch times, cannot be know by the driver (barring
> > >>>>>>>>> venturing
> > >>>>>>>>> into the heuristics territory), that is another reason why I find
> > >>>>>>>>> the
> > >>>>>>>>> "or" policy in the driver questionable.
> > >>>>>>>>>
> > >>>>>>>>> We can also ask a question of - If we go with the "or" policy,
> why
> > >>>>>>>>> require N per-context ioctls to modify the global GPU
> > configuration
> > >>>>>>>>> and not instead add a global driver ioctl to modify the state?
> > >>>>>>>>>
> > >>>>>>>>> If a future hardware requires, or enables, the per-context
> > behaviour
> > >>>>>>>>> in a more efficient way, we could then revisit the problem
> space.
> > >>>>>>>>>
> > >>>>>>>>> In the mean time I see the "or" policy solution as adding some
> > ABI
> > >>>>>>>>> which doesn't do anything for many use cases without any way
> > for the
> > >>>>>>>>> sysadmin to enable it. At the same time master sysfs knob at
> > least
> > >>>>>>>>> enables the sysadmin to make a decision. Here I am thinking
> > about a
> > >>>>>>>>> random client environment where not all userspace co-
> > operates,
> > >>>>>>>>> but for
> > >>>>>>>>> instance user is running the feature aware media stack, and
> > >>>>>>>>> non-feature aware OpenCL/3d stack.
> > >>>>>>>>>
> > >>>>>>>>> I guess the complete story boils down to - is the master sysfs
> > knob
> > >>>>>>>>> really a problem in container use cases.
> > >>>>>>>>>
> > >>>>>>>>> Regards,
> > >>>>>>>>>
> > >>>>>>>>> Tvrtko
> > >>>>>>>> Hey Tvrtko,
> > >>>>>>>>
> > >>>>>>>> Thanks for summarizing a bunch of discussions.
> > >>>>>>>> Essentially I agree with every you wrote above.
> > >>>>>>>>
> > >>>>>>>> If we have a global setting (determined by the OR policy), what's
> > the
> > >>>>>>>> point of per context settings?
> > >>>>>>>>
> > >>>>>>>> In Dmitry's scenario, all userspace applications will work
> > >>>>>>>> together to
> > >>>>>>>> reach the consensus so it sounds like we're reimplementing the
> > policy
> > >>>>>>>> that is already existing in userspace.
> > >>>>>>>>
> > >>>>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully
> > somebody else
> > >>>>>>>> than me pick one or the other :)
> > >>>>>>> I'll just mention the voting/consensus approach to see if anyone
> > else
> > >>>>>>> likes it.
> > >>>>>>>
> > >>>>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small,
> > dontcare, large }
> > >>>>>>> (or some other abstract names).
> > >>>>>> Yeah, the param name should have the word _HINT_ in it when it's
> > not a
> > >>>>>> definitive set.
> > >>>>>>
> > >>>>>> There's no global setter across containers, only a scenario when
> > >>>>>> everyone agrees or not. Tallying up the votes and going with a
> > majority
> > >>>>>> vote might be an option, too.
> > >>>>>>
> > >>>>>> Regards, Joonas
> > >>>>> Trying to test the "everyone agrees" approach here.
> > >>>> It's not everyone agrees, but the greater good.
> > >>>
> > >>> I'm looking forward to the definition of the greater good :)
> > >>> Tvrtko wanted to avoid the heuristic territory, it seems like we're just
> > >>> stepping into it.
> > >>
> > >> I am not sure that "small, dontcare, large" models brings much. No one
> > >> would probably set "dontcare" since we have to default new contexts
> to
> > >> large to be compatible.
> > >>
> > >> Don't know, I still prefer the master knob option. Honestly don't yet
> > >> see the containers use case as a problem. There is always a master
> > >> domain in any system where the knob can be enabled if the customers
> on
> > >> the system are such to warrant it. On mixed systems enable it or not
> > >> someone always suffers. And with the knob we are free to add
> heuristics
> > >> later, keep the uapi, and just default the knob to on.
> > >
> > > Master knob effectively means dead code behind a switch, that's not
> very
> > > upstream friendly.
> >
> > Hey at least I wasn't proposing a modparam! :)))
> >
> > Yes it is not the best software practice, upstream or not, however I am
> > trying to be pragmatical here and choose the simplest, smallest, good
> > enough, and least headache inducing in the future solution.
> >
> > One way of of looking at the master switch could be like tune your
> > system for XYZ - change CPU frequency governor, disable SATA link
> > saving, allow i915 media optimized mode. Some live code out, some dead
> > code in.
> >
> > But perhaps discussion is moot since we don't have userspace anyway.
> >
> > Regards,
> >
> > Tvrtko
> 
> I'm surprised by the "no deadcode behind knobs comment". We do this all
> the time - "display=0" anyone? Or "enable_cmdparser=false"?
> 
> Allowing user space to reduce EU performance for the system as a whole
> is not a great idea imho. Only sysadmin should have the right to let that
> happen, and an admin switch (I WOULD go with a modparam personally) is
> a good way to ensure that we can get the optimal media configuration for
> dedicated media systems, while for general systems we get the best overall
> performance (not just favouring media).
> 
> Why would we want to over engineer this?
> 
> Jon
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-07-24 21:50                         ` Bloomfield, Jon
@ 2018-07-30 20:40                           ` Rogozhkin, Dmitry V
  2018-08-02 10:00                           ` Tvrtko Ursulin
  1 sibling, 0 replies; 29+ messages in thread
From: Rogozhkin, Dmitry V @ 2018-07-30 20:40 UTC (permalink / raw)
  To: Bloomfield, Jon, intel-gfx, joonas.lahtinen, Landwerlin,
	Lionel G, chris, tvrtko.ursulin

On Tue, 2018-07-24 at 21:50 +0000, Bloomfield, Jon wrote:
> Gratuitous top posting to re-kick the thread.
> 
> For Gen11 we can't have an on/off switch anyway (media simply won't
> run
> with an oncompatible slice config), so let's agree on an api to allow
> userland
> to select the slice configuration for its contexts, for Gen11 only.
> I'd prefer a
> simple {MediaConfig/GeneralConfig} type context param but I really
> don't
> care that much.

Does anyone work on breaking the patch series for Gen11 and Gen8/9/10
with Gen11 preceding since this is required for the functional side? We
hope to have userspace media driver ICL patches available pretty soon
and would be really nice to have kernel side prepared as much as
possible... If no one is working on this, can you, Lionel, help,
please?

> 
> For gen9/10 it's arguable whether we need this at all. The effect on
> media
> workloads varies, but I'm guessing normal users (outside a transcode
> type environment) will never know they're missing anything. Either
> way,
> we can continue discussing while we progress the gen11 solution.
> 
> Jon
> 
> > -----Original Message-----
> > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf
> > Of
> > Bloomfield, Jon
> > Sent: Wednesday, July 18, 2018 9:44 AM
> > To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>; Joonas
> > Lahtinen
> > <joonas.lahtinen@linux.intel.com>; Chris Wilson <chris@chris-wilson
> > .co.uk>;
> > Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>; intel-
> > gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry
> > to let users
> > set sseu configs
> > 
> > > -----Original Message-----
> > > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On
> > > Behalf Of
> > > Tvrtko Ursulin
> > > Sent: Thursday, June 14, 2018 1:29 AM
> > > To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Chris
> > > Wilson
> > > <chris@chris-wilson.co.uk>; Landwerlin, Lionel G
> > > <lionel.g.landwerlin@intel.com>; intel-gfx@lists.freedesktop.org
> > > Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs
> > > entry to let
> > 
> > users
> > > set sseu configs
> > > 
> > > 
> > > On 13/06/2018 13:49, Joonas Lahtinen wrote:
> > > > Quoting Tvrtko Ursulin (2018-06-12 15:02:07)
> > > > > 
> > > > > On 12/06/2018 11:52, Lionel Landwerlin wrote:
> > > > > > On 12/06/18 11:37, Chris Wilson wrote:
> > > > > > > Quoting Lionel Landwerlin (2018-06-12 11:33:34)
> > > > > > > > On 12/06/18 10:20, Joonas Lahtinen wrote:
> > > > > > > > > Quoting Chris Wilson (2018-06-11 18:02:37)
> > > > > > > > > > Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> > > > > > > > > > > On 11/06/18 13:10, Tvrtko Ursulin wrote:
> > > > > > > > > > > > On 30/05/2018 15:33, Lionel Landwerlin wrote:
> > > > > > > > > > > > > There are concerns about denial of service
> > > > > > > > > > > > > around the per
> > > > > > > > > > > > > context sseu
> > > > > > > > > > > > > configuration capability. In a previous
> > > > > > > > > > > > > commit introducing the
> > > > > > > > > > > > > capability we allowed it only for capable
> > > > > > > > > > > > > users. This changes
> > > > > > > > > > > > > adds a
> > > > > > > > > > > > > new debugfs entry to let any user configure
> > > > > > > > > > > > > its own context
> > > > > > > > > > > > > powergating setup.
> > > > > > > > > > > > 
> > > > > > > > > > > > As far as I understood it, Joonas' concerns
> > > > > > > > > > > > here are:
> > > > > > > > > > > > 
> > > > > > > > > > > > 1) That in the containers use case individual
> > > > > > > > > > > > containers
> > 
> > wouldn't
> > > be
> > > > > > > > > > > > able to turn on the sysfs toggle for them.
> > > > > > > > > > > > 
> > > > > > > > > > > > 2) That also in the containers use case if box
> > > > > > > > > > > > admin turns on
> > 
> > the
> > > > > > > > > > > > feature, some containers would potentially
> > > > > > > > > > > > start negatively
> > > > > > > > > > > > affecting
> > > > > > > > > > > > the others (via the accumulated cost of slice
> > > > > > > > > > > > re-configuration
> > 
> > on
> > > > > > > > > > > > context switching).
> > > > > > > > > > > > 
> > > > > > > > > > > > I am not familiar with typical container setups
> > > > > > > > > > > > to be
> > 
> > authoritative
> > > > > > > > > > > > here, but intuitively I find it reasonable that
> > > > > > > > > > > > a low-level
> > 
> > hardware
> > > > > > > > > > > > switch like this would be under the control of
> > > > > > > > > > > > a master domain
> > > > > > > > > > > > administrator. ("If you are installing our
> > > > > > > > > > > > product in the
> > 
> > container
> > > > > > > > > > > > environment, make sure your system
> > > > > > > > > > > > administrator enables
> > 
> > this
> > > > > > > > > > > > hardware
> > > > > > > > > > > > feature.", "Note to system administrators:
> > > > > > > > > > > > Enabling this
> > 
> > features
> > > > > > > > > > > > may
> > > > > > > > > > > > negatively affect the performance of other
> > > > > > > > > > > > containers.")
> > > > > > > > > > > > 
> > > > > > > > > > > > Alternative proposal is for the i915 to apply
> > > > > > > > > > > > an "or" filter on all
> > > > > > > > > > > > requested masks and in that way ensure dynamic
> > > > > > > > > > > > re-
> > > 
> > > configuration
> > > > > > > > > > > > doesn't happen on context switches, but driven
> > > > > > > > > > > > from
> > 
> > userspace
> > > via
> > > > > > > > > > > > ioctls.
> > > > > > > > > > > > 
> > > > > > > > > > > > In other words, should _all_ userspace agree
> > > > > > > > > > > > between
> > > 
> > > themselves that
> > > > > > > > > > > > they want to turn off a slice, they would then
> > > > > > > > > > > > need to send out
> > 
> > a
> > > > > > > > > > > > concerted ioctl storm, where number of needed
> > > > > > > > > > > > ioctls equals
> > > 
> > > the
> > > > > > > > > > > > number
> > > > > > > > > > > > of currently active contexts. (This may have
> > > > > > > > > > > > its own
> > 
> > performance
> > > > > > > > > > > > consequences caused by the barriers needed to
> > > > > > > > > > > > modify all
> > > 
> > > context
> > > > > > > > > > > > images.)
> > > > > > > > > > > > 
> > > > > > > > > > > > This was deemed acceptable the the media use
> > > > > > > > > > > > case, but my
> > > 
> > > concern is
> > > > > > > > > > > > the approach is not elegant and will tie us
> > > > > > > > > > > > with the "or" policy
> > 
> > in
> > > > > > > > > > > > the ABI. (Performance concerns I haven't
> > > > > > > > > > > > evaluated yet, but
> > > 
> > > they
> > > > > > > > > > > > also
> > > > > > > > > > > > may be significant.)
> > > > > > > > > > > > 
> > > > > > > > > > > > If we go back thinking about the containers use
> > > > > > > > > > > > case, then it
> > > > > > > > > > > > transpires that even though the "or" policy
> > > > > > > > > > > > does prevent one
> > > > > > > > > > > > container
> > > > > > > > > > > > from affecting the other from one angle, it
> > > > > > > > > > > > also prevents one
> > > > > > > > > > > > container from exercising the feature unless
> > > > > > > > > > > > all containers
> > > > > > > > > > > > co-operate.
> > > > > > > > > > > > 
> > > > > > > > > > > > As such, we can view the original problem
> > > > > > > > > > > > statement where
> > 
> > we
> > > have an
> > > > > > > > > > > > issue if not everyone co-operates, as
> > > > > > > > > > > > conceptually the same
> > 
> > just
> > > > > > > > > > > > from
> > > > > > > > > > > > an opposite angle. (Rather than one container
> > > > > > > > > > > > incurring the
> > > > > > > > > > > > increased
> > > > > > > > > > > > cost of context switches to the rest, we would
> > > > > > > > > > > > have one
> > > 
> > > container
> > > > > > > > > > > > preventing the optimized slice configuration to
> > > > > > > > > > > > the other.)
> > > > > > > > > > > > 
> > > > > > > > > > > >    From this follows that both proposals
> > > > > > > > > > > > require complete
> > > > > > > > > > > > co-operation
> > > > > > > > > > > > from all running userspace to avoid complete
> > > > > > > > > > > > control of the
> > > 
> > > feature.
> > > > > > > > > > > > 
> > > > > > > > > > > > Since the balance between the benefit of
> > > > > > > > > > > > optimized slice
> > > > > > > > > > > > configuration
> > > > > > > > > > > > (or penalty of suboptimal one), versus the
> > > > > > > > > > > > penalty of increased
> > > > > > > > > > > > context switch times, cannot be know by the
> > > > > > > > > > > > driver (barring
> > > > > > > > > > > > venturing
> > > > > > > > > > > > into the heuristics territory), that is another
> > > > > > > > > > > > reason why I find
> > > > > > > > > > > > the
> > > > > > > > > > > > "or" policy in the driver questionable.
> > > > > > > > > > > > 
> > > > > > > > > > > > We can also ask a question of - If we go with
> > > > > > > > > > > > the "or" policy,
> > 
> > why
> > > > > > > > > > > > require N per-context ioctls to modify the
> > > > > > > > > > > > global GPU
> > > 
> > > configuration
> > > > > > > > > > > > and not instead add a global driver ioctl to
> > > > > > > > > > > > modify the state?
> > > > > > > > > > > > 
> > > > > > > > > > > > If a future hardware requires, or enables, the
> > > > > > > > > > > > per-context
> > > 
> > > behaviour
> > > > > > > > > > > > in a more efficient way, we could then revisit
> > > > > > > > > > > > the problem
> > 
> > space.
> > > > > > > > > > > > 
> > > > > > > > > > > > In the mean time I see the "or" policy solution
> > > > > > > > > > > > as adding some
> > > 
> > > ABI
> > > > > > > > > > > > which doesn't do anything for many use cases
> > > > > > > > > > > > without any way
> > > 
> > > for the
> > > > > > > > > > > > sysadmin to enable it. At the same time master
> > > > > > > > > > > > sysfs knob at
> > > 
> > > least
> > > > > > > > > > > > enables the sysadmin to make a decision. Here I
> > > > > > > > > > > > am thinking
> > > 
> > > about a
> > > > > > > > > > > > random client environment where not all
> > > > > > > > > > > > userspace co-
> > > 
> > > operates,
> > > > > > > > > > > > but for
> > > > > > > > > > > > instance user is running the feature aware
> > > > > > > > > > > > media stack, and
> > > > > > > > > > > > non-feature aware OpenCL/3d stack.
> > > > > > > > > > > > 
> > > > > > > > > > > > I guess the complete story boils down to - is
> > > > > > > > > > > > the master sysfs
> > > 
> > > knob
> > > > > > > > > > > > really a problem in container use cases.
> > > > > > > > > > > > 
> > > > > > > > > > > > Regards,
> > > > > > > > > > > > 
> > > > > > > > > > > > Tvrtko
> > > > > > > > > > > 
> > > > > > > > > > > Hey Tvrtko,
> > > > > > > > > > > 
> > > > > > > > > > > Thanks for summarizing a bunch of discussions.
> > > > > > > > > > > Essentially I agree with every you wrote above.
> > > > > > > > > > > 
> > > > > > > > > > > If we have a global setting (determined by the OR
> > > > > > > > > > > policy), what's
> > > 
> > > the
> > > > > > > > > > > point of per context settings?
> > > > > > > > > > > 
> > > > > > > > > > > In Dmitry's scenario, all userspace applications
> > > > > > > > > > > will work
> > > > > > > > > > > together to
> > > > > > > > > > > reach the consensus so it sounds like we're
> > > > > > > > > > > reimplementing the
> > > 
> > > policy
> > > > > > > > > > > that is already existing in userspace.
> > > > > > > > > > > 
> > > > > > > > > > > Anyway, I'm implementing Joonas' suggestion.
> > > > > > > > > > > Hopefully
> > > 
> > > somebody else
> > > > > > > > > > > than me pick one or the other :)
> > > > > > > > > > 
> > > > > > > > > > I'll just mention the voting/consensus approach to
> > > > > > > > > > see if anyone
> > > 
> > > else
> > > > > > > > > > likes it.
> > > > > > > > > > 
> > > > > > > > > > Each context has a CONTEXT_PARAM_HINT_SSEU { small,
> > > 
> > > dontcare, large }
> > > > > > > > > > (or some other abstract names).
> > > > > > > > > 
> > > > > > > > > Yeah, the param name should have the word _HINT_ in
> > > > > > > > > it when it's
> > > 
> > > not a
> > > > > > > > > definitive set.
> > > > > > > > > 
> > > > > > > > > There's no global setter across containers, only a
> > > > > > > > > scenario when
> > > > > > > > > everyone agrees or not. Tallying up the votes and
> > > > > > > > > going with a
> > > 
> > > majority
> > > > > > > > > vote might be an option, too.
> > > > > > > > > 
> > > > > > > > > Regards, Joonas
> > > > > > > > 
> > > > > > > > Trying to test the "everyone agrees" approach here.
> > > > > > > 
> > > > > > > It's not everyone agrees, but the greater good.
> > > > > > 
> > > > > > I'm looking forward to the definition of the greater good
> > > > > > :)
> > > > > > Tvrtko wanted to avoid the heuristic territory, it seems
> > > > > > like we're just
> > > > > > stepping into it.
> > > > > 
> > > > > I am not sure that "small, dontcare, large" models brings
> > > > > much. No one
> > > > > would probably set "dontcare" since we have to default new
> > > > > contexts
> > 
> > to
> > > > > large to be compatible.
> > > > > 
> > > > > Don't know, I still prefer the master knob option. Honestly
> > > > > don't yet
> > > > > see the containers use case as a problem. There is always a
> > > > > master
> > > > > domain in any system where the knob can be enabled if the
> > > > > customers
> > 
> > on
> > > > > the system are such to warrant it. On mixed systems enable it
> > > > > or not
> > > > > someone always suffers. And with the knob we are free to add
> > 
> > heuristics
> > > > > later, keep the uapi, and just default the knob to on.
> > > > 
> > > > Master knob effectively means dead code behind a switch, that's
> > > > not
> > 
> > very
> > > > upstream friendly.
> > > 
> > > Hey at least I wasn't proposing a modparam! :)))
> > > 
> > > Yes it is not the best software practice, upstream or not,
> > > however I am
> > > trying to be pragmatical here and choose the simplest, smallest,
> > > good
> > > enough, and least headache inducing in the future solution.
> > > 
> > > One way of of looking at the master switch could be like tune
> > > your
> > > system for XYZ - change CPU frequency governor, disable SATA link
> > > saving, allow i915 media optimized mode. Some live code out, some
> > > dead
> > > code in.
> > > 
> > > But perhaps discussion is moot since we don't have userspace
> > > anyway.
> > > 
> > > Regards,
> > > 
> > > Tvrtko
> > 
> > I'm surprised by the "no deadcode behind knobs comment". We do this
> > all
> > the time - "display=0" anyone? Or "enable_cmdparser=false"?
> > 
> > Allowing user space to reduce EU performance for the system as a
> > whole
> > is not a great idea imho. Only sysadmin should have the right to
> > let that
> > happen, and an admin switch (I WOULD go with a modparam personally)
> > is
> > a good way to ensure that we can get the optimal media
> > configuration for
> > dedicated media systems, while for general systems we get the best
> > overall
> > performance (not just favouring media).
> > 
> > Why would we want to over engineer this?
> > 
> > Jon
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-07-24 21:50                         ` Bloomfield, Jon
  2018-07-30 20:40                           ` Rogozhkin, Dmitry V
@ 2018-08-02 10:00                           ` Tvrtko Ursulin
  2018-08-02 19:24                             ` Rogozhkin, Dmitry V
  1 sibling, 1 reply; 29+ messages in thread
From: Tvrtko Ursulin @ 2018-08-02 10:00 UTC (permalink / raw)
  To: Bloomfield, Jon, Joonas Lahtinen, Chris Wilson, Landwerlin,
	Lionel G, intel-gfx


[Picking this point in the thread to reply on some points mentioned by 
folks in the whole thread.]

I don't remember if any patches from Lionel's series actually had r-b's, 
but a few people including myself have certainly been reviewing them. If 
I had applied the final r-b I wouldn't have made much difference due 
lack of userspace and disagreement on the DoS mitigation story. So to 
say effectively put your money where your mouth is and review is not 
entirely fair.

Suggestion to add a master sysfs switch was to alleviate the DoS 
concerns because dynamic switching has a cost towards all GPU clients, 
not that it just has potential to slow down media.

Suggestion was also that this switch becomes immutable and defaults to 
"allow" on Gen11 onwards.

I preferred sysfs versus a modparam since it makes testing (Both for 
developers and users (what config works better for my use case?) easier.)

I am fine with the suggestion to drive the Gen11 part first, which 
removes the need for any of this.

Patch series is already (AFAIR) nicely split so only the last patch 
needs to be dropped.

If at some point we decide to revisit the Gen8/9 story, we can 
evaluate/measure whether any master switch is actually warranted. I 
think Lionel did some micro-benchmarking which showed impact is not so 
severe, so perhaps for real-world use cases it would be even less.

I can re-read the public patches and review them, or perhaps even adopt 
them if they have been orphaned. Have to check with Francesco first 
before I commit to the latter.

On the uAPI front interface looked fine to me.

One recent interesting development are Mesa patches posted to mesa-dev 
(https://lists.freedesktop.org/archives/mesa-dev/2018-July/200607.html) 
which add EGL_CONTEXT_load_type extension (low/medium/high). This would 
need some sort of mapping between low/medium/high to more specific 
settings but still seems okay to me.

This may bring another (open source?) user for the patches. Which Gen's 
they are interested in is also a question.

Regards,

Tvrtko

On 24/07/2018 22:50, Bloomfield, Jon wrote:
> Gratuitous top posting to re-kick the thread.
> 
> For Gen11 we can't have an on/off switch anyway (media simply won't run
> with an oncompatible slice config), so let's agree on an api to allow userland
> to select the slice configuration for its contexts, for Gen11 only. I'd prefer a
> simple {MediaConfig/GeneralConfig} type context param but I really don't
> care that much.
> 
> For gen9/10 it's arguable whether we need this at all. The effect on media
> workloads varies, but I'm guessing normal users (outside a transcode
> type environment) will never know they're missing anything. Either way,
> we can continue discussing while we progress the gen11 solution.
> 
> Jon
> 
>> -----Original Message-----
>> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
>> Bloomfield, Jon
>> Sent: Wednesday, July 18, 2018 9:44 AM
>> To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>; Joonas Lahtinen
>> <joonas.lahtinen@linux.intel.com>; Chris Wilson <chris@chris-wilson.co.uk>;
>> Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>; intel-
>> gfx@lists.freedesktop.org
>> Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let users
>> set sseu configs
>>
>>> -----Original Message-----
>>> From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On Behalf Of
>>> Tvrtko Ursulin
>>> Sent: Thursday, June 14, 2018 1:29 AM
>>> To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Chris Wilson
>>> <chris@chris-wilson.co.uk>; Landwerlin, Lionel G
>>> <lionel.g.landwerlin@intel.com>; intel-gfx@lists.freedesktop.org
>>> Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs entry to let
>> users
>>> set sseu configs
>>>
>>>
>>> On 13/06/2018 13:49, Joonas Lahtinen wrote:
>>>> Quoting Tvrtko Ursulin (2018-06-12 15:02:07)
>>>>>
>>>>> On 12/06/2018 11:52, Lionel Landwerlin wrote:
>>>>>> On 12/06/18 11:37, Chris Wilson wrote:
>>>>>>> Quoting Lionel Landwerlin (2018-06-12 11:33:34)
>>>>>>>> On 12/06/18 10:20, Joonas Lahtinen wrote:
>>>>>>>>> Quoting Chris Wilson (2018-06-11 18:02:37)
>>>>>>>>>> Quoting Lionel Landwerlin (2018-06-11 14:46:07)
>>>>>>>>>>> On 11/06/18 13:10, Tvrtko Ursulin wrote:
>>>>>>>>>>>> On 30/05/2018 15:33, Lionel Landwerlin wrote:
>>>>>>>>>>>>> There are concerns about denial of service around the per
>>>>>>>>>>>>> context sseu
>>>>>>>>>>>>> configuration capability. In a previous commit introducing the
>>>>>>>>>>>>> capability we allowed it only for capable users. This changes
>>>>>>>>>>>>> adds a
>>>>>>>>>>>>> new debugfs entry to let any user configure its own context
>>>>>>>>>>>>> powergating setup.
>>>>>>>>>>>> As far as I understood it, Joonas' concerns here are:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) That in the containers use case individual containers
>> wouldn't
>>> be
>>>>>>>>>>>> able to turn on the sysfs toggle for them.
>>>>>>>>>>>>
>>>>>>>>>>>> 2) That also in the containers use case if box admin turns on
>> the
>>>>>>>>>>>> feature, some containers would potentially start negatively
>>>>>>>>>>>> affecting
>>>>>>>>>>>> the others (via the accumulated cost of slice re-configuration
>> on
>>>>>>>>>>>> context switching).
>>>>>>>>>>>>
>>>>>>>>>>>> I am not familiar with typical container setups to be
>> authoritative
>>>>>>>>>>>> here, but intuitively I find it reasonable that a low-level
>> hardware
>>>>>>>>>>>> switch like this would be under the control of a master domain
>>>>>>>>>>>> administrator. ("If you are installing our product in the
>> container
>>>>>>>>>>>> environment, make sure your system administrator enables
>> this
>>>>>>>>>>>> hardware
>>>>>>>>>>>> feature.", "Note to system administrators: Enabling this
>> features
>>>>>>>>>>>> may
>>>>>>>>>>>> negatively affect the performance of other containers.")
>>>>>>>>>>>>
>>>>>>>>>>>> Alternative proposal is for the i915 to apply an "or" filter on all
>>>>>>>>>>>> requested masks and in that way ensure dynamic re-
>>> configuration
>>>>>>>>>>>> doesn't happen on context switches, but driven from
>> userspace
>>> via
>>>>>>>>>>>> ioctls.
>>>>>>>>>>>>
>>>>>>>>>>>> In other words, should _all_ userspace agree between
>>> themselves that
>>>>>>>>>>>> they want to turn off a slice, they would then need to send out
>> a
>>>>>>>>>>>> concerted ioctl storm, where number of needed ioctls equals
>>> the
>>>>>>>>>>>> number
>>>>>>>>>>>> of currently active contexts. (This may have its own
>> performance
>>>>>>>>>>>> consequences caused by the barriers needed to modify all
>>> context
>>>>>>>>>>>> images.)
>>>>>>>>>>>>
>>>>>>>>>>>> This was deemed acceptable the the media use case, but my
>>> concern is
>>>>>>>>>>>> the approach is not elegant and will tie us with the "or" policy
>> in
>>>>>>>>>>>> the ABI. (Performance concerns I haven't evaluated yet, but
>>> they
>>>>>>>>>>>> also
>>>>>>>>>>>> may be significant.)
>>>>>>>>>>>>
>>>>>>>>>>>> If we go back thinking about the containers use case, then it
>>>>>>>>>>>> transpires that even though the "or" policy does prevent one
>>>>>>>>>>>> container
>>>>>>>>>>>> from affecting the other from one angle, it also prevents one
>>>>>>>>>>>> container from exercising the feature unless all containers
>>>>>>>>>>>> co-operate.
>>>>>>>>>>>>
>>>>>>>>>>>> As such, we can view the original problem statement where
>> we
>>> have an
>>>>>>>>>>>> issue if not everyone co-operates, as conceptually the same
>> just
>>>>>>>>>>>> from
>>>>>>>>>>>> an opposite angle. (Rather than one container incurring the
>>>>>>>>>>>> increased
>>>>>>>>>>>> cost of context switches to the rest, we would have one
>>> container
>>>>>>>>>>>> preventing the optimized slice configuration to the other.)
>>>>>>>>>>>>
>>>>>>>>>>>>     From this follows that both proposals require complete
>>>>>>>>>>>> co-operation
>>>>>>>>>>>> from all running userspace to avoid complete control of the
>>> feature.
>>>>>>>>>>>>
>>>>>>>>>>>> Since the balance between the benefit of optimized slice
>>>>>>>>>>>> configuration
>>>>>>>>>>>> (or penalty of suboptimal one), versus the penalty of increased
>>>>>>>>>>>> context switch times, cannot be know by the driver (barring
>>>>>>>>>>>> venturing
>>>>>>>>>>>> into the heuristics territory), that is another reason why I find
>>>>>>>>>>>> the
>>>>>>>>>>>> "or" policy in the driver questionable.
>>>>>>>>>>>>
>>>>>>>>>>>> We can also ask a question of - If we go with the "or" policy,
>> why
>>>>>>>>>>>> require N per-context ioctls to modify the global GPU
>>> configuration
>>>>>>>>>>>> and not instead add a global driver ioctl to modify the state?
>>>>>>>>>>>>
>>>>>>>>>>>> If a future hardware requires, or enables, the per-context
>>> behaviour
>>>>>>>>>>>> in a more efficient way, we could then revisit the problem
>> space.
>>>>>>>>>>>>
>>>>>>>>>>>> In the mean time I see the "or" policy solution as adding some
>>> ABI
>>>>>>>>>>>> which doesn't do anything for many use cases without any way
>>> for the
>>>>>>>>>>>> sysadmin to enable it. At the same time master sysfs knob at
>>> least
>>>>>>>>>>>> enables the sysadmin to make a decision. Here I am thinking
>>> about a
>>>>>>>>>>>> random client environment where not all userspace co-
>>> operates,
>>>>>>>>>>>> but for
>>>>>>>>>>>> instance user is running the feature aware media stack, and
>>>>>>>>>>>> non-feature aware OpenCL/3d stack.
>>>>>>>>>>>>
>>>>>>>>>>>> I guess the complete story boils down to - is the master sysfs
>>> knob
>>>>>>>>>>>> really a problem in container use cases.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Tvrtko
>>>>>>>>>>> Hey Tvrtko,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for summarizing a bunch of discussions.
>>>>>>>>>>> Essentially I agree with every you wrote above.
>>>>>>>>>>>
>>>>>>>>>>> If we have a global setting (determined by the OR policy), what's
>>> the
>>>>>>>>>>> point of per context settings?
>>>>>>>>>>>
>>>>>>>>>>> In Dmitry's scenario, all userspace applications will work
>>>>>>>>>>> together to
>>>>>>>>>>> reach the consensus so it sounds like we're reimplementing the
>>> policy
>>>>>>>>>>> that is already existing in userspace.
>>>>>>>>>>>
>>>>>>>>>>> Anyway, I'm implementing Joonas' suggestion. Hopefully
>>> somebody else
>>>>>>>>>>> than me pick one or the other :)
>>>>>>>>>> I'll just mention the voting/consensus approach to see if anyone
>>> else
>>>>>>>>>> likes it.
>>>>>>>>>>
>>>>>>>>>> Each context has a CONTEXT_PARAM_HINT_SSEU { small,
>>> dontcare, large }
>>>>>>>>>> (or some other abstract names).
>>>>>>>>> Yeah, the param name should have the word _HINT_ in it when it's
>>> not a
>>>>>>>>> definitive set.
>>>>>>>>>
>>>>>>>>> There's no global setter across containers, only a scenario when
>>>>>>>>> everyone agrees or not. Tallying up the votes and going with a
>>> majority
>>>>>>>>> vote might be an option, too.
>>>>>>>>>
>>>>>>>>> Regards, Joonas
>>>>>>>> Trying to test the "everyone agrees" approach here.
>>>>>>> It's not everyone agrees, but the greater good.
>>>>>>
>>>>>> I'm looking forward to the definition of the greater good :)
>>>>>> Tvrtko wanted to avoid the heuristic territory, it seems like we're just
>>>>>> stepping into it.
>>>>>
>>>>> I am not sure that "small, dontcare, large" models brings much. No one
>>>>> would probably set "dontcare" since we have to default new contexts
>> to
>>>>> large to be compatible.
>>>>>
>>>>> Don't know, I still prefer the master knob option. Honestly don't yet
>>>>> see the containers use case as a problem. There is always a master
>>>>> domain in any system where the knob can be enabled if the customers
>> on
>>>>> the system are such to warrant it. On mixed systems enable it or not
>>>>> someone always suffers. And with the knob we are free to add
>> heuristics
>>>>> later, keep the uapi, and just default the knob to on.
>>>>
>>>> Master knob effectively means dead code behind a switch, that's not
>> very
>>>> upstream friendly.
>>>
>>> Hey at least I wasn't proposing a modparam! :)))
>>>
>>> Yes it is not the best software practice, upstream or not, however I am
>>> trying to be pragmatical here and choose the simplest, smallest, good
>>> enough, and least headache inducing in the future solution.
>>>
>>> One way of of looking at the master switch could be like tune your
>>> system for XYZ - change CPU frequency governor, disable SATA link
>>> saving, allow i915 media optimized mode. Some live code out, some dead
>>> code in.
>>>
>>> But perhaps discussion is moot since we don't have userspace anyway.
>>>
>>> Regards,
>>>
>>> Tvrtko
>>
>> I'm surprised by the "no deadcode behind knobs comment". We do this all
>> the time - "display=0" anyone? Or "enable_cmdparser=false"?
>>
>> Allowing user space to reduce EU performance for the system as a whole
>> is not a great idea imho. Only sysadmin should have the right to let that
>> happen, and an admin switch (I WOULD go with a modparam personally) is
>> a good way to ensure that we can get the optimal media configuration for
>> dedicated media systems, while for general systems we get the best overall
>> performance (not just favouring media).
>>
>> Why would we want to over engineer this?
>>
>> Jon
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs
  2018-08-02 10:00                           ` Tvrtko Ursulin
@ 2018-08-02 19:24                             ` Rogozhkin, Dmitry V
  0 siblings, 0 replies; 29+ messages in thread
From: Rogozhkin, Dmitry V @ 2018-08-02 19:24 UTC (permalink / raw)
  To: Bloomfield, Jon, intel-gfx, joonas.lahtinen, Landwerlin,
	Lionel G, chris, tvrtko.ursulin

On Thu, 2018-08-02 at 11:00 +0100, Tvrtko Ursulin wrote:
> [Picking this point in the thread to reply on some points mentioned
> by 
> folks in the whole thread.]
> 
> I don't remember if any patches from Lionel's series actually had r-
> b's, 
> but a few people including myself have certainly been reviewing them.
> If 
> I had applied the final r-b I wouldn't have made much difference due 
> lack of userspace and disagreement on the DoS mitigation story. So
> to 
> say effectively put your money where your mouth is and review is not 
> entirely fair.
> 
> Suggestion to add a master sysfs switch was to alleviate the DoS 
> concerns because dynamic switching has a cost towards all GPU
> clients, 
> not that it just has potential to slow down media.
> 
> Suggestion was also that this switch becomes immutable and defaults
> to 
> "allow" on Gen11 onwards.
> 
> I preferred sysfs versus a modparam since it makes testing (Both for 
> developers and users (what config works better for my use case?)
> easier.)
> 
> I am fine with the suggestion to drive the Gen11 part first, which 
> removes the need for any of this.
> 
> Patch series is already (AFAIR) nicely split so only the last patch 
> needs to be dropped.
> 
> If at some point we decide to revisit the Gen8/9 story, we can 
> evaluate/measure whether any master switch is actually warranted. I 
> think Lionel did some micro-benchmarking which showed impact is not
> so 
> severe, so perhaps for real-world use cases it would be even less.
> 
> I can re-read the public patches and review them, or perhaps even
> adopt 
> them if they have been orphaned. Have to check with Francesco first 
> before I commit to the latter.

Thank you for volunteering:).
I talked with Lionel about that and he thinks he may be able to do a
respin next week. So, please, check with him also to avoid double work.

> 
> On the uAPI front interface looked fine to me.
> 
> One recent interesting development are Mesa patches posted to mesa-
> dev 
> (https://lists.freedesktop.org/archives/mesa-dev/2018-July/200607.htm
> l) 
> which add EGL_CONTEXT_load_type extension (low/medium/high). This
> would 
> need some sort of mapping between low/medium/high to more specific 
> settings but still seems okay to me.
> 
> This may bring another (open source?) user for the patches. Which
> Gen's 
> they are interested in is also a question.
> 
> Regards,
> 
> Tvrtko
> 
> On 24/07/2018 22:50, Bloomfield, Jon wrote:
> > Gratuitous top posting to re-kick the thread.
> > 
> > For Gen11 we can't have an on/off switch anyway (media simply won't
> > run
> > with an oncompatible slice config), so let's agree on an api to
> > allow userland
> > to select the slice configuration for its contexts, for Gen11 only.
> > I'd prefer a
> > simple {MediaConfig/GeneralConfig} type context param but I really
> > don't
> > care that much.
> > 
> > For gen9/10 it's arguable whether we need this at all. The effect
> > on media
> > workloads varies, but I'm guessing normal users (outside a
> > transcode
> > type environment) will never know they're missing anything. Either
> > way,
> > we can continue discussing while we progress the gen11 solution.
> > 
> > Jon
> > 
> > > -----Original Message-----
> > > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On
> > > Behalf Of
> > > Bloomfield, Jon
> > > Sent: Wednesday, July 18, 2018 9:44 AM
> > > To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>; Joonas
> > > Lahtinen
> > > <joonas.lahtinen@linux.intel.com>; Chris Wilson <chris@chris-wils
> > > on.co.uk>;
> > > Landwerlin, Lionel G <lionel.g.landwerlin@intel.com>; intel-
> > > gfx@lists.freedesktop.org
> > > Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs
> > > entry to let users
> > > set sseu configs
> > > 
> > > > -----Original Message-----
> > > > From: Intel-gfx <intel-gfx-bounces@lists.freedesktop.org> On
> > > > Behalf Of
> > > > Tvrtko Ursulin
> > > > Sent: Thursday, June 14, 2018 1:29 AM
> > > > To: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; Chris
> > > > Wilson
> > > > <chris@chris-wilson.co.uk>; Landwerlin, Lionel G
> > > > <lionel.g.landwerlin@intel.com>; intel-gfx@lists.freedesktop.or
> > > > g
> > > > Subject: Re: [Intel-gfx] [PATCH v9 7/7] drm/i915: add a sysfs
> > > > entry to let
> > > 
> > > users
> > > > set sseu configs
> > > > 
> > > > 
> > > > On 13/06/2018 13:49, Joonas Lahtinen wrote:
> > > > > Quoting Tvrtko Ursulin (2018-06-12 15:02:07)
> > > > > > 
> > > > > > On 12/06/2018 11:52, Lionel Landwerlin wrote:
> > > > > > > On 12/06/18 11:37, Chris Wilson wrote:
> > > > > > > > Quoting Lionel Landwerlin (2018-06-12 11:33:34)
> > > > > > > > > On 12/06/18 10:20, Joonas Lahtinen wrote:
> > > > > > > > > > Quoting Chris Wilson (2018-06-11 18:02:37)
> > > > > > > > > > > Quoting Lionel Landwerlin (2018-06-11 14:46:07)
> > > > > > > > > > > > On 11/06/18 13:10, Tvrtko Ursulin wrote:
> > > > > > > > > > > > > On 30/05/2018 15:33, Lionel Landwerlin wrote:
> > > > > > > > > > > > > > There are concerns about denial of service
> > > > > > > > > > > > > > around the per
> > > > > > > > > > > > > > context sseu
> > > > > > > > > > > > > > configuration capability. In a previous
> > > > > > > > > > > > > > commit introducing the
> > > > > > > > > > > > > > capability we allowed it only for capable
> > > > > > > > > > > > > > users. This changes
> > > > > > > > > > > > > > adds a
> > > > > > > > > > > > > > new debugfs entry to let any user configure
> > > > > > > > > > > > > > its own context
> > > > > > > > > > > > > > powergating setup.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > As far as I understood it, Joonas' concerns
> > > > > > > > > > > > > here are:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 1) That in the containers use case individual
> > > > > > > > > > > > > containers
> > > 
> > > wouldn't
> > > > be
> > > > > > > > > > > > > able to turn on the sysfs toggle for them.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 2) That also in the containers use case if
> > > > > > > > > > > > > box admin turns on
> > > 
> > > the
> > > > > > > > > > > > > feature, some containers would potentially
> > > > > > > > > > > > > start negatively
> > > > > > > > > > > > > affecting
> > > > > > > > > > > > > the others (via the accumulated cost of slice
> > > > > > > > > > > > > re-configuration
> > > 
> > > on
> > > > > > > > > > > > > context switching).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I am not familiar with typical container
> > > > > > > > > > > > > setups to be
> > > 
> > > authoritative
> > > > > > > > > > > > > here, but intuitively I find it reasonable
> > > > > > > > > > > > > that a low-level
> > > 
> > > hardware
> > > > > > > > > > > > > switch like this would be under the control
> > > > > > > > > > > > > of a master domain
> > > > > > > > > > > > > administrator. ("If you are installing our
> > > > > > > > > > > > > product in the
> > > 
> > > container
> > > > > > > > > > > > > environment, make sure your system
> > > > > > > > > > > > > administrator enables
> > > 
> > > this
> > > > > > > > > > > > > hardware
> > > > > > > > > > > > > feature.", "Note to system administrators:
> > > > > > > > > > > > > Enabling this
> > > 
> > > features
> > > > > > > > > > > > > may
> > > > > > > > > > > > > negatively affect the performance of other
> > > > > > > > > > > > > containers.")
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Alternative proposal is for the i915 to apply
> > > > > > > > > > > > > an "or" filter on all
> > > > > > > > > > > > > requested masks and in that way ensure
> > > > > > > > > > > > > dynamic re-
> > > > 
> > > > configuration
> > > > > > > > > > > > > doesn't happen on context switches, but
> > > > > > > > > > > > > driven from
> > > 
> > > userspace
> > > > via
> > > > > > > > > > > > > ioctls.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > In other words, should _all_ userspace agree
> > > > > > > > > > > > > between
> > > > 
> > > > themselves that
> > > > > > > > > > > > > they want to turn off a slice, they would
> > > > > > > > > > > > > then need to send out
> > > 
> > > a
> > > > > > > > > > > > > concerted ioctl storm, where number of needed
> > > > > > > > > > > > > ioctls equals
> > > > 
> > > > the
> > > > > > > > > > > > > number
> > > > > > > > > > > > > of currently active contexts. (This may have
> > > > > > > > > > > > > its own
> > > 
> > > performance
> > > > > > > > > > > > > consequences caused by the barriers needed to
> > > > > > > > > > > > > modify all
> > > > 
> > > > context
> > > > > > > > > > > > > images.)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > This was deemed acceptable the the media use
> > > > > > > > > > > > > case, but my
> > > > 
> > > > concern is
> > > > > > > > > > > > > the approach is not elegant and will tie us
> > > > > > > > > > > > > with the "or" policy
> > > 
> > > in
> > > > > > > > > > > > > the ABI. (Performance concerns I haven't
> > > > > > > > > > > > > evaluated yet, but
> > > > 
> > > > they
> > > > > > > > > > > > > also
> > > > > > > > > > > > > may be significant.)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > If we go back thinking about the containers
> > > > > > > > > > > > > use case, then it
> > > > > > > > > > > > > transpires that even though the "or" policy
> > > > > > > > > > > > > does prevent one
> > > > > > > > > > > > > container
> > > > > > > > > > > > > from affecting the other from one angle, it
> > > > > > > > > > > > > also prevents one
> > > > > > > > > > > > > container from exercising the feature unless
> > > > > > > > > > > > > all containers
> > > > > > > > > > > > > co-operate.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > As such, we can view the original problem
> > > > > > > > > > > > > statement where
> > > 
> > > we
> > > > have an
> > > > > > > > > > > > > issue if not everyone co-operates, as
> > > > > > > > > > > > > conceptually the same
> > > 
> > > just
> > > > > > > > > > > > > from
> > > > > > > > > > > > > an opposite angle. (Rather than one container
> > > > > > > > > > > > > incurring the
> > > > > > > > > > > > > increased
> > > > > > > > > > > > > cost of context switches to the rest, we
> > > > > > > > > > > > > would have one
> > > > 
> > > > container
> > > > > > > > > > > > > preventing the optimized slice configuration
> > > > > > > > > > > > > to the other.)
> > > > > > > > > > > > > 
> > > > > > > > > > > > >     From this follows that both proposals
> > > > > > > > > > > > > require complete
> > > > > > > > > > > > > co-operation
> > > > > > > > > > > > > from all running userspace to avoid complete
> > > > > > > > > > > > > control of the
> > > > 
> > > > feature.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Since the balance between the benefit of
> > > > > > > > > > > > > optimized slice
> > > > > > > > > > > > > configuration
> > > > > > > > > > > > > (or penalty of suboptimal one), versus the
> > > > > > > > > > > > > penalty of increased
> > > > > > > > > > > > > context switch times, cannot be know by the
> > > > > > > > > > > > > driver (barring
> > > > > > > > > > > > > venturing
> > > > > > > > > > > > > into the heuristics territory), that is
> > > > > > > > > > > > > another reason why I find
> > > > > > > > > > > > > the
> > > > > > > > > > > > > "or" policy in the driver questionable.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > We can also ask a question of - If we go with
> > > > > > > > > > > > > the "or" policy,
> > > 
> > > why
> > > > > > > > > > > > > require N per-context ioctls to modify the
> > > > > > > > > > > > > global GPU
> > > > 
> > > > configuration
> > > > > > > > > > > > > and not instead add a global driver ioctl to
> > > > > > > > > > > > > modify the state?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > If a future hardware requires, or enables,
> > > > > > > > > > > > > the per-context
> > > > 
> > > > behaviour
> > > > > > > > > > > > > in a more efficient way, we could then
> > > > > > > > > > > > > revisit the problem
> > > 
> > > space.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > In the mean time I see the "or" policy
> > > > > > > > > > > > > solution as adding some
> > > > 
> > > > ABI
> > > > > > > > > > > > > which doesn't do anything for many use cases
> > > > > > > > > > > > > without any way
> > > > 
> > > > for the
> > > > > > > > > > > > > sysadmin to enable it. At the same time
> > > > > > > > > > > > > master sysfs knob at
> > > > 
> > > > least
> > > > > > > > > > > > > enables the sysadmin to make a decision. Here
> > > > > > > > > > > > > I am thinking
> > > > 
> > > > about a
> > > > > > > > > > > > > random client environment where not all
> > > > > > > > > > > > > userspace co-
> > > > 
> > > > operates,
> > > > > > > > > > > > > but for
> > > > > > > > > > > > > instance user is running the feature aware
> > > > > > > > > > > > > media stack, and
> > > > > > > > > > > > > non-feature aware OpenCL/3d stack.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I guess the complete story boils down to - is
> > > > > > > > > > > > > the master sysfs
> > > > 
> > > > knob
> > > > > > > > > > > > > really a problem in container use cases.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Tvrtko
> > > > > > > > > > > > 
> > > > > > > > > > > > Hey Tvrtko,
> > > > > > > > > > > > 
> > > > > > > > > > > > Thanks for summarizing a bunch of discussions.
> > > > > > > > > > > > Essentially I agree with every you wrote above.
> > > > > > > > > > > > 
> > > > > > > > > > > > If we have a global setting (determined by the
> > > > > > > > > > > > OR policy), what's
> > > > 
> > > > the
> > > > > > > > > > > > point of per context settings?
> > > > > > > > > > > > 
> > > > > > > > > > > > In Dmitry's scenario, all userspace
> > > > > > > > > > > > applications will work
> > > > > > > > > > > > together to
> > > > > > > > > > > > reach the consensus so it sounds like we're
> > > > > > > > > > > > reimplementing the
> > > > 
> > > > policy
> > > > > > > > > > > > that is already existing in userspace.
> > > > > > > > > > > > 
> > > > > > > > > > > > Anyway, I'm implementing Joonas' suggestion.
> > > > > > > > > > > > Hopefully
> > > > 
> > > > somebody else
> > > > > > > > > > > > than me pick one or the other :)
> > > > > > > > > > > 
> > > > > > > > > > > I'll just mention the voting/consensus approach
> > > > > > > > > > > to see if anyone
> > > > 
> > > > else
> > > > > > > > > > > likes it.
> > > > > > > > > > > 
> > > > > > > > > > > Each context has a CONTEXT_PARAM_HINT_SSEU {
> > > > > > > > > > > small,
> > > > 
> > > > dontcare, large }
> > > > > > > > > > > (or some other abstract names).
> > > > > > > > > > 
> > > > > > > > > > Yeah, the param name should have the word _HINT_ in
> > > > > > > > > > it when it's
> > > > 
> > > > not a
> > > > > > > > > > definitive set.
> > > > > > > > > > 
> > > > > > > > > > There's no global setter across containers, only a
> > > > > > > > > > scenario when
> > > > > > > > > > everyone agrees or not. Tallying up the votes and
> > > > > > > > > > going with a
> > > > 
> > > > majority
> > > > > > > > > > vote might be an option, too.
> > > > > > > > > > 
> > > > > > > > > > Regards, Joonas
> > > > > > > > > 
> > > > > > > > > Trying to test the "everyone agrees" approach here.
> > > > > > > > 
> > > > > > > > It's not everyone agrees, but the greater good.
> > > > > > > 
> > > > > > > I'm looking forward to the definition of the greater good
> > > > > > > :)
> > > > > > > Tvrtko wanted to avoid the heuristic territory, it seems
> > > > > > > like we're just
> > > > > > > stepping into it.
> > > > > > 
> > > > > > I am not sure that "small, dontcare, large" models brings
> > > > > > much. No one
> > > > > > would probably set "dontcare" since we have to default new
> > > > > > contexts
> > > 
> > > to
> > > > > > large to be compatible.
> > > > > > 
> > > > > > Don't know, I still prefer the master knob option. Honestly
> > > > > > don't yet
> > > > > > see the containers use case as a problem. There is always a
> > > > > > master
> > > > > > domain in any system where the knob can be enabled if the
> > > > > > customers
> > > 
> > > on
> > > > > > the system are such to warrant it. On mixed systems enable
> > > > > > it or not
> > > > > > someone always suffers. And with the knob we are free to
> > > > > > add
> > > 
> > > heuristics
> > > > > > later, keep the uapi, and just default the knob to on.
> > > > > 
> > > > > Master knob effectively means dead code behind a switch,
> > > > > that's not
> > > 
> > > very
> > > > > upstream friendly.
> > > > 
> > > > Hey at least I wasn't proposing a modparam! :)))
> > > > 
> > > > Yes it is not the best software practice, upstream or not,
> > > > however I am
> > > > trying to be pragmatical here and choose the simplest,
> > > > smallest, good
> > > > enough, and least headache inducing in the future solution.
> > > > 
> > > > One way of of looking at the master switch could be like tune
> > > > your
> > > > system for XYZ - change CPU frequency governor, disable SATA
> > > > link
> > > > saving, allow i915 media optimized mode. Some live code out,
> > > > some dead
> > > > code in.
> > > > 
> > > > But perhaps discussion is moot since we don't have userspace
> > > > anyway.
> > > > 
> > > > Regards,
> > > > 
> > > > Tvrtko
> > > 
> > > I'm surprised by the "no deadcode behind knobs comment". We do
> > > this all
> > > the time - "display=0" anyone? Or "enable_cmdparser=false"?
> > > 
> > > Allowing user space to reduce EU performance for the system as a
> > > whole
> > > is not a great idea imho. Only sysadmin should have the right to
> > > let that
> > > happen, and an admin switch (I WOULD go with a modparam
> > > personally) is
> > > a good way to ensure that we can get the optimal media
> > > configuration for
> > > dedicated media systems, while for general systems we get the
> > > best overall
> > > performance (not just favouring media).
> > > 
> > > Why would we want to over engineer this?
> > > 
> > > Jon
> > > _______________________________________________
> > > Intel-gfx mailing list
> > > Intel-gfx@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2018-08-02 19:24 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-05-30 14:33 [PATCH v9 0/7] drm/i915: per context slice/subslice powergating Lionel Landwerlin
2018-05-30 14:33 ` [PATCH v9 1/7] drm/i915: Program RPCS for Broadwell Lionel Landwerlin
2018-05-30 14:33 ` [PATCH v9 2/7] drm/i915: Record the sseu configuration per-context & engine Lionel Landwerlin
2018-05-30 14:33 ` [PATCH v9 3/7] drm/i915/perf: simplify configure all context function Lionel Landwerlin
2018-05-30 14:33 ` [PATCH v9 4/7] drm/i915/perf: reuse intel_lrc ctx regs macro Lionel Landwerlin
2018-05-30 14:33 ` [PATCH v9 5/7] drm/i915/perf: lock powergating configuration to default when active Lionel Landwerlin
2018-05-30 14:33 ` [PATCH v9 6/7] drm/i915: Expose RPCS (SSEU) configuration to userspace Lionel Landwerlin
2018-05-30 14:33 ` [PATCH v9 7/7] drm/i915: add a sysfs entry to let users set sseu configs Lionel Landwerlin
2018-06-11 12:10   ` Tvrtko Ursulin
2018-06-11 13:46     ` Lionel Landwerlin
2018-06-11 15:02       ` Chris Wilson
2018-06-12  9:20         ` Joonas Lahtinen
2018-06-12 10:33           ` Lionel Landwerlin
2018-06-12 10:37             ` Chris Wilson
2018-06-12 10:52               ` Lionel Landwerlin
2018-06-12 12:02                 ` Tvrtko Ursulin
2018-06-13 12:49                   ` Joonas Lahtinen
2018-06-14  8:28                     ` Tvrtko Ursulin
2018-07-18 16:44                       ` Bloomfield, Jon
2018-07-19  7:04                         ` Joonas Lahtinen
2018-07-24 21:50                         ` Bloomfield, Jon
2018-07-30 20:40                           ` Rogozhkin, Dmitry V
2018-08-02 10:00                           ` Tvrtko Ursulin
2018-08-02 19:24                             ` Rogozhkin, Dmitry V
2018-06-12 12:02                 ` Chris Wilson
2018-05-30 16:24 ` ✗ Fi.CI.CHECKPATCH: warning for drm/i915: per context slice/subslice powergating (rev8) Patchwork
2018-05-30 16:26 ` ✗ Fi.CI.SPARSE: " Patchwork
2018-05-30 16:40 ` ✓ Fi.CI.BAT: success " Patchwork
2018-05-30 17:30 ` ✗ Fi.CI.IGT: failure " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.