All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
@ 2019-03-14  8:36 Ankit Navik
  2019-03-14  8:36 ` [PATCH v4 1/3] drm/i915: Get active pending request for given context Ankit Navik
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Ankit Navik @ 2019-03-14  8:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: ankit.p.navik

Current GPU configuration code for i915 does not allow us to change
EU/Slice/Sub-slice configuration dynamically. Its done only once while context
is created.

While particular graphics application is running, if we examine the command
requests from user space, we observe that command density is not consistent.
It means there is scope to change the graphics configuration dynamically even
while context is running actively. This patch series proposes the solution to
find the active pending load for all active context at given time and based on
that, dynamically perform graphics configuration for each context.

We use a hr (high resolution) timer with i915 driver in kernel to get a
callback every few milliseconds (this timer value can be configured through
debugfs, default is '0' indicating timer is in disabled state i.e. original
system without any intervention).In the timer callback, we examine pending
commands for a context in the queue, essentially, we intercept them before
they are executed by GPU and we update context with required number of EUs.

Two questions, how did we arrive at right timer value? and what's the right
number of EUs? For the prior one, empirical data to achieve best performance
in least power was considered. For the later one, we roughly categorized number 
of EUs logically based on platform. Now we compare number of pending commands
with a particular threshold and then set number of EUs accordingly with update
context. That threshold is also based on experiments & findings. If GPU is able
to catch up with CPU, typically there are no pending commands, the EU config
would remain unchanged there. In case there are more pending commands we
reprogram context with higher number of EUs. Please note, here we are changing
EUs even while context is running by examining pending commands every 'x'
milliseconds.

With this solution in place, on KBL-GT3 + Android and Ubuntu (Linux) we saw
following pnp benefits, power numbers mentioned here are system power.

App /KPI               | % Power |  % Power |
                       | Benefit |  Benefit |
                       |  (mW)   |   (mW)   |
                       | Android |  Ubuntu  |
---------------------------------|----------|
3D Mark (Ice storm)    | 2.30%   |  N.A.    |
TRex On screen         | 2.49%   |  2.97%   |
TRex Off screen        | 1.32%   |  N.A.    |
Manhattan On screen    | 3.11%   |  4.90%   |
Manhattan Off screen   | 0.89%   |  N.A.    |
AnTuTu  6.1.4          | 3.42%   |  N.A.    |
SynMark2               | N.A.    |  1.70%   |

Note - For KBL (GEN9) we cannot control at sub-slice level, it was always  a
constraint.
We always controlled number of EUs rather than sub-slices/slices.
We have also observed GPU core residencies improves by 1.035%.

Praveen Diwakar (3):
  drm/i915: Get active pending request for given context
  drm/i915: set optimum eu/slice/sub-slice configuration based on load
    type
  drm/i915: Predictive governor to control eu/slice/subslice

 drivers/gpu/drm/i915/Makefile              |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c        |   1 +
 drivers/gpu/drm/i915/i915_drv.h            |   9 +++
 drivers/gpu/drm/i915/i915_gem.c            |   3 +
 drivers/gpu/drm/i915/i915_gem_context.c    |  21 ++++++
 drivers/gpu/drm/i915/i915_gem_context.h    |  39 +++++++++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   2 +
 drivers/gpu/drm/i915/i915_params.c         |   4 ++
 drivers/gpu/drm/i915/i915_params.h         |   1 +
 drivers/gpu/drm/i915/intel_deu.c           | 109 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_device_info.c   |  47 ++++++++++++-
 drivers/gpu/drm/i915/intel_drv.h           |   3 +
 drivers/gpu/drm/i915/intel_lrc.c           |  48 ++++++++++++-
 13 files changed, 285 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_deu.c

-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4 1/3] drm/i915: Get active pending request for given context
  2019-03-14  8:36 [PATCH v4 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
@ 2019-03-14  8:36 ` Ankit Navik
  2019-03-14  8:55   ` Chris Wilson
  2019-03-14 10:39   ` Tvrtko Ursulin
  2019-03-14  8:36 ` [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Ankit Navik @ 2019-03-14  8:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: ankit.p.navik

From: Praveen Diwakar <praveen.diwakar@intel.com>

This patch gives us the active pending request count which is yet
to be submitted to the GPU

V2:
 * Change 64-bit to atomic for request count. (Tvrtko Ursulin)

V3:
 * Remove mutex for request count.
 * Rebase.
 * Fixes hitting underflow for predictive request. (Tvrtko Ursulin)

V4:
 * Rebase.

Cc: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Cc: Kedar J Karanje <kedar.j.karanje@intel.com>
Cc: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c    | 1 +
 drivers/gpu/drm/i915/i915_gem_context.h    | 5 +++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 ++
 drivers/gpu/drm/i915/intel_lrc.c           | 3 +++
 4 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 280813a..a5876fe 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -453,6 +453,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 	}
 
 	trace_i915_context_create(ctx);
+	atomic_set(&ctx->req_cnt, 0);
 
 	return ctx;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index ca150a7..c940168 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -227,6 +227,11 @@ struct i915_gem_context {
 	 * context close.
 	 */
 	struct list_head handles_list;
+
+	/** req_cnt: tracks the pending commands, based on which we decide to
+	 * go for low/medium/high load configuration of the GPU.
+	 */
+	atomic_t req_cnt;
 };
 
 static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 02adcaf..a38963b 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2475,6 +2475,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	 */
 	eb.request->batch = eb.batch;
 
+	atomic_inc(&eb.ctx->req_cnt);
+
 	trace_i915_request_queue(eb.request, eb.batch_flags);
 	err = eb_submit(&eb);
 err_request:
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 34a0866..d0af37d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -780,6 +780,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 			last = rq;
 			submit = true;
+
+			if (atomic_read(&rq->gem_context->req_cnt) > 0)
+				atomic_dec(&rq->gem_context->req_cnt);
 		}
 
 		rb_erase_cached(&p->node, &execlists->queue);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2019-03-14  8:36 [PATCH v4 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
  2019-03-14  8:36 ` [PATCH v4 1/3] drm/i915: Get active pending request for given context Ankit Navik
@ 2019-03-14  8:36 ` Ankit Navik
  2019-03-14 10:45   ` Tvrtko Ursulin
                     ` (3 more replies)
  2019-03-14  8:36 ` [PATCH v4 3/3] drm/i915: Predictive governor to control eu/slice/subslice Ankit Navik
  2019-03-14  9:35 ` ✗ Fi.CI.BAT: failure for drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Patchwork
  3 siblings, 4 replies; 12+ messages in thread
From: Ankit Navik @ 2019-03-14  8:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: ankit.p.navik

From: Praveen Diwakar <praveen.diwakar@intel.com>

This patch will select optimum eu/slice/sub-slice configuration based on
type of load (low, medium, high) as input.
Based on our readings and experiments we have predefined set of optimum
configuration for each platform(CHT, KBL).
i915_gem_context_set_load_type will select optimum configuration from
pre-defined optimum configuration table(opt_config).

It also introduce flag update_render_config which can set by any governor.

v2:
 * Move static optimum_config to device init time.
 * Rename function to appropriate name, fix data types and patch ordering.
 * Rename prev_load_type to pending_load_type. (Tvrtko Ursulin)

v3:
 * Add safe guard check in i915_gem_context_set_load_type.
 * Rename struct from optimum_config to i915_sseu_optimum_config to
   avoid namespace clashes.
 * Reduces memcpy for space efficient.
 * Rebase.
 * Improved commit message. (Tvrtko Ursulin)

v4:
 * Move optimum config table to file scope. (Tvrtko Ursulin)

Cc: Kedar J Karanje <kedar.j.karanje@intel.com>
Cc: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h          |  5 ++++
 drivers/gpu/drm/i915/i915_gem_context.c  | 20 ++++++++++++++
 drivers/gpu/drm/i915/i915_gem_context.h  | 34 +++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_device_info.c | 47 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_lrc.c         | 45 +++++++++++++++++++++++++++++-
 5 files changed, 148 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5c8d048..97cb36b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1593,6 +1593,11 @@ struct drm_i915_private {
 	struct drm_i915_fence_reg fence_regs[I915_MAX_NUM_FENCES]; /* assume 965 */
 	int num_fence_regs; /* 8 on pre-965, 16 otherwise */
 
+	/* optimal slice/subslice/EU configration state */
+	struct i915_sseu_optimum_config *opt_config;
+
+	int predictive_load_enable;
+
 	unsigned int fsb_freq, mem_freq, is_ddr3;
 	unsigned int skl_preferred_vco_freq;
 	unsigned int max_cdclk_freq;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a5876fe..8f16ef1 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -454,10 +454,30 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 
 	trace_i915_context_create(ctx);
 	atomic_set(&ctx->req_cnt, 0);
+	ctx->slice_cnt = hweight8(RUNTIME_INFO(dev_priv)->sseu.slice_mask);
+	ctx->subslice_cnt = hweight8(
+			RUNTIME_INFO(dev_priv)->sseu.subslice_mask[0]);
+	ctx->eu_cnt = RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
 
 	return ctx;
 }
 
+
+void i915_gem_context_set_load_type(struct i915_gem_context *ctx,
+		enum gem_load_type type)
+{
+	struct drm_i915_private *dev_priv = ctx->i915;
+
+	if (GEM_WARN_ON(type > LOAD_TYPE_LAST))
+		return;
+
+	/* Call opt_config to get correct configuration for eu,slice,subslice */
+	ctx->slice_cnt = dev_priv->opt_config[type].slice;
+	ctx->subslice_cnt = dev_priv->opt_config[type].subslice;
+	ctx->eu_cnt = dev_priv->opt_config[type].eu;
+	ctx->pending_load_type = type;
+}
+
 /**
  * i915_gem_context_create_gvt - create a GVT GEM context
  * @dev: drm device *
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index c940168..0a24d28 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -54,6 +54,19 @@ struct intel_context_ops {
 	void (*destroy)(struct intel_context *ce);
 };
 
+enum gem_load_type {
+	LOAD_TYPE_LOW,
+	LOAD_TYPE_MEDIUM,
+	LOAD_TYPE_HIGH,
+	LOAD_TYPE_LAST
+};
+
+struct i915_sseu_optimum_config {
+	u8 slice;
+	u8 subslice;
+	u8 eu;
+};
+
 /*
  * Powergating configuration for a particular (context,engine).
  */
@@ -232,6 +245,25 @@ struct i915_gem_context {
 	 * go for low/medium/high load configuration of the GPU.
 	 */
 	atomic_t req_cnt;
+
+	/** slice_cnt: used to set the # of slices to be enabled. */
+	u8 slice_cnt;
+
+	/** subslice_cnt: used to set the # of subslices to be enabled. */
+	u8 subslice_cnt;
+
+	/** eu_cnt: used to set the # of eu to be enabled. */
+	u8 eu_cnt;
+
+	/** load_type: The designated load_type (high/medium/low) for a given
+	 * number of pending commands in the command queue.
+	 */
+	enum gem_load_type load_type;
+
+	/** pending_load_type: The earlier load type that the GPU was configured
+	 * for (high/medium/low).
+	 */
+	enum gem_load_type pending_load_type;
 };
 
 static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
@@ -375,6 +407,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
+void i915_gem_context_set_load_type(struct i915_gem_context *ctx,
+		enum gem_load_type type);
 
 struct i915_gem_context *
 i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 855a507..017a1e2 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -707,6 +707,27 @@ static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
 	return 0;
 }
 
+/* static table of slice/subslice/EU for Cherryview */
+static const struct i915_sseu_optimum_config chv_config[LOAD_TYPE_LAST] = {
+	{1, 1, 4},	/* Low */
+	{1, 1, 6},	/* Medium */
+	{1, 2, 6}	/* High */
+};
+
+/* static table of slice/subslice/EU for KBL GT2 */
+static const struct i915_sseu_optimum_config kbl_gt2_config[LOAD_TYPE_LAST] = {
+	{1, 3, 2},	/* Low */
+	{1, 3, 4},	/* Medium */
+	{1, 3, 8}	/* High */
+};
+
+/* static table of slice/subslice/EU for KBL GT3 */
+static const struct i915_sseu_optimum_config kbl_gt3_config[LOAD_TYPE_LAST] = {
+	{2, 3, 4},	/* Low */
+	{2, 3, 6},	/* Medium */
+	{2, 3, 8}	/* High */
+};
+
 /**
  * intel_device_info_runtime_init - initialize runtime info
  * @dev_priv: the i915 device
@@ -728,6 +749,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 	struct intel_device_info *info = mkwrite_device_info(dev_priv);
 	struct intel_runtime_info *runtime = RUNTIME_INFO(dev_priv);
 	enum pipe pipe;
+	struct i915_sseu_optimum_config *opt_config = NULL;
 
 	if (INTEL_GEN(dev_priv) >= 10) {
 		for_each_pipe(dev_priv, pipe)
@@ -831,12 +853,30 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 	/* Initialize slice/subslice/EU info */
 	if (IS_HASWELL(dev_priv))
 		haswell_sseu_info_init(dev_priv);
-	else if (IS_CHERRYVIEW(dev_priv))
+	else if (IS_CHERRYVIEW(dev_priv)) {
 		cherryview_sseu_info_init(dev_priv);
+		opt_config = chv_config;
+		BUILD_BUG_ON(ARRAY_SIZE(chv_config) != LOAD_TYPE_LAST);
+	}
 	else if (IS_BROADWELL(dev_priv))
 		broadwell_sseu_info_init(dev_priv);
-	else if (IS_GEN(dev_priv, 9))
+	else if (IS_GEN(dev_priv, 9)) {
 		gen9_sseu_info_init(dev_priv);
+
+		switch (info->gt) {
+		default: /* fall through */
+		case 2:
+			opt_config = kbl_gt2_config;
+			BUILD_BUG_ON(ARRAY_SIZE(kbl_gt2_config)
+						!= LOAD_TYPE_LAST);
+		break;
+		case 3:
+			opt_config = kbl_gt3_config;
+			BUILD_BUG_ON(ARRAY_SIZE(kbl_gt3_config)
+						!= LOAD_TYPE_LAST);
+		break;
+		}
+	}
 	else if (IS_GEN(dev_priv, 10))
 		gen10_sseu_info_init(dev_priv);
 	else if (INTEL_GEN(dev_priv) >= 11)
@@ -847,6 +887,9 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 		info->ppgtt = INTEL_PPGTT_NONE;
 	}
 
+	if (opt_config)
+		dev_priv->opt_config = opt_config;
+
 	/* Initialize command stream timestamp frequency */
 	runtime->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
 }
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d0af37d..397af1e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1282,6 +1282,35 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
 	return i915_vma_pin(vma, 0, 0, flags);
 }
 
+static u32
+get_context_rpcs_config(struct i915_gem_context *ctx)
+{
+	u32 rpcs = 0;
+	struct drm_i915_private *dev_priv = ctx->i915;
+
+	if (INTEL_GEN(dev_priv) < 8)
+		return 0;
+
+	if (RUNTIME_INFO(dev_priv)->sseu.has_slice_pg) {
+		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
+		rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	if (RUNTIME_INFO(dev_priv)->sseu.has_subslice_pg) {
+		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
+		rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	if (RUNTIME_INFO(dev_priv)->sseu.has_eu_pg) {
+		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
+		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	return rpcs;
+}
 static void
 __execlists_update_reg_state(struct intel_engine_cs *engine,
 			     struct intel_context *ce)
@@ -1294,9 +1323,20 @@ __execlists_update_reg_state(struct intel_engine_cs *engine,
 	regs[CTX_RING_TAIL + 1] = ring->tail;
 
 	/* RPCS */
-	if (engine->class == RENDER_CLASS)
+	if (engine->class == RENDER_CLASS &&
+				engine->i915->predictive_load_enable) {
+		u32 rpcs_config = 0;
+		struct i915_gem_context *ctx = ce->gem_context;
+
+		rpcs_config = get_context_rpcs_config(ctx);
+		regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
+		CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
+						rpcs_config);
+
+	} else if (engine->class == RENDER_CLASS) {
 		regs[CTX_R_PWR_CLK_STATE + 1] = gen8_make_rpcs(engine->i915,
 							       &ce->sseu);
+	}
 }
 
 static struct intel_context *
@@ -1340,6 +1380,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
 
 	__execlists_update_reg_state(engine, ce);
 
+	if (ctx->load_type != ctx->pending_load_type)
+		ctx->load_type = ctx->pending_load_type;
+
 	ce->state->obj->pin_global++;
 	i915_gem_context_get(ctx);
 	return ce;
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 3/3] drm/i915: Predictive governor to control eu/slice/subslice
  2019-03-14  8:36 [PATCH v4 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
  2019-03-14  8:36 ` [PATCH v4 1/3] drm/i915: Get active pending request for given context Ankit Navik
  2019-03-14  8:36 ` [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
@ 2019-03-14  8:36 ` Ankit Navik
  2019-03-14 10:58   ` Tvrtko Ursulin
  2019-03-14  9:35 ` ✗ Fi.CI.BAT: failure for drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Patchwork
  3 siblings, 1 reply; 12+ messages in thread
From: Ankit Navik @ 2019-03-14  8:36 UTC (permalink / raw)
  To: intel-gfx; +Cc: ankit.p.navik

From: Praveen Diwakar <praveen.diwakar@intel.com>

High resolution timer is used for predictive governor to control
eu/slice/subslice based on workloads.

Debugfs is provided to enable/disable/update timer configuration

V2:
 * Fix code style.
 * Move predictive_load_timer into a drm_i915_private
   structure.
 * Make generic function to set optimum config. (Tvrtko Ursulin)

V3:
 * Rebase.
 * Fix race condition for predictive load set.
 * Add slack to start hrtimer for more power efficient. (Tvrtko Ursulin)

V4:
 * Fix data type and initialization of mutex to protect predictive load state.
 * Move predictive timer init to i915_gem_init_early. (Tvrtko Ursulin)
 * Move debugfs to kernel parameter.

Cc: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Cc: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/Makefile       |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c |   1 +
 drivers/gpu/drm/i915/i915_drv.h     |   4 ++
 drivers/gpu/drm/i915/i915_gem.c     |   3 +
 drivers/gpu/drm/i915/i915_params.c  |   4 ++
 drivers/gpu/drm/i915/i915_params.h  |   1 +
 drivers/gpu/drm/i915/intel_deu.c    | 109 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_drv.h    |   3 +
 8 files changed, 126 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/intel_deu.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 1787e12..b9eaaed 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -119,6 +119,7 @@ i915-y += intel_audio.o \
 	  intel_color.o \
 	  intel_combo_phy.o \
 	  intel_connector.o \
+	  intel_deu.o \
 	  intel_display.o \
 	  intel_dpio_phy.o \
 	  intel_dpll_mgr.o \
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index ca8fa44..1b8f631 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4612,6 +4612,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_drrs_status", i915_drrs_status, 0},
 	{"i915_rps_boost_info", i915_rps_boost_info, 0},
 };
+
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
 static const struct i915_debugfs_files {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 97cb36b..dfe80d9 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1596,6 +1596,9 @@ struct drm_i915_private {
 	/* optimal slice/subslice/EU configration state */
 	struct i915_sseu_optimum_config *opt_config;
 
+	/* protects predictive load state */
+	struct mutex pred_mutex;
+	struct hrtimer pred_timer;
 	int predictive_load_enable;
 
 	unsigned int fsb_freq, mem_freq, is_ddr3;
@@ -2646,6 +2649,7 @@ extern unsigned long i915_chipset_val(struct drm_i915_private *dev_priv);
 extern unsigned long i915_mch_val(struct drm_i915_private *dev_priv);
 extern unsigned long i915_gfx_val(struct drm_i915_private *dev_priv);
 extern void i915_update_gfx_val(struct drm_i915_private *dev_priv);
+extern enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer);
 int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
 
 int intel_engines_init_mmio(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5c1b9d4..18c5195 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5267,6 +5267,9 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
 
 	spin_lock_init(&dev_priv->fb_tracking.lock);
 
+	/* Dynamic EU timer initialization for predictive load */
+	intel_deu_init(dev_priv);
+
 	err = i915_gemfs_init(dev_priv);
 	if (err)
 		DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err);
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index b5be0ab..7857e92 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -97,6 +97,10 @@ i915_param_named_unsafe(disable_power_well, int, 0400,
 
 i915_param_named_unsafe(enable_ips, int, 0600, "Enable IPS (default: true)");
 
+i915_param_named_unsafe(deu_timer_enable, int, 0600,
+	"Enable dynamic EU control for power savings "
+	"(0=disable deu predictive timer [default], 150=optimal deu predictive timer)");
+
 i915_param_named(fastboot, int, 0600,
 	"Try to skip unnecessary mode sets at boot time "
 	"(0=disabled, 1=enabled) "
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 3f14e98..c51f56b 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -54,6 +54,7 @@ struct drm_printer;
 	param(int, disable_power_well, -1) \
 	param(int, enable_ips, 1) \
 	param(int, invert_brightness, 0) \
+	param(int, deu_timer_enable, 0) \
 	param(int, enable_guc, 0) \
 	param(int, guc_log_level, -1) \
 	param(char *, guc_firmware_path, NULL) \
diff --git a/drivers/gpu/drm/i915/intel_deu.c b/drivers/gpu/drm/i915/intel_deu.c
new file mode 100644
index 0000000..34654d5
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_deu.c
@@ -0,0 +1,109 @@
+/*
+ * Copyright © 2019 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Ankit Navik <ankit.p.navik@intel.com>
+ */
+
+/**
+ * DOC: Dynamic EU Control (DEUC)
+ *
+ * DEUC tries to re-configure EU allocation during runtime by predictive load
+ * calculation of command queue to gain power saving.
+ * It is transparent to user space and completely handled in the kernel.
+ */
+
+#include "intel_drv.h"
+#include "i915_drv.h"
+
+/*
+ * Anything above threshold is considered as HIGH load, less is considered
+ * as LOW load and equal is considered as MEDIUM load.
+ *
+ * The threshold value of three active requests pending.
+ */
+#define PENDING_THRESHOLD_MEDIUM 3
+
+#define SLACK_TIMER_NSEC 1000000 /* Timer range in nano second */
+
+enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer)
+{
+	struct drm_i915_private *dev_priv =
+			container_of(hrtimer, typeof(*dev_priv), pred_timer);
+	struct i915_gem_context *ctx;
+	enum gem_load_type load_type;
+	unsigned int req_pending;
+
+	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
+
+		req_pending = atomic_read(&ctx->req_cnt);
+
+		/*
+		 * Transitioning to low state whenever pending request is zero
+		 * would cause vacillation between low and high state.
+		 */
+		if (req_pending == 0)
+			continue;
+
+		if (req_pending > PENDING_THRESHOLD_MEDIUM)
+			load_type = LOAD_TYPE_HIGH;
+		else if (req_pending == PENDING_THRESHOLD_MEDIUM)
+			load_type = LOAD_TYPE_MEDIUM;
+		else
+			load_type = LOAD_TYPE_LOW;
+
+		i915_gem_context_set_load_type(ctx, load_type);
+	}
+
+	hrtimer_forward_now(hrtimer,
+			ms_to_ktime(dev_priv->predictive_load_enable));
+
+	return HRTIMER_RESTART;
+}
+
+/**
+ * intel_deu_init - Initialize dynamic EU
+ * @dev_priv: i915 device instance
+ *
+ * This fuction is called at driver load
+ */
+void intel_deu_init(struct drm_i915_private *dev_priv)
+{
+	dev_priv->predictive_load_enable = i915_modparams.deu_timer_enable;
+	hrtimer_init(&dev_priv->pred_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	dev_priv->pred_timer.function = predictive_load_cb;
+	mutex_init(&dev_priv->pred_mutex);
+
+	mutex_lock(&dev_priv->pred_mutex);
+
+	if (dev_priv->predictive_load_enable) {
+		if (!hrtimer_active(&dev_priv->pred_timer))
+			hrtimer_start_range_ns(&dev_priv->pred_timer,
+			ms_to_ktime(dev_priv->predictive_load_enable),
+			SLACK_TIMER_NSEC,
+			HRTIMER_MODE_REL_PINNED);
+	} else {
+		hrtimer_cancel(&dev_priv->pred_timer);
+	}
+
+	mutex_unlock(&dev_priv->pred_mutex);
+}
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index b4e29b8..8e6c1da 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -1990,6 +1990,9 @@ void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv);
 void intel_fbc_handle_fifo_underrun_irq(struct drm_i915_private *dev_priv);
 int intel_fbc_reset_underrun(struct drm_i915_private *dev_priv);
 
+/* intel_deu.c */
+void intel_deu_init(struct drm_i915_private *dev_priv);
+
 /* intel_hdmi.c */
 void intel_hdmi_init(struct drm_i915_private *dev_priv, i915_reg_t hdmi_reg,
 		     enum port port);
-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/3] drm/i915: Get active pending request for given context
  2019-03-14  8:36 ` [PATCH v4 1/3] drm/i915: Get active pending request for given context Ankit Navik
@ 2019-03-14  8:55   ` Chris Wilson
  2019-03-14 10:39   ` Tvrtko Ursulin
  1 sibling, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2019-03-14  8:55 UTC (permalink / raw)
  To: intel-gfx; +Cc: ankit.p.navik

Quoting Ankit Navik (2019-03-14 08:36:53)
>  static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 02adcaf..a38963b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2475,6 +2475,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>          */
>         eb.request->batch = eb.batch;
>  
> +       atomic_inc(&eb.ctx->req_cnt);
> +
>         trace_i915_request_queue(eb.request, eb.batch_flags);
>         err = eb_submit(&eb);
>  err_request:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 34a0866..d0af37d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -780,6 +780,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  
>                         last = rq;
>                         submit = true;
> +
> +                       if (atomic_read(&rq->gem_context->req_cnt) > 0)
> +                               atomic_dec(&rq->gem_context->req_cnt);

Not atomic. But is this not a clue that you've got the model wrong?

If only we were already tracking requests within contexts.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* ✗ Fi.CI.BAT: failure for drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
  2019-03-14  8:36 [PATCH v4 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
                   ` (2 preceding siblings ...)
  2019-03-14  8:36 ` [PATCH v4 3/3] drm/i915: Predictive governor to control eu/slice/subslice Ankit Navik
@ 2019-03-14  9:35 ` Patchwork
  3 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2019-03-14  9:35 UTC (permalink / raw)
  To: Ankit Navik; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
URL   : https://patchwork.freedesktop.org/series/57989/
State : failure

== Summary ==

Applying: drm/i915: Get active pending request for given context
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/i915_gem_context.c
M	drivers/gpu/drm/i915/i915_gem_context.h
M	drivers/gpu/drm/i915/i915_gem_execbuffer.c
M	drivers/gpu/drm/i915/intel_lrc.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_lrc.c
Auto-merging drivers/gpu/drm/i915/i915_gem_execbuffer.c
Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/i915_gem_context.h
Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0001 drm/i915: Get active pending request for given context
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/3] drm/i915: Get active pending request for given context
  2019-03-14  8:36 ` [PATCH v4 1/3] drm/i915: Get active pending request for given context Ankit Navik
  2019-03-14  8:55   ` Chris Wilson
@ 2019-03-14 10:39   ` Tvrtko Ursulin
  1 sibling, 0 replies; 12+ messages in thread
From: Tvrtko Ursulin @ 2019-03-14 10:39 UTC (permalink / raw)
  To: Ankit Navik, intel-gfx


On 14/03/2019 08:36, Ankit Navik wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
> 
> This patch gives us the active pending request count which is yet
> to be submitted to the GPU
> 
> V2:
>   * Change 64-bit to atomic for request count. (Tvrtko Ursulin)
> 
> V3:
>   * Remove mutex for request count.
>   * Rebase.
>   * Fixes hitting underflow for predictive request. (Tvrtko Ursulin)
> 
> V4:
>   * Rebase.
> 
> Cc: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Cc: Kedar J Karanje <kedar.j.karanje@intel.com>
> Cc: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c    | 1 +
>   drivers/gpu/drm/i915/i915_gem_context.h    | 5 +++++
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 ++
>   drivers/gpu/drm/i915/intel_lrc.c           | 3 +++
>   4 files changed, 11 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 280813a..a5876fe 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -453,6 +453,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>   	}
>   
>   	trace_i915_context_create(ctx);
> +	atomic_set(&ctx->req_cnt, 0);
>   
>   	return ctx;
>   }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index ca150a7..c940168 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -227,6 +227,11 @@ struct i915_gem_context {
>   	 * context close.
>   	 */
>   	struct list_head handles_list;
> +
> +	/** req_cnt: tracks the pending commands, based on which we decide to
> +	 * go for low/medium/high load configuration of the GPU.
> +	 */
> +	atomic_t req_cnt;
>   };
>   
>   static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 02adcaf..a38963b 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2475,6 +2475,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	 */
>   	eb.request->batch = eb.batch;
>   
> +	atomic_inc(&eb.ctx->req_cnt);
> +
>   	trace_i915_request_queue(eb.request, eb.batch_flags);
>   	err = eb_submit(&eb);
>   err_request:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 34a0866..d0af37d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -780,6 +780,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   
>   			last = rq;
>   			submit = true;
> +
> +			if (atomic_read(&rq->gem_context->req_cnt) > 0)
> +				atomic_dec(&rq->gem_context->req_cnt);

Every review round I keep pointing out this is wrong and you keep 
persisting to have it like this. :( Would you in my place want to keep 
reviewing in these circumstances?

As Chris also pointed out it is not atomic so in theory doesn't even 
protect against underflow and I repeat, yet again, is a hint the 
placement is unbalanced.

I pointed you to some of my old patches which do correct accounting of 
per engine queued / runnable / running, to give an idea where to put the 
inc/dec.

Chris now also suggests maybe to tie it with timelines (AFAIU), so you 
could also see where requests are added and removed from the 
ce->ring->timeline list.

You say you tried something with my patches but "didnt see much
power benefit with that" - what exactly have you tried? And what is not 
much?

I am still curious what metric works for this. The one you implement is 
something like queued + _sometimes_ runnable. Sometimes because you may 
or may not be decrementing runnable depending on ELSP contention. And 
number of ELSP slots may change with GuC and/or Gen11 so I worry it is 
way to undefined even ignoring the underflow issue.

Regards,

Tvrtko

>   		}
>   
>   		rb_erase_cached(&p->node, &execlists->queue);
> 

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2019-03-14  8:36 ` [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
@ 2019-03-14 10:45   ` Tvrtko Ursulin
  2019-03-14 10:52   ` kbuild test robot
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Tvrtko Ursulin @ 2019-03-14 10:45 UTC (permalink / raw)
  To: Ankit Navik, intel-gfx


On 14/03/2019 08:36, Ankit Navik wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
> 
> This patch will select optimum eu/slice/sub-slice configuration based on
> type of load (low, medium, high) as input.
> Based on our readings and experiments we have predefined set of optimum
> configuration for each platform(CHT, KBL).
> i915_gem_context_set_load_type will select optimum configuration from
> pre-defined optimum configuration table(opt_config).
> 
> It also introduce flag update_render_config which can set by any governor.
> 
> v2:
>   * Move static optimum_config to device init time.
>   * Rename function to appropriate name, fix data types and patch ordering.
>   * Rename prev_load_type to pending_load_type. (Tvrtko Ursulin)
> 
> v3:
>   * Add safe guard check in i915_gem_context_set_load_type.
>   * Rename struct from optimum_config to i915_sseu_optimum_config to
>     avoid namespace clashes.
>   * Reduces memcpy for space efficient.
>   * Rebase.
>   * Improved commit message. (Tvrtko Ursulin)
> 
> v4:
>   * Move optimum config table to file scope. (Tvrtko Ursulin)
> 
> Cc: Kedar J Karanje <kedar.j.karanje@intel.com>
> Cc: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h          |  5 ++++
>   drivers/gpu/drm/i915/i915_gem_context.c  | 20 ++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_context.h  | 34 +++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_device_info.c | 47 ++++++++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/intel_lrc.c         | 45 +++++++++++++++++++++++++++++-
>   5 files changed, 148 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 5c8d048..97cb36b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1593,6 +1593,11 @@ struct drm_i915_private {
>   	struct drm_i915_fence_reg fence_regs[I915_MAX_NUM_FENCES]; /* assume 965 */
>   	int num_fence_regs; /* 8 on pre-965, 16 otherwise */
>   
> +	/* optimal slice/subslice/EU configration state */
> +	struct i915_sseu_optimum_config *opt_config;
> +
> +	int predictive_load_enable;
> +
>   	unsigned int fsb_freq, mem_freq, is_ddr3;
>   	unsigned int skl_preferred_vco_freq;
>   	unsigned int max_cdclk_freq;
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index a5876fe..8f16ef1 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -454,10 +454,30 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>   
>   	trace_i915_context_create(ctx);
>   	atomic_set(&ctx->req_cnt, 0);
> +	ctx->slice_cnt = hweight8(RUNTIME_INFO(dev_priv)->sseu.slice_mask);
> +	ctx->subslice_cnt = hweight8(
> +			RUNTIME_INFO(dev_priv)->sseu.subslice_mask[0]);
> +	ctx->eu_cnt = RUNTIME_INFO(dev_priv)->sseu.eu_per_subslice;
>   
>   	return ctx;
>   }
>   
> +
> +void i915_gem_context_set_load_type(struct i915_gem_context *ctx,
> +		enum gem_load_type type)
> +{
> +	struct drm_i915_private *dev_priv = ctx->i915;
> +
> +	if (GEM_WARN_ON(type > LOAD_TYPE_LAST))
> +		return;
> +
> +	/* Call opt_config to get correct configuration for eu,slice,subslice */
> +	ctx->slice_cnt = dev_priv->opt_config[type].slice;
> +	ctx->subslice_cnt = dev_priv->opt_config[type].subslice;
> +	ctx->eu_cnt = dev_priv->opt_config[type].eu;
> +	ctx->pending_load_type = type;
> +}
> +
>   /**
>    * i915_gem_context_create_gvt - create a GVT GEM context
>    * @dev: drm device *
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index c940168..0a24d28 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -54,6 +54,19 @@ struct intel_context_ops {
>   	void (*destroy)(struct intel_context *ce);
>   };
>   
> +enum gem_load_type {
> +	LOAD_TYPE_LOW,
> +	LOAD_TYPE_MEDIUM,
> +	LOAD_TYPE_HIGH,
> +	LOAD_TYPE_LAST
> +};
> +
> +struct i915_sseu_optimum_config {
> +	u8 slice;
> +	u8 subslice;
> +	u8 eu;
> +};
> +
>   /*
>    * Powergating configuration for a particular (context,engine).
>    */
> @@ -232,6 +245,25 @@ struct i915_gem_context {
>   	 * go for low/medium/high load configuration of the GPU.
>   	 */
>   	atomic_t req_cnt;
> +
> +	/** slice_cnt: used to set the # of slices to be enabled. */
> +	u8 slice_cnt;
> +
> +	/** subslice_cnt: used to set the # of subslices to be enabled. */
> +	u8 subslice_cnt;
> +
> +	/** eu_cnt: used to set the # of eu to be enabled. */
> +	u8 eu_cnt;
> +
> +	/** load_type: The designated load_type (high/medium/low) for a given
> +	 * number of pending commands in the command queue.
> +	 */
> +	enum gem_load_type load_type;
> +
> +	/** pending_load_type: The earlier load type that the GPU was configured
> +	 * for (high/medium/low).
> +	 */
> +	enum gem_load_type pending_load_type;
>   };
>   
>   static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> @@ -375,6 +407,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>   				    struct drm_file *file_priv);
>   int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>   				       struct drm_file *file);
> +void i915_gem_context_set_load_type(struct i915_gem_context *ctx,
> +		enum gem_load_type type);
>   
>   struct i915_gem_context *
>   i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
> index 855a507..017a1e2 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.c
> +++ b/drivers/gpu/drm/i915/intel_device_info.c
> @@ -707,6 +707,27 @@ static u32 read_timestamp_frequency(struct drm_i915_private *dev_priv)
>   	return 0;
>   }
>   
> +/* static table of slice/subslice/EU for Cherryview */
> +static const struct i915_sseu_optimum_config chv_config[LOAD_TYPE_LAST] = {
> +	{1, 1, 4},	/* Low */
> +	{1, 1, 6},	/* Medium */
> +	{1, 2, 6}	/* High */
> +};
> +
> +/* static table of slice/subslice/EU for KBL GT2 */
> +static const struct i915_sseu_optimum_config kbl_gt2_config[LOAD_TYPE_LAST] = {
> +	{1, 3, 2},	/* Low */
> +	{1, 3, 4},	/* Medium */
> +	{1, 3, 8}	/* High */
> +};
> +
> +/* static table of slice/subslice/EU for KBL GT3 */
> +static const struct i915_sseu_optimum_config kbl_gt3_config[LOAD_TYPE_LAST] = {
> +	{2, 3, 4},	/* Low */
> +	{2, 3, 6},	/* Medium */
> +	{2, 3, 8}	/* High */
> +};
> +
>   /**
>    * intel_device_info_runtime_init - initialize runtime info
>    * @dev_priv: the i915 device
> @@ -728,6 +749,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>   	struct intel_device_info *info = mkwrite_device_info(dev_priv);
>   	struct intel_runtime_info *runtime = RUNTIME_INFO(dev_priv);
>   	enum pipe pipe;
> +	struct i915_sseu_optimum_config *opt_config = NULL;
>   
>   	if (INTEL_GEN(dev_priv) >= 10) {
>   		for_each_pipe(dev_priv, pipe)
> @@ -831,12 +853,30 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>   	/* Initialize slice/subslice/EU info */
>   	if (IS_HASWELL(dev_priv))
>   		haswell_sseu_info_init(dev_priv);
> -	else if (IS_CHERRYVIEW(dev_priv))
> +	else if (IS_CHERRYVIEW(dev_priv)) {
>   		cherryview_sseu_info_init(dev_priv);
> +		opt_config = chv_config;
> +		BUILD_BUG_ON(ARRAY_SIZE(chv_config) != LOAD_TYPE_LAST);
> +	}
>   	else if (IS_BROADWELL(dev_priv))
>   		broadwell_sseu_info_init(dev_priv);
> -	else if (IS_GEN(dev_priv, 9))
> +	else if (IS_GEN(dev_priv, 9)) {
>   		gen9_sseu_info_init(dev_priv);
> +
> +		switch (info->gt) {
> +		default: /* fall through */
> +		case 2:
> +			opt_config = kbl_gt2_config;
> +			BUILD_BUG_ON(ARRAY_SIZE(kbl_gt2_config)
> +						!= LOAD_TYPE_LAST);
> +		break;
> +		case 3:
> +			opt_config = kbl_gt3_config;
> +			BUILD_BUG_ON(ARRAY_SIZE(kbl_gt3_config)
> +						!= LOAD_TYPE_LAST);
> +		break;
> +		}
> +	}
>   	else if (IS_GEN(dev_priv, 10))
>   		gen10_sseu_info_init(dev_priv);
>   	else if (INTEL_GEN(dev_priv) >= 11)
> @@ -847,6 +887,9 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
>   		info->ppgtt = INTEL_PPGTT_NONE;
>   	}
>   
> +	if (opt_config)
> +		dev_priv->opt_config = opt_config;
> +
>   	/* Initialize command stream timestamp frequency */
>   	runtime->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
>   }
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index d0af37d..397af1e 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1282,6 +1282,35 @@ static int __context_pin(struct i915_gem_context *ctx, struct i915_vma *vma)
>   	return i915_vma_pin(vma, 0, 0, flags);
>   }
>   
> +static u32
> +get_context_rpcs_config(struct i915_gem_context *ctx)
> +{
> +	u32 rpcs = 0;
> +	struct drm_i915_private *dev_priv = ctx->i915;
> +
> +	if (INTEL_GEN(dev_priv) < 8)
> +		return 0;
> +
> +	if (RUNTIME_INFO(dev_priv)->sseu.has_slice_pg) {
> +		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
> +		rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
> +		rpcs |= GEN8_RPCS_ENABLE;
> +	}
> +
> +	if (RUNTIME_INFO(dev_priv)->sseu.has_subslice_pg) {
> +		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
> +		rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
> +		rpcs |= GEN8_RPCS_ENABLE;
> +	}
> +
> +	if (RUNTIME_INFO(dev_priv)->sseu.has_eu_pg) {
> +		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
> +		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
> +		rpcs |= GEN8_RPCS_ENABLE;
> +	}
> +
> +	return rpcs;
> +}
>   static void
>   __execlists_update_reg_state(struct intel_engine_cs *engine,
>   			     struct intel_context *ce)
> @@ -1294,9 +1323,20 @@ __execlists_update_reg_state(struct intel_engine_cs *engine,
>   	regs[CTX_RING_TAIL + 1] = ring->tail;
>   
>   	/* RPCS */
> -	if (engine->class == RENDER_CLASS)
> +	if (engine->class == RENDER_CLASS &&
> +				engine->i915->predictive_load_enable) {
> +		u32 rpcs_config = 0;
> +		struct i915_gem_context *ctx = ce->gem_context;
> +
> +		rpcs_config = get_context_rpcs_config(ctx);
> +		regs[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> +		CTX_REG(regs, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
> +						rpcs_config);
> +
> +	} else if (engine->class == RENDER_CLASS) {

>   		regs[CTX_R_PWR_CLK_STATE + 1] = gen8_make_rpcs(engine->i915,
>   							       &ce->sseu);

We can't have to places/paths controlling RPCS.

What happens when both OA and your feature are enabled? (See comments in 
gen8_make_rpcs.)

What happens when media has requested reduces subslice count on Gen11 if 
predictive RPCS is added on Gen11?

You need to put the logic into gen8_make_rpcs and handle all those 
overrides.

Regards,

Tvrtko


> +	}
>   }
>   
>   static struct intel_context *
> @@ -1340,6 +1380,9 @@ __execlists_context_pin(struct intel_engine_cs *engine,
>   
>   	__execlists_update_reg_state(engine, ce);
>   
> +	if (ctx->load_type != ctx->pending_load_type)
> +		ctx->load_type = ctx->pending_load_type;
> +
>   	ce->state->obj->pin_global++;
>   	i915_gem_context_get(ctx);
>   	return ce;
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2019-03-14  8:36 ` [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
  2019-03-14 10:45   ` Tvrtko Ursulin
@ 2019-03-14 10:52   ` kbuild test robot
  2019-03-14 11:04   ` kbuild test robot
  2019-03-14 15:50   ` kbuild test robot
  3 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2019-03-14 10:52 UTC (permalink / raw)
  Cc: ankit.p.navik, intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 8022 bytes --]

Hi Praveen,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20190306]
[cannot apply to v5.0]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Ankit-Navik/drm-i915-Context-aware-user-agnostic-EU-Slice-Sub-slice-control-within-kernel/20190314-181342
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x015-201910 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/i915/intel_device_info.c: In function 'intel_device_info_runtime_init':
>> drivers/gpu/drm/i915/intel_device_info.c:858:14: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
      opt_config = chv_config;
                 ^
   drivers/gpu/drm/i915/intel_device_info.c:869:15: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
       opt_config = kbl_gt2_config;
                  ^
   drivers/gpu/drm/i915/intel_device_info.c:874:15: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
       opt_config = kbl_gt3_config;
                  ^

vim +/const +858 drivers/gpu/drm/i915/intel_device_info.c

   730	
   731	/**
   732	 * intel_device_info_runtime_init - initialize runtime info
   733	 * @dev_priv: the i915 device
   734	 *
   735	 * Determine various intel_device_info fields at runtime.
   736	 *
   737	 * Use it when either:
   738	 *   - it's judged too laborious to fill n static structures with the limit
   739	 *     when a simple if statement does the job,
   740	 *   - run-time checks (eg read fuse/strap registers) are needed.
   741	 *
   742	 * This function needs to be called:
   743	 *   - after the MMIO has been setup as we are reading registers,
   744	 *   - after the PCH has been detected,
   745	 *   - before the first usage of the fields it can tweak.
   746	 */
   747	void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
   748	{
   749		struct intel_device_info *info = mkwrite_device_info(dev_priv);
   750		struct intel_runtime_info *runtime = RUNTIME_INFO(dev_priv);
   751		enum pipe pipe;
   752		struct i915_sseu_optimum_config *opt_config = NULL;
   753	
   754		if (INTEL_GEN(dev_priv) >= 10) {
   755			for_each_pipe(dev_priv, pipe)
   756				runtime->num_scalers[pipe] = 2;
   757		} else if (IS_GEN(dev_priv, 9)) {
   758			runtime->num_scalers[PIPE_A] = 2;
   759			runtime->num_scalers[PIPE_B] = 2;
   760			runtime->num_scalers[PIPE_C] = 1;
   761		}
   762	
   763		BUILD_BUG_ON(I915_NUM_ENGINES > BITS_PER_TYPE(intel_ring_mask_t));
   764	
   765		if (IS_GEN(dev_priv, 11))
   766			for_each_pipe(dev_priv, pipe)
   767				runtime->num_sprites[pipe] = 6;
   768		else if (IS_GEN(dev_priv, 10) || IS_GEMINILAKE(dev_priv))
   769			for_each_pipe(dev_priv, pipe)
   770				runtime->num_sprites[pipe] = 3;
   771		else if (IS_BROXTON(dev_priv)) {
   772			/*
   773			 * Skylake and Broxton currently don't expose the topmost plane as its
   774			 * use is exclusive with the legacy cursor and we only want to expose
   775			 * one of those, not both. Until we can safely expose the topmost plane
   776			 * as a DRM_PLANE_TYPE_CURSOR with all the features exposed/supported,
   777			 * we don't expose the topmost plane at all to prevent ABI breakage
   778			 * down the line.
   779			 */
   780	
   781			runtime->num_sprites[PIPE_A] = 2;
   782			runtime->num_sprites[PIPE_B] = 2;
   783			runtime->num_sprites[PIPE_C] = 1;
   784		} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
   785			for_each_pipe(dev_priv, pipe)
   786				runtime->num_sprites[pipe] = 2;
   787		} else if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv)) {
   788			for_each_pipe(dev_priv, pipe)
   789				runtime->num_sprites[pipe] = 1;
   790		}
   791	
   792		if (i915_modparams.disable_display) {
   793			DRM_INFO("Display disabled (module parameter)\n");
   794			info->num_pipes = 0;
   795		} else if (HAS_DISPLAY(dev_priv) &&
   796			   (IS_GEN_RANGE(dev_priv, 7, 8)) &&
   797			   HAS_PCH_SPLIT(dev_priv)) {
   798			u32 fuse_strap = I915_READ(FUSE_STRAP);
   799			u32 sfuse_strap = I915_READ(SFUSE_STRAP);
   800	
   801			/*
   802			 * SFUSE_STRAP is supposed to have a bit signalling the display
   803			 * is fused off. Unfortunately it seems that, at least in
   804			 * certain cases, fused off display means that PCH display
   805			 * reads don't land anywhere. In that case, we read 0s.
   806			 *
   807			 * On CPT/PPT, we can detect this case as SFUSE_STRAP_FUSE_LOCK
   808			 * should be set when taking over after the firmware.
   809			 */
   810			if (fuse_strap & ILK_INTERNAL_DISPLAY_DISABLE ||
   811			    sfuse_strap & SFUSE_STRAP_DISPLAY_DISABLED ||
   812			    (HAS_PCH_CPT(dev_priv) &&
   813			     !(sfuse_strap & SFUSE_STRAP_FUSE_LOCK))) {
   814				DRM_INFO("Display fused off, disabling\n");
   815				info->num_pipes = 0;
   816			} else if (fuse_strap & IVB_PIPE_C_DISABLE) {
   817				DRM_INFO("PipeC fused off\n");
   818				info->num_pipes -= 1;
   819			}
   820		} else if (HAS_DISPLAY(dev_priv) && INTEL_GEN(dev_priv) >= 9) {
   821			u32 dfsm = I915_READ(SKL_DFSM);
   822			u8 disabled_mask = 0;
   823			bool invalid;
   824			int num_bits;
   825	
   826			if (dfsm & SKL_DFSM_PIPE_A_DISABLE)
   827				disabled_mask |= BIT(PIPE_A);
   828			if (dfsm & SKL_DFSM_PIPE_B_DISABLE)
   829				disabled_mask |= BIT(PIPE_B);
   830			if (dfsm & SKL_DFSM_PIPE_C_DISABLE)
   831				disabled_mask |= BIT(PIPE_C);
   832	
   833			num_bits = hweight8(disabled_mask);
   834	
   835			switch (disabled_mask) {
   836			case BIT(PIPE_A):
   837			case BIT(PIPE_B):
   838			case BIT(PIPE_A) | BIT(PIPE_B):
   839			case BIT(PIPE_A) | BIT(PIPE_C):
   840				invalid = true;
   841				break;
   842			default:
   843				invalid = false;
   844			}
   845	
   846			if (num_bits > info->num_pipes || invalid)
   847				DRM_ERROR("invalid pipe fuse configuration: 0x%x\n",
   848					  disabled_mask);
   849			else
   850				info->num_pipes -= num_bits;
   851		}
   852	
   853		/* Initialize slice/subslice/EU info */
   854		if (IS_HASWELL(dev_priv))
   855			haswell_sseu_info_init(dev_priv);
   856		else if (IS_CHERRYVIEW(dev_priv)) {
   857			cherryview_sseu_info_init(dev_priv);
 > 858			opt_config = chv_config;
   859			BUILD_BUG_ON(ARRAY_SIZE(chv_config) != LOAD_TYPE_LAST);
   860		}
   861		else if (IS_BROADWELL(dev_priv))
   862			broadwell_sseu_info_init(dev_priv);
   863		else if (IS_GEN(dev_priv, 9)) {
   864			gen9_sseu_info_init(dev_priv);
   865	
   866			switch (info->gt) {
   867			default: /* fall through */
   868			case 2:
   869				opt_config = kbl_gt2_config;
   870				BUILD_BUG_ON(ARRAY_SIZE(kbl_gt2_config)
   871							!= LOAD_TYPE_LAST);
   872			break;
   873			case 3:
   874				opt_config = kbl_gt3_config;
   875				BUILD_BUG_ON(ARRAY_SIZE(kbl_gt3_config)
   876							!= LOAD_TYPE_LAST);
   877			break;
   878			}
   879		}
   880		else if (IS_GEN(dev_priv, 10))
   881			gen10_sseu_info_init(dev_priv);
   882		else if (INTEL_GEN(dev_priv) >= 11)
   883			gen11_sseu_info_init(dev_priv);
   884	
   885		if (IS_GEN(dev_priv, 6) && intel_vtd_active()) {
   886			DRM_INFO("Disabling ppGTT for VT-d support\n");
   887			info->ppgtt = INTEL_PPGTT_NONE;
   888		}
   889	
   890		if (opt_config)
   891			dev_priv->opt_config = opt_config;
   892	
   893		/* Initialize command stream timestamp frequency */
   894		runtime->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
   895	}
   896	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31190 bytes --]

[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/3] drm/i915: Predictive governor to control eu/slice/subslice
  2019-03-14  8:36 ` [PATCH v4 3/3] drm/i915: Predictive governor to control eu/slice/subslice Ankit Navik
@ 2019-03-14 10:58   ` Tvrtko Ursulin
  0 siblings, 0 replies; 12+ messages in thread
From: Tvrtko Ursulin @ 2019-03-14 10:58 UTC (permalink / raw)
  To: Ankit Navik, intel-gfx


On 14/03/2019 08:36, Ankit Navik wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
> 
> High resolution timer is used for predictive governor to control
> eu/slice/subslice based on workloads.
> 
> Debugfs is provided to enable/disable/update timer configuration
> 
> V2:
>   * Fix code style.
>   * Move predictive_load_timer into a drm_i915_private
>     structure.
>   * Make generic function to set optimum config. (Tvrtko Ursulin)
> 
> V3:
>   * Rebase.
>   * Fix race condition for predictive load set.
>   * Add slack to start hrtimer for more power efficient. (Tvrtko Ursulin)
> 
> V4:
>   * Fix data type and initialization of mutex to protect predictive load state.
>   * Move predictive timer init to i915_gem_init_early. (Tvrtko Ursulin)
>   * Move debugfs to kernel parameter.
> 
> Cc: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Cc: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile       |   1 +
>   drivers/gpu/drm/i915/i915_debugfs.c |   1 +
>   drivers/gpu/drm/i915/i915_drv.h     |   4 ++
>   drivers/gpu/drm/i915/i915_gem.c     |   3 +
>   drivers/gpu/drm/i915/i915_params.c  |   4 ++
>   drivers/gpu/drm/i915/i915_params.h  |   1 +
>   drivers/gpu/drm/i915/intel_deu.c    | 109 ++++++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_drv.h    |   3 +
>   8 files changed, 126 insertions(+)
>   create mode 100644 drivers/gpu/drm/i915/intel_deu.c
> 
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 1787e12..b9eaaed 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -119,6 +119,7 @@ i915-y += intel_audio.o \
>   	  intel_color.o \
>   	  intel_combo_phy.o \
>   	  intel_connector.o \
> +	  intel_deu.o \
>   	  intel_display.o \
>   	  intel_dpio_phy.o \
>   	  intel_dpll_mgr.o \
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index ca8fa44..1b8f631 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4612,6 +4612,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
>   	{"i915_drrs_status", i915_drrs_status, 0},
>   	{"i915_rps_boost_info", i915_rps_boost_info, 0},
>   };
> +
>   #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
>   
>   static const struct i915_debugfs_files {
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 97cb36b..dfe80d9 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1596,6 +1596,9 @@ struct drm_i915_private {
>   	/* optimal slice/subslice/EU configration state */
>   	struct i915_sseu_optimum_config *opt_config;
>   
> +	/* protects predictive load state */
> +	struct mutex pred_mutex;
> +	struct hrtimer pred_timer;
>   	int predictive_load_enable;
>   
>   	unsigned int fsb_freq, mem_freq, is_ddr3;
> @@ -2646,6 +2649,7 @@ extern unsigned long i915_chipset_val(struct drm_i915_private *dev_priv);
>   extern unsigned long i915_mch_val(struct drm_i915_private *dev_priv);
>   extern unsigned long i915_gfx_val(struct drm_i915_private *dev_priv);
>   extern void i915_update_gfx_val(struct drm_i915_private *dev_priv);
> +extern enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer);
>   int vlv_force_gfx_clock(struct drm_i915_private *dev_priv, bool on);
>   
>   int intel_engines_init_mmio(struct drm_i915_private *dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 5c1b9d4..18c5195 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5267,6 +5267,9 @@ int i915_gem_init_early(struct drm_i915_private *dev_priv)
>   
>   	spin_lock_init(&dev_priv->fb_tracking.lock);
>   
> +	/* Dynamic EU timer initialization for predictive load */
> +	intel_deu_init(dev_priv);
> +
>   	err = i915_gemfs_init(dev_priv);
>   	if (err)
>   		DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n", err);
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index b5be0ab..7857e92 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -97,6 +97,10 @@ i915_param_named_unsafe(disable_power_well, int, 0400,
>   
>   i915_param_named_unsafe(enable_ips, int, 0600, "Enable IPS (default: true)");
>   
> +i915_param_named_unsafe(deu_timer_enable, int, 0600,
> +	"Enable dynamic EU control for power savings "
> +	"(0=disable deu predictive timer [default], 150=optimal deu predictive timer)");
> +
>   i915_param_named(fastboot, int, 0600,
>   	"Try to skip unnecessary mode sets at boot time "
>   	"(0=disabled, 1=enabled) "
> diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
> index 3f14e98..c51f56b 100644
> --- a/drivers/gpu/drm/i915/i915_params.h
> +++ b/drivers/gpu/drm/i915/i915_params.h
> @@ -54,6 +54,7 @@ struct drm_printer;
>   	param(int, disable_power_well, -1) \
>   	param(int, enable_ips, 1) \
>   	param(int, invert_brightness, 0) \
> +	param(int, deu_timer_enable, 0) \

Maybe just deu_enable, although I am not sure acronym is good here. 
i915.predictive_sseu = 1/0?

>   	param(int, enable_guc, 0) \
>   	param(int, guc_log_level, -1) \
>   	param(char *, guc_firmware_path, NULL) \
> diff --git a/drivers/gpu/drm/i915/intel_deu.c b/drivers/gpu/drm/i915/intel_deu.c
> new file mode 100644
> index 0000000..34654d5
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_deu.c
> @@ -0,0 +1,109 @@
> +/*
> + * Copyright © 2019 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Authors:
> + *    Ankit Navik <ankit.p.navik@intel.com>
> + */
> +
> +/**
> + * DOC: Dynamic EU Control (DEUC)

I am not sure about the acronym but ignoring that it is not consistently 
used between the file name, module parameter name, function prefix name 
and here. Well all is consistent apart from this.

> + *
> + * DEUC tries to re-configure EU allocation during runtime by predictive load
> + * calculation of command queue to gain power saving.
> + * It is transparent to user space and completely handled in the kernel.
> + */
> +
> +#include "intel_drv.h"
> +#include "i915_drv.h"
> +
> +/*
> + * Anything above threshold is considered as HIGH load, less is considered
> + * as LOW load and equal is considered as MEDIUM load.
> + *
> + * The threshold value of three active requests pending.
> + */
> +#define PENDING_THRESHOLD_MEDIUM 3
> +
> +#define SLACK_TIMER_NSEC 1000000 /* Timer range in nano second */
> +
> +enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer)

static

> +{
> +	struct drm_i915_private *dev_priv =
> +			container_of(hrtimer, typeof(*dev_priv), pred_timer);
> +	struct i915_gem_context *ctx;
> +	enum gem_load_type load_type;
> +	unsigned int req_pending;
> +
> +	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> +
> +		req_pending = atomic_read(&ctx->req_cnt);
> +
> +		/*
> +		 * Transitioning to low state whenever pending request is zero
> +		 * would cause vacillation between low and high state.
> +		 */
> +		if (req_pending == 0)
> +			continue;
> +
> +		if (req_pending > PENDING_THRESHOLD_MEDIUM)
> +			load_type = LOAD_TYPE_HIGH;
> +		else if (req_pending == PENDING_THRESHOLD_MEDIUM)
> +			load_type = LOAD_TYPE_MEDIUM;
> +		else
> +			load_type = LOAD_TYPE_LOW;
> +
> +		i915_gem_context_set_load_type(ctx, load_type);
> +	}
> +
> +	hrtimer_forward_now(hrtimer,
> +			ms_to_ktime(dev_priv->predictive_load_enable));
> +
> +	return HRTIMER_RESTART;
> +}

Thinking out loud - I wonder if you could drive the logic from context 
activity instead of the timer. It would avoid having to walk all the 
idle contexts, but would require adding time checks to rq timeline 
management.

> +
> +/**
> + * intel_deu_init - Initialize dynamic EU
> + * @dev_priv: i915 device instance
> + *
> + * This fuction is called at driver load
> + */
> +void intel_deu_init(struct drm_i915_private *dev_priv)
> +{
> +	dev_priv->predictive_load_enable = i915_modparams.deu_timer_enable;
> +	hrtimer_init(&dev_priv->pred_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	dev_priv->pred_timer.function = predictive_load_cb;
> +	mutex_init(&dev_priv->pred_mutex);
> +
> +	mutex_lock(&dev_priv->pred_mutex);
> +
> +	if (dev_priv->predictive_load_enable) {
> +		if (!hrtimer_active(&dev_priv->pred_timer))
> +			hrtimer_start_range_ns(&dev_priv->pred_timer,
> +			ms_to_ktime(dev_priv->predictive_load_enable),
> +			SLACK_TIMER_NSEC,
> +			HRTIMER_MODE_REL_PINNED);
> +	} else {
> +		hrtimer_cancel(&dev_priv->pred_timer);

I'd move stopping and starting of the timer into GT idle/active hooks. 
No point in keeping it running while the GPU is asleep.

Haven't I suggested this before? Look for i915_pmu_gt_(un)parked and put 
your hooks next to them.

> +	}
> +
> +	mutex_unlock(&dev_priv->pred_mutex);

Mutex doesn't seem to be useful now that you made this modparam.

Regards,

Tvrtko

> +}
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index b4e29b8..8e6c1da 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -1990,6 +1990,9 @@ void intel_fbc_cleanup_cfb(struct drm_i915_private *dev_priv);
>   void intel_fbc_handle_fifo_underrun_irq(struct drm_i915_private *dev_priv);
>   int intel_fbc_reset_underrun(struct drm_i915_private *dev_priv);
>   
> +/* intel_deu.c */
> +void intel_deu_init(struct drm_i915_private *dev_priv);
> +
>   /* intel_hdmi.c */
>   void intel_hdmi_init(struct drm_i915_private *dev_priv, i915_reg_t hdmi_reg,
>   		     enum port port);
> 
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2019-03-14  8:36 ` [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
  2019-03-14 10:45   ` Tvrtko Ursulin
  2019-03-14 10:52   ` kbuild test robot
@ 2019-03-14 11:04   ` kbuild test robot
  2019-03-14 15:50   ` kbuild test robot
  3 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2019-03-14 11:04 UTC (permalink / raw)
  Cc: ankit.p.navik, intel-gfx, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 8070 bytes --]

Hi Praveen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on next-20190306]
[cannot apply to v5.0]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Ankit-Navik/drm-i915-Context-aware-user-agnostic-EU-Slice-Sub-slice-control-within-kernel/20190314-181342
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x002-201910 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/gpu//drm/i915/intel_device_info.c: In function 'intel_device_info_runtime_init':
>> drivers/gpu//drm/i915/intel_device_info.c:858:14: error: assignment discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
      opt_config = chv_config;
                 ^
   drivers/gpu//drm/i915/intel_device_info.c:869:15: error: assignment discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
       opt_config = kbl_gt2_config;
                  ^
   drivers/gpu//drm/i915/intel_device_info.c:874:15: error: assignment discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
       opt_config = kbl_gt3_config;
                  ^
   cc1: all warnings being treated as errors

vim +/const +858 drivers/gpu//drm/i915/intel_device_info.c

   730	
   731	/**
   732	 * intel_device_info_runtime_init - initialize runtime info
   733	 * @dev_priv: the i915 device
   734	 *
   735	 * Determine various intel_device_info fields at runtime.
   736	 *
   737	 * Use it when either:
   738	 *   - it's judged too laborious to fill n static structures with the limit
   739	 *     when a simple if statement does the job,
   740	 *   - run-time checks (eg read fuse/strap registers) are needed.
   741	 *
   742	 * This function needs to be called:
   743	 *   - after the MMIO has been setup as we are reading registers,
   744	 *   - after the PCH has been detected,
   745	 *   - before the first usage of the fields it can tweak.
   746	 */
   747	void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
   748	{
   749		struct intel_device_info *info = mkwrite_device_info(dev_priv);
   750		struct intel_runtime_info *runtime = RUNTIME_INFO(dev_priv);
   751		enum pipe pipe;
   752		struct i915_sseu_optimum_config *opt_config = NULL;
   753	
   754		if (INTEL_GEN(dev_priv) >= 10) {
   755			for_each_pipe(dev_priv, pipe)
   756				runtime->num_scalers[pipe] = 2;
   757		} else if (IS_GEN(dev_priv, 9)) {
   758			runtime->num_scalers[PIPE_A] = 2;
   759			runtime->num_scalers[PIPE_B] = 2;
   760			runtime->num_scalers[PIPE_C] = 1;
   761		}
   762	
   763		BUILD_BUG_ON(I915_NUM_ENGINES > BITS_PER_TYPE(intel_ring_mask_t));
   764	
   765		if (IS_GEN(dev_priv, 11))
   766			for_each_pipe(dev_priv, pipe)
   767				runtime->num_sprites[pipe] = 6;
   768		else if (IS_GEN(dev_priv, 10) || IS_GEMINILAKE(dev_priv))
   769			for_each_pipe(dev_priv, pipe)
   770				runtime->num_sprites[pipe] = 3;
   771		else if (IS_BROXTON(dev_priv)) {
   772			/*
   773			 * Skylake and Broxton currently don't expose the topmost plane as its
   774			 * use is exclusive with the legacy cursor and we only want to expose
   775			 * one of those, not both. Until we can safely expose the topmost plane
   776			 * as a DRM_PLANE_TYPE_CURSOR with all the features exposed/supported,
   777			 * we don't expose the topmost plane at all to prevent ABI breakage
   778			 * down the line.
   779			 */
   780	
   781			runtime->num_sprites[PIPE_A] = 2;
   782			runtime->num_sprites[PIPE_B] = 2;
   783			runtime->num_sprites[PIPE_C] = 1;
   784		} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
   785			for_each_pipe(dev_priv, pipe)
   786				runtime->num_sprites[pipe] = 2;
   787		} else if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv)) {
   788			for_each_pipe(dev_priv, pipe)
   789				runtime->num_sprites[pipe] = 1;
   790		}
   791	
   792		if (i915_modparams.disable_display) {
   793			DRM_INFO("Display disabled (module parameter)\n");
   794			info->num_pipes = 0;
   795		} else if (HAS_DISPLAY(dev_priv) &&
   796			   (IS_GEN_RANGE(dev_priv, 7, 8)) &&
   797			   HAS_PCH_SPLIT(dev_priv)) {
   798			u32 fuse_strap = I915_READ(FUSE_STRAP);
   799			u32 sfuse_strap = I915_READ(SFUSE_STRAP);
   800	
   801			/*
   802			 * SFUSE_STRAP is supposed to have a bit signalling the display
   803			 * is fused off. Unfortunately it seems that, at least in
   804			 * certain cases, fused off display means that PCH display
   805			 * reads don't land anywhere. In that case, we read 0s.
   806			 *
   807			 * On CPT/PPT, we can detect this case as SFUSE_STRAP_FUSE_LOCK
   808			 * should be set when taking over after the firmware.
   809			 */
   810			if (fuse_strap & ILK_INTERNAL_DISPLAY_DISABLE ||
   811			    sfuse_strap & SFUSE_STRAP_DISPLAY_DISABLED ||
   812			    (HAS_PCH_CPT(dev_priv) &&
   813			     !(sfuse_strap & SFUSE_STRAP_FUSE_LOCK))) {
   814				DRM_INFO("Display fused off, disabling\n");
   815				info->num_pipes = 0;
   816			} else if (fuse_strap & IVB_PIPE_C_DISABLE) {
   817				DRM_INFO("PipeC fused off\n");
   818				info->num_pipes -= 1;
   819			}
   820		} else if (HAS_DISPLAY(dev_priv) && INTEL_GEN(dev_priv) >= 9) {
   821			u32 dfsm = I915_READ(SKL_DFSM);
   822			u8 disabled_mask = 0;
   823			bool invalid;
   824			int num_bits;
   825	
   826			if (dfsm & SKL_DFSM_PIPE_A_DISABLE)
   827				disabled_mask |= BIT(PIPE_A);
   828			if (dfsm & SKL_DFSM_PIPE_B_DISABLE)
   829				disabled_mask |= BIT(PIPE_B);
   830			if (dfsm & SKL_DFSM_PIPE_C_DISABLE)
   831				disabled_mask |= BIT(PIPE_C);
   832	
   833			num_bits = hweight8(disabled_mask);
   834	
   835			switch (disabled_mask) {
   836			case BIT(PIPE_A):
   837			case BIT(PIPE_B):
   838			case BIT(PIPE_A) | BIT(PIPE_B):
   839			case BIT(PIPE_A) | BIT(PIPE_C):
   840				invalid = true;
   841				break;
   842			default:
   843				invalid = false;
   844			}
   845	
   846			if (num_bits > info->num_pipes || invalid)
   847				DRM_ERROR("invalid pipe fuse configuration: 0x%x\n",
   848					  disabled_mask);
   849			else
   850				info->num_pipes -= num_bits;
   851		}
   852	
   853		/* Initialize slice/subslice/EU info */
   854		if (IS_HASWELL(dev_priv))
   855			haswell_sseu_info_init(dev_priv);
   856		else if (IS_CHERRYVIEW(dev_priv)) {
   857			cherryview_sseu_info_init(dev_priv);
 > 858			opt_config = chv_config;
   859			BUILD_BUG_ON(ARRAY_SIZE(chv_config) != LOAD_TYPE_LAST);
   860		}
   861		else if (IS_BROADWELL(dev_priv))
   862			broadwell_sseu_info_init(dev_priv);
   863		else if (IS_GEN(dev_priv, 9)) {
   864			gen9_sseu_info_init(dev_priv);
   865	
   866			switch (info->gt) {
   867			default: /* fall through */
   868			case 2:
   869				opt_config = kbl_gt2_config;
   870				BUILD_BUG_ON(ARRAY_SIZE(kbl_gt2_config)
   871							!= LOAD_TYPE_LAST);
   872			break;
   873			case 3:
   874				opt_config = kbl_gt3_config;
   875				BUILD_BUG_ON(ARRAY_SIZE(kbl_gt3_config)
   876							!= LOAD_TYPE_LAST);
   877			break;
   878			}
   879		}
   880		else if (IS_GEN(dev_priv, 10))
   881			gen10_sseu_info_init(dev_priv);
   882		else if (INTEL_GEN(dev_priv) >= 11)
   883			gen11_sseu_info_init(dev_priv);
   884	
   885		if (IS_GEN(dev_priv, 6) && intel_vtd_active()) {
   886			DRM_INFO("Disabling ppGTT for VT-d support\n");
   887			info->ppgtt = INTEL_PPGTT_NONE;
   888		}
   889	
   890		if (opt_config)
   891			dev_priv->opt_config = opt_config;
   892	
   893		/* Initialize command stream timestamp frequency */
   894		runtime->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
   895	}
   896	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 29699 bytes --]

[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2019-03-14  8:36 ` [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
                     ` (2 preceding siblings ...)
  2019-03-14 11:04   ` kbuild test robot
@ 2019-03-14 15:50   ` kbuild test robot
  3 siblings, 0 replies; 12+ messages in thread
From: kbuild test robot @ 2019-03-14 15:50 UTC (permalink / raw)
  Cc: ankit.p.navik, intel-gfx, kbuild-all

Hi Praveen,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on next-20190306]
[cannot apply to v5.0]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Ankit-Navik/drm-i915-Context-aware-user-agnostic-EU-Slice-Sub-slice-control-within-kernel/20190314-181342
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'


sparse warnings: (new ones prefixed by >>)

   include/uapi/linux/perf_event.h:147:56: sparse: cast truncates bits from constant value (8000000000000000 becomes 0)
>> drivers/gpu/drm/i915/intel_device_info.c:858:28: sparse: incorrect type in assignment (different modifiers) @@    expected struct i915_sseu_optimum_config *opt_config @@    got structstruct i915_sseu_optimum_config *opt_config @@
   drivers/gpu/drm/i915/intel_device_info.c:858:28:    expected struct i915_sseu_optimum_config *opt_config
   drivers/gpu/drm/i915/intel_device_info.c:858:28:    got struct i915_sseu_optimum_config const *<noident>
   drivers/gpu/drm/i915/intel_device_info.c:869:36: sparse: incorrect type in assignment (different modifiers) @@    expected struct i915_sseu_optimum_config *opt_config @@    got structstruct i915_sseu_optimum_config *opt_config @@
   drivers/gpu/drm/i915/intel_device_info.c:869:36:    expected struct i915_sseu_optimum_config *opt_config
   drivers/gpu/drm/i915/intel_device_info.c:869:36:    got struct i915_sseu_optimum_config const *<noident>
   drivers/gpu/drm/i915/intel_device_info.c:874:36: sparse: incorrect type in assignment (different modifiers) @@    expected struct i915_sseu_optimum_config *opt_config @@    got structstruct i915_sseu_optimum_config *opt_config @@
   drivers/gpu/drm/i915/intel_device_info.c:874:36:    expected struct i915_sseu_optimum_config *opt_config
   drivers/gpu/drm/i915/intel_device_info.c:874:36:    got struct i915_sseu_optimum_config const *<noident>

vim +858 drivers/gpu/drm/i915/intel_device_info.c

   730	
   731	/**
   732	 * intel_device_info_runtime_init - initialize runtime info
   733	 * @dev_priv: the i915 device
   734	 *
   735	 * Determine various intel_device_info fields at runtime.
   736	 *
   737	 * Use it when either:
   738	 *   - it's judged too laborious to fill n static structures with the limit
   739	 *     when a simple if statement does the job,
   740	 *   - run-time checks (eg read fuse/strap registers) are needed.
   741	 *
   742	 * This function needs to be called:
   743	 *   - after the MMIO has been setup as we are reading registers,
   744	 *   - after the PCH has been detected,
   745	 *   - before the first usage of the fields it can tweak.
   746	 */
   747	void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
   748	{
   749		struct intel_device_info *info = mkwrite_device_info(dev_priv);
   750		struct intel_runtime_info *runtime = RUNTIME_INFO(dev_priv);
   751		enum pipe pipe;
   752		struct i915_sseu_optimum_config *opt_config = NULL;
   753	
   754		if (INTEL_GEN(dev_priv) >= 10) {
   755			for_each_pipe(dev_priv, pipe)
   756				runtime->num_scalers[pipe] = 2;
   757		} else if (IS_GEN(dev_priv, 9)) {
   758			runtime->num_scalers[PIPE_A] = 2;
   759			runtime->num_scalers[PIPE_B] = 2;
   760			runtime->num_scalers[PIPE_C] = 1;
   761		}
   762	
   763		BUILD_BUG_ON(I915_NUM_ENGINES > BITS_PER_TYPE(intel_ring_mask_t));
   764	
   765		if (IS_GEN(dev_priv, 11))
   766			for_each_pipe(dev_priv, pipe)
   767				runtime->num_sprites[pipe] = 6;
   768		else if (IS_GEN(dev_priv, 10) || IS_GEMINILAKE(dev_priv))
   769			for_each_pipe(dev_priv, pipe)
   770				runtime->num_sprites[pipe] = 3;
   771		else if (IS_BROXTON(dev_priv)) {
   772			/*
   773			 * Skylake and Broxton currently don't expose the topmost plane as its
   774			 * use is exclusive with the legacy cursor and we only want to expose
   775			 * one of those, not both. Until we can safely expose the topmost plane
   776			 * as a DRM_PLANE_TYPE_CURSOR with all the features exposed/supported,
   777			 * we don't expose the topmost plane at all to prevent ABI breakage
   778			 * down the line.
   779			 */
   780	
   781			runtime->num_sprites[PIPE_A] = 2;
   782			runtime->num_sprites[PIPE_B] = 2;
   783			runtime->num_sprites[PIPE_C] = 1;
   784		} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
   785			for_each_pipe(dev_priv, pipe)
   786				runtime->num_sprites[pipe] = 2;
   787		} else if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv)) {
   788			for_each_pipe(dev_priv, pipe)
   789				runtime->num_sprites[pipe] = 1;
   790		}
   791	
   792		if (i915_modparams.disable_display) {
   793			DRM_INFO("Display disabled (module parameter)\n");
   794			info->num_pipes = 0;
   795		} else if (HAS_DISPLAY(dev_priv) &&
   796			   (IS_GEN_RANGE(dev_priv, 7, 8)) &&
   797			   HAS_PCH_SPLIT(dev_priv)) {
   798			u32 fuse_strap = I915_READ(FUSE_STRAP);
   799			u32 sfuse_strap = I915_READ(SFUSE_STRAP);
   800	
   801			/*
   802			 * SFUSE_STRAP is supposed to have a bit signalling the display
   803			 * is fused off. Unfortunately it seems that, at least in
   804			 * certain cases, fused off display means that PCH display
   805			 * reads don't land anywhere. In that case, we read 0s.
   806			 *
   807			 * On CPT/PPT, we can detect this case as SFUSE_STRAP_FUSE_LOCK
   808			 * should be set when taking over after the firmware.
   809			 */
   810			if (fuse_strap & ILK_INTERNAL_DISPLAY_DISABLE ||
   811			    sfuse_strap & SFUSE_STRAP_DISPLAY_DISABLED ||
   812			    (HAS_PCH_CPT(dev_priv) &&
   813			     !(sfuse_strap & SFUSE_STRAP_FUSE_LOCK))) {
   814				DRM_INFO("Display fused off, disabling\n");
   815				info->num_pipes = 0;
   816			} else if (fuse_strap & IVB_PIPE_C_DISABLE) {
   817				DRM_INFO("PipeC fused off\n");
   818				info->num_pipes -= 1;
   819			}
   820		} else if (HAS_DISPLAY(dev_priv) && INTEL_GEN(dev_priv) >= 9) {
   821			u32 dfsm = I915_READ(SKL_DFSM);
   822			u8 disabled_mask = 0;
   823			bool invalid;
   824			int num_bits;
   825	
   826			if (dfsm & SKL_DFSM_PIPE_A_DISABLE)
   827				disabled_mask |= BIT(PIPE_A);
   828			if (dfsm & SKL_DFSM_PIPE_B_DISABLE)
   829				disabled_mask |= BIT(PIPE_B);
   830			if (dfsm & SKL_DFSM_PIPE_C_DISABLE)
   831				disabled_mask |= BIT(PIPE_C);
   832	
   833			num_bits = hweight8(disabled_mask);
   834	
   835			switch (disabled_mask) {
   836			case BIT(PIPE_A):
   837			case BIT(PIPE_B):
   838			case BIT(PIPE_A) | BIT(PIPE_B):
   839			case BIT(PIPE_A) | BIT(PIPE_C):
   840				invalid = true;
   841				break;
   842			default:
   843				invalid = false;
   844			}
   845	
   846			if (num_bits > info->num_pipes || invalid)
   847				DRM_ERROR("invalid pipe fuse configuration: 0x%x\n",
   848					  disabled_mask);
   849			else
   850				info->num_pipes -= num_bits;
   851		}
   852	
   853		/* Initialize slice/subslice/EU info */
   854		if (IS_HASWELL(dev_priv))
   855			haswell_sseu_info_init(dev_priv);
   856		else if (IS_CHERRYVIEW(dev_priv)) {
   857			cherryview_sseu_info_init(dev_priv);
 > 858			opt_config = chv_config;
   859			BUILD_BUG_ON(ARRAY_SIZE(chv_config) != LOAD_TYPE_LAST);
   860		}
   861		else if (IS_BROADWELL(dev_priv))
   862			broadwell_sseu_info_init(dev_priv);
   863		else if (IS_GEN(dev_priv, 9)) {
   864			gen9_sseu_info_init(dev_priv);
   865	
   866			switch (info->gt) {
   867			default: /* fall through */
   868			case 2:
   869				opt_config = kbl_gt2_config;
   870				BUILD_BUG_ON(ARRAY_SIZE(kbl_gt2_config)
   871							!= LOAD_TYPE_LAST);
   872			break;
   873			case 3:
   874				opt_config = kbl_gt3_config;
   875				BUILD_BUG_ON(ARRAY_SIZE(kbl_gt3_config)
   876							!= LOAD_TYPE_LAST);
   877			break;
   878			}
   879		}
   880		else if (IS_GEN(dev_priv, 10))
   881			gen10_sseu_info_init(dev_priv);
   882		else if (INTEL_GEN(dev_priv) >= 11)
   883			gen11_sseu_info_init(dev_priv);
   884	
   885		if (IS_GEN(dev_priv, 6) && intel_vtd_active()) {
   886			DRM_INFO("Disabling ppGTT for VT-d support\n");
   887			info->ppgtt = INTEL_PPGTT_NONE;
   888		}
   889	
   890		if (opt_config)
   891			dev_priv->opt_config = opt_config;
   892	
   893		/* Initialize command stream timestamp frequency */
   894		runtime->cs_timestamp_frequency_khz = read_timestamp_frequency(dev_priv);
   895	}
   896	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-03-14 15:50 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-14  8:36 [PATCH v4 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
2019-03-14  8:36 ` [PATCH v4 1/3] drm/i915: Get active pending request for given context Ankit Navik
2019-03-14  8:55   ` Chris Wilson
2019-03-14 10:39   ` Tvrtko Ursulin
2019-03-14  8:36 ` [PATCH v4 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
2019-03-14 10:45   ` Tvrtko Ursulin
2019-03-14 10:52   ` kbuild test robot
2019-03-14 11:04   ` kbuild test robot
2019-03-14 15:50   ` kbuild test robot
2019-03-14  8:36 ` [PATCH v4 3/3] drm/i915: Predictive governor to control eu/slice/subslice Ankit Navik
2019-03-14 10:58   ` Tvrtko Ursulin
2019-03-14  9:35 ` ✗ Fi.CI.BAT: failure for drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.