* [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU.
@ 2018-09-21  9:13 kedar.j.karanje
  2018-09-21  9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
                   ` (6 more replies)
  0 siblings, 7 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21  9:13 UTC (permalink / raw)
  To: intel-gfx

From: "Kedar J. Karanje" <kedar.j.karanje@intel.com>

drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel

The current GPU configuration code for i915 does not allow us to change the
EU/Slice/Sub-slice configuration dynamically; it is done only once, while the
context is created.

While a particular graphics application is running, if we examine the command
requests coming from user space, we observe that the command density is not
consistent. This means there is scope to change the graphics configuration
dynamically even while a context is actively running. This patch series
proposes a solution: determine the pending load for all active contexts at a
given time and, based on that, dynamically reconfigure the graphics hardware
for each context.

We use an hr (high resolution) timer in the i915 kernel driver to get a
callback every few milliseconds (the timer value can be configured through
debugfs; the default is '0', indicating the timer is disabled, i.e. the
original system without any intervention). In the timer callback, we examine
the pending commands for a context in the queue; essentially, we intercept
them before they are executed by the GPU and update the context with the
required number of EUs.

Two questions arise: how did we arrive at the right timer value, and what is
the right number of EUs? For the former, we relied on empirical data for the
best performance at the least power. For the latter, we roughly categorized
the number of EUs logically based on the platform. We then compare the number
of pending commands against a particular threshold and set the number of EUs
accordingly via an updated context. That threshold is also based on
experiments and findings. If the GPU is able to keep up with the CPU, there
are typically no pending commands and the EU configuration remains unchanged.
If there are more pending commands, we reprogram the context with a higher
number of EUs. Note that we are changing EUs even while the context is
running, by examining pending commands every 'x' milliseconds.
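
The decision logic in the timer callback can be sketched in plain C (a
minimal userspace model of the heuristic, not driver code; the threshold of 3
pending requests matches the series, but the function and enum names are
illustrative):

```c
#include <assert.h>

enum load_type { LOAD_TYPE_LOW, LOAD_TYPE_MEDIUM, LOAD_TYPE_HIGH };

/* Threshold used by the governor: more than 3 pending requests means a
 * HIGH load context, exactly 3 means MEDIUM, fewer means LOW. */
#define PENDING_REQ_THRESHOLD 3

static enum load_type classify_load(unsigned long long req_pending)
{
	if (req_pending > PENDING_REQ_THRESHOLD)
		return LOAD_TYPE_HIGH;
	if (req_pending == PENDING_REQ_THRESHOLD)
		return LOAD_TYPE_MEDIUM;
	return LOAD_TYPE_LOW;
}
```

In the series this classification runs every 'x' milliseconds in the hrtimer
callback and its result drives the per-context EU reconfiguration.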

With this solution in place, on KBL-GT3 + Android we saw the following power
benefits without any performance degradation. The percentages below are
relative to total system power.

App / KPI              | % Power Benefit |
-----------------------|-----------------|
3D Mark (Ice Storm)    | 2.30%           |
TRex On screen         | 2.49%           |
TRex Off screen        | 1.32%           |
Manhattan On screen    | 3.11%           |
Manhattan Off screen   | 0.89%           |
AnTuTu 6.1.4           | 3.42%           |

Note: for KBL (GEN9) we cannot control at the sub-slice level; this was
always a constraint.
We therefore always controlled the number of EUs rather than sub-slices/slices.


Praveen Diwakar (4):
  drm/i915: Get active pending request for given context
  drm/i915: Update render power clock state configuration for given
    context
  drm/i915: set optimum eu/slice/sub-slice configuration based on load
    type
  drm/i915: Predictive governor to control eu/slice/subslice based on
    workload

 drivers/gpu/drm/i915/i915_debugfs.c        | 94 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drv.c            |  1 +
 drivers/gpu/drm/i915/i915_drv.h            |  6 ++
 drivers/gpu/drm/i915/i915_gem_context.c    | 52 +++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_context.h    | 52 +++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  5 ++
 drivers/gpu/drm/i915/intel_lrc.c           | 47 +++++++++++++++
 7 files changed, 256 insertions(+), 1 deletion(-)

--
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] drm/i915: Get active pending request for given context
  2018-09-21  9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
@ 2018-09-21  9:13 ` kedar.j.karanje
  2018-09-21 12:39   ` Tvrtko Ursulin
  2018-09-25  8:04   ` Jani Nikula
  2018-09-21  9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21  9:13 UTC (permalink / raw)
  To: intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar

From: Praveen Diwakar <praveen.diwakar@intel.com>

This patch gives us the count of active pending requests, i.e. requests
which are queued but not yet submitted to the GPU.

Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c            | 1 +
 drivers/gpu/drm/i915/i915_drv.h            | 5 +++++
 drivers/gpu/drm/i915/i915_gem_context.c    | 1 +
 drivers/gpu/drm/i915/i915_gem_context.h    | 6 ++++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
 drivers/gpu/drm/i915/intel_lrc.c           | 6 ++++++
 6 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f8cfd16..d37c46e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
 	mutex_init(&dev_priv->av_mutex);
 	mutex_init(&dev_priv->wm.wm_mutex);
 	mutex_init(&dev_priv->pps_mutex);
+	mutex_init(&dev_priv->pred_mutex);
 
 	i915_memcpy_init_early(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4aca534..137ec33 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1609,6 +1609,11 @@ struct drm_i915_private {
 	 * controller on different i2c buses. */
 	struct mutex gmbus_mutex;
 
+	/** pred_mutex protects against concurrent usage of the pending
+	 * request counter for multiple contexts
+	 */
+	struct mutex pred_mutex;
+
 	/**
 	 * Base address of the gmbus and gpio block.
 	 */
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b10770c..30932d9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 	}
 
 	trace_i915_context_create(ctx);
+	ctx->req_cnt = 0;
 
 	return ctx;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index b116e49..243ea22 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -194,6 +194,12 @@ struct i915_gem_context {
 	 * context close.
 	 */
 	struct list_head handles_list;
+
+	/** req_cnt: tracks the number of pending commands; based on this we
+	 * decide whether to go for a low/medium/high load configuration of
+	 * the GPU. Access is protected by a mutex.
+	 */
+	u64 req_cnt;
 };
 
 static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3f0c612..f799ff9 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		       struct drm_syncobj **fences)
 {
 	struct i915_execbuffer eb;
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct dma_fence *in_fence = NULL;
 	struct sync_file *out_fence = NULL;
 	int out_fence_fd = -1;
@@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	 */
 	eb.request->batch = eb.batch;
 
+	mutex_lock(&dev_priv->pred_mutex);
+	eb.ctx->req_cnt++;
+	mutex_unlock(&dev_priv->pred_mutex);
+
 	trace_i915_request_queue(eb.request, eb.batch_flags);
 	err = eb_submit(&eb);
 err_request:
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1744792..039fbdb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			trace_i915_request_in(rq, port_index(port, execlists));
 			last = rq;
 			submit = true;
+
+			mutex_lock(&rq->i915->pred_mutex);
+			if (rq->gem_context->req_cnt > 0) {
+				rq->gem_context->req_cnt--;
+			}
+			mutex_unlock(&rq->i915->pred_mutex);
 		}
 
 		rb_erase_cached(&p->node, &execlists->queue);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/4] drm/i915: Update render power clock state configuration for given context
  2018-09-21  9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
  2018-09-21  9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
@ 2018-09-21  9:13 ` kedar.j.karanje
  2018-09-21 12:51   ` Tvrtko Ursulin
  2018-09-22 19:13   ` kbuild test robot
  2018-09-21  9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21  9:13 UTC (permalink / raw)
  To: intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar

From: Praveen Diwakar <praveen.diwakar@intel.com>

This patch updates the power clock state (RPCS) register at runtime, based on
the flag update_render_config, which can be set by any governor that computes
load and wants to update the RPCS register.
Subsequent patches add a timer-based governor which computes the pending
load/request count.
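
The RPCS update amounts to packing the slice/sub-slice/EU counts into a
single register word. A sketch of that packing follows; the bit positions
below are illustrative placeholders, not the real GEN8_RPCS_* layout (which
lives in i915_reg.h):

```c
#include <stdint.h>

/* Illustrative bit positions only -- stand-ins for the GEN8_RPCS_* macros,
 * chosen so the three count fields do not overlap. */
#define RPCS_ENABLE        (1u << 31)
#define RPCS_S_CNT_ENABLE  (1u << 18)
#define RPCS_S_CNT_SHIFT   15
#define RPCS_SS_CNT_ENABLE (1u << 11)
#define RPCS_SS_CNT_SHIFT  8
#define RPCS_EU_MIN_SHIFT  0
#define RPCS_EU_MAX_SHIFT  4

/* Pack slice, sub-slice and EU counts into one RPCS-style word; the EU
 * count is written as both the min and max field, as the driver does. */
static uint32_t pack_rpcs(uint32_t slices, uint32_t subslices, uint32_t eus)
{
	uint32_t rpcs = RPCS_ENABLE;

	rpcs |= RPCS_S_CNT_ENABLE | (slices << RPCS_S_CNT_SHIFT);
	rpcs |= RPCS_SS_CNT_ENABLE | (subslices << RPCS_SS_CNT_SHIFT);
	rpcs |= (eus << RPCS_EU_MIN_SHIFT) | (eus << RPCS_EU_MAX_SHIFT);
	return rpcs;
}
```

The real driver writes the resulting word into the context image via an
MI_LOAD_REGISTER_IMM, as the patch below shows.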

Change-Id: I4e7d2f484b957d5bd496e1decc59a69e3bc6d186
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c |  5 ++++
 drivers/gpu/drm/i915/i915_gem_context.h | 14 +++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        | 41 +++++++++++++++++++++++++++++++++
 3 files changed, 60 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 30932d9..2838c1d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -388,6 +388,11 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 
 	trace_i915_context_create(ctx);
 	ctx->req_cnt = 0;
+	ctx->update_render_config = 0;
+	ctx->slice_cnt = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
+	ctx->subslice_cnt = hweight8(
+			INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
+	ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
 
 	return ctx;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 243ea22..52e341c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -200,6 +200,20 @@ struct i915_gem_context {
 	 * controlled via a mutex
 	 */
 	u64 req_cnt;
+
+	/** slice_cnt: used to set the # of slices to be enabled. */
+	u8 slice_cnt;
+
+	/** subslice_cnt: used to set the # of subslices to be enabled. */
+	u8 subslice_cnt;
+
+	/** eu_cnt: used to set the # of eu to be enabled. */
+	u8 eu_cnt;
+
+	/** update_render_config: to track the updates to the render
+	 * configuration (S/SS/EU Configuration on the GPU)
+	 */
+	bool update_render_config;
 };
 
 static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 039fbdb..d2d0e7d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -364,6 +364,36 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
 }
 
+static u32
+get_context_rpcs_config(struct i915_gem_context *ctx)
+{
+	struct drm_i915_private *dev_priv = ctx->i915;
+	u32 rpcs = 0;
+
+	if (INTEL_GEN(dev_priv) < 8)
+		return 0;
+
+	if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
+		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
+		rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
+		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
+		rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) {
+		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
+		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	return rpcs;
+}
+
 static inline void
 execlists_context_status_change(struct i915_request *rq, unsigned long status)
 {
@@ -418,11 +448,22 @@ execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
 static u64 execlists_update_context(struct i915_request *rq)
 {
 	struct intel_context *ce = rq->hw_context;
+	struct i915_gem_context *ctx = rq->gem_context;
+	struct intel_engine_cs *engine = rq->engine;
 	struct i915_hw_ppgtt *ppgtt =
 		rq->gem_context->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
 	u32 *reg_state = ce->lrc_reg_state;
+	u32 rpcs_config = 0;
 
 	reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
+	if (ctx->pid && ctx->name && (rq->engine->id == RCS) &&
+			ctx->update_render_config) {
+		rpcs_config = get_context_rpcs_config(ctx);
+		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
+		CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
+				rpcs_config);
+		ctx->update_render_config = 0;
+	}
 
 	/* True 32b PPGTT with dynamic page allocation: update PDP
 	 * registers and point the unallocated PDPs to scratch page.
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2018-09-21  9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
  2018-09-21  9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
  2018-09-21  9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
@ 2018-09-21  9:13 ` kedar.j.karanje
  2018-09-21 13:05   ` Tvrtko Ursulin
  2018-09-22 18:09   ` kbuild test robot
  2018-09-21  9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21  9:13 UTC (permalink / raw)
  To: intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar

From: Praveen Diwakar <praveen.diwakar@intel.com>

This patch selects the optimum EU/slice/sub-slice configuration based on the
type of load (low, medium, high) given as input.
Based on our readings and experiments, we have a predefined set of optimum
configurations for each platform (CHT, KBL).
i915_set_optimum_config selects the optimum configuration from the
pre-defined configuration table (opt_config).
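
The selection is a plain two-dimensional table lookup indexed by platform
tier and load type. A sketch (the values mirror the opt_config table in this
patch; the enum and struct names are shortened for the sketch):

```c
#include <assert.h>

enum tier { TIER_CHT, TIER_KBL_GT2, TIER_KBL_GT3, TIER_MAX };
enum load { LOAD_LOW, LOAD_MEDIUM, LOAD_HIGH, LOAD_MAX };

struct cfg { int slice, subslice, eu; };

/* Per-platform optimum configurations, one row per load type,
 * copied from the opt_config table in this patch. */
static const struct cfg opt_config[TIER_MAX][LOAD_MAX] = {
	[TIER_CHT]     = { {1, 1, 4}, {1, 1, 6}, {1, 2, 6} },
	[TIER_KBL_GT2] = { {2, 3, 4}, {2, 3, 6}, {2, 3, 8} },
	[TIER_KBL_GT3] = { {2, 3, 4}, {2, 3, 6}, {2, 3, 8} },
};

static struct cfg pick_config(enum tier t, enum load l)
{
	return opt_config[t][l];
}
```

On KBL only the EU count actually varies across load types; the slice and
sub-slice counts stay fixed, matching the cover letter's note about the GEN9
constraint.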

Change-Id: I3a6a2a6bdddd01b3d3c97995f5403aef3c6fa989
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 46 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_context.h | 32 +++++++++++++++++++++++
 2 files changed, 78 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2838c1d..1b76410 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -94,6 +94,32 @@
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
+static struct optimum_config opt_config[TIER_VERSION_MAX][LOAD_TYPE_MAX] = {
+	{
+		/* Cherry trail low */
+		{ 1, 1, 4},
+		/* Cherry trail medium */
+		{ 1, 1, 6},
+		/* Cherry trail high */
+		{ 1, 2, 6}
+	},
+	{
+		/* kbl gt2 low */
+		{ 2, 3, 4},
+		/* kbl gt2 medium */
+		{ 2, 3, 6},
+		/* kbl gt2 high */
+		{ 2, 3, 8}
+	},
+	{
+		/* kbl gt3 low */
+		{ 2, 3, 4},
+		/* kbl gt3 medium */
+		{ 2, 3, 6},
+		/* kbl gt3 high */
+		{ 2, 3, 8}
+	}
+};
 static void lut_close(struct i915_gem_context *ctx)
 {
 	struct i915_lut_handle *lut, *ln;
@@ -393,10 +419,30 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 	ctx->subslice_cnt = hweight8(
 			INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
 	ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
+	ctx->load_type = 0;
+	ctx->prev_load_type = 0;
 
 	return ctx;
 }
 
+
+void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
+		enum gem_tier_versions version)
+{
+	struct intel_context *ce = &ctx->__engine[RCS];
+	u32 *reg_state = ce->lrc_reg_state;
+	u32 rpcs_config = 0;
+	/* Call opt_config to get correct configuration for eu,slice,subslice */
+	ctx->slice_cnt = (u8)opt_config[version][type].slice;
+	ctx->subslice_cnt = (u8)opt_config[version][type].subslice;
+	ctx->eu_cnt = (u8)opt_config[version][type].eu;
+
+	/* Enabling this to update the rpcs */
+	if (ctx->prev_load_type != type)
+		ctx->update_render_config = 1;
+
+	ctx->prev_load_type = type;
+}
 /**
  * i915_gem_context_create_gvt - create a GVT GEM context
  * @dev: drm device *
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 52e341c..50183e6 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -53,6 +53,26 @@ struct intel_context_ops {
 	void (*destroy)(struct intel_context *ce);
 };
 
+enum gem_load_type {
+	LOAD_TYPE_LOW,
+	LOAD_TYPE_MEDIUM,
+	LOAD_TYPE_HIGH,
+	LOAD_TYPE_MAX
+};
+
+enum gem_tier_versions {
+	CHERRYVIEW = 0,
+	KABYLAKE_GT2,
+	KABYLAKE_GT3,
+	TIER_VERSION_MAX
+};
+
+struct optimum_config {
+	int slice;
+	int subslice;
+	int eu;
+};
+
 /**
  * struct i915_gem_context - client state
  *
@@ -210,6 +230,16 @@ struct i915_gem_context {
 	/** eu_cnt: used to set the # of eu to be enabled. */
 	u8 eu_cnt;
 
+	/** load_type: The designated load_type (high/medium/low) for a given
+	 * number of pending commands in the command queue.
+	 */
+	u8 load_type;
+
+	/** prev_load_type: The earlier load type that the GPU was configured
+	 * for (high/medium/low).
+	 */
+	u8 prev_load_type;
+
 	/** update_render_config: to track the updates to the render
 	 * configuration (S/SS/EU Configuration on the GPU)
 	 */
@@ -342,6 +372,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
+void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
+			     enum gem_tier_versions version);
 
 struct i915_gem_context *
 i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
  2018-09-21  9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
                   ` (2 preceding siblings ...)
  2018-09-21  9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
@ 2018-09-21  9:13 ` kedar.j.karanje
  2018-09-21 13:24   ` Tvrtko Ursulin
  2018-09-22 18:31   ` kbuild test robot
  2018-09-21 11:16 ` ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU Patchwork
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21  9:13 UTC (permalink / raw)
  To: intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar

From: Praveen Diwakar <praveen.diwakar@intel.com>

A high resolution timer is used for this purpose.

Debugfs is provided to enable/disable/update the timer configuration.
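
The debugfs value doubles as both the enable flag and the polling period: a
write of 0 disables the governor, and any other value is the timer period in
milliseconds. A sketch of that convention (names illustrative):

```c
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/* Zero disables the governor; non-zero enables it. */
static int governor_enabled(uint64_t debugfs_val)
{
	return debugfs_val != 0;
}

/* Convert the debugfs value (milliseconds) to the nanosecond period
 * handed to the hrtimer, as the patch's POLL_PERIOD_MS scaling does. */
static uint64_t governor_period_ns(uint64_t debugfs_val)
{
	return debugfs_val * NSEC_PER_MSEC;
}
```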

Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 94 ++++++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drv.h     |  1 +
 2 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f9ce35d..81ba509 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4740,6 +4740,97 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_drrs_status", i915_drrs_status, 0},
 	{"i915_rps_boost_info", i915_rps_boost_info, 0},
 };
+
+#define POLL_PERIOD_MS	(1000 * 1000)
+#define PENDING_REQ_0	0 /* No active request pending */
+#define PENDING_REQ_3	3 /* Threshold of 3 pending active requests:
+			   * above this a context is considered HIGH load,
+			   * equal is MEDIUM load and below is LOW load.
+			   */
+
+static int predictive_load_enable;
+static int predictive_load_timer_init;
+
+static enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(hrtimer, typeof(*dev_priv),
+				pred_timer);
+	enum intel_engine_id id;
+	struct intel_engine_cs *engine;
+	struct i915_gem_context *ctx;
+	u64 req_pending;
+
+	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
+
+		if (!ctx->name)
+			continue;
+
+		mutex_lock(&dev_priv->pred_mutex);
+		req_pending = ctx->req_cnt;
+		mutex_unlock(&dev_priv->pred_mutex);
+
+		if (req_pending == PENDING_REQ_0)
+			continue;
+
+		if (req_pending > PENDING_REQ_3)
+			ctx->load_type = LOAD_TYPE_HIGH;
+		else if (req_pending == PENDING_REQ_3)
+			ctx->load_type = LOAD_TYPE_MEDIUM;
+		else if (req_pending < PENDING_REQ_3)
+			ctx->load_type = LOAD_TYPE_LOW;
+
+		i915_set_optimum_config(ctx->load_type, ctx, KABYLAKE_GT3);
+	}
+
+	hrtimer_forward_now(hrtimer,
+			ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
+
+	return HRTIMER_RESTART;
+}
+
+static int
+i915_predictive_load_get(void *data, u64 *val)
+{
+	struct drm_i915_private *dev_priv = data;
+
+	*val = predictive_load_enable;
+	return 0;
+}
+
+static int
+i915_predictive_load_set(void *data, u64 val)
+{
+	struct drm_i915_private *dev_priv = data;
+	struct intel_device_info *info;
+
+	info = mkwrite_device_info(dev_priv);
+
+	predictive_load_enable = val;
+
+	if (predictive_load_enable) {
+		if (!predictive_load_timer_init) {
+			hrtimer_init(&dev_priv->pred_timer, CLOCK_MONOTONIC,
+					HRTIMER_MODE_REL);
+			dev_priv->pred_timer.function = predictive_load_cb;
+			predictive_load_timer_init = 1;
+		}
+		hrtimer_start(&dev_priv->pred_timer,
+			ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
+			HRTIMER_MODE_REL_PINNED);
+	} else {
+		hrtimer_cancel(&dev_priv->pred_timer);
+	}
+
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
+			i915_predictive_load_get, i915_predictive_load_set,
+			"%llu\n");
+
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
 static const struct i915_debugfs_files {
@@ -4769,7 +4860,8 @@ static const struct i915_debugfs_files {
 	{"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
 	{"i915_ipc_status", &i915_ipc_status_fops},
 	{"i915_drrs_ctl", &i915_drrs_ctl_fops},
-	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
+	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
+	{"i915_predictive_load_ctl", &i915_predictive_load_ctl}
 };
 
 int i915_debugfs_register(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 137ec33..0505c47 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
 };
 
 struct drm_i915_private {
+	struct hrtimer pred_timer;
 	struct drm_device drm;
 
 	struct kmem_cache *objects;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU.
  2018-09-21  9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
                   ` (3 preceding siblings ...)
  2018-09-21  9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
@ 2018-09-21 11:16 ` Patchwork
  2018-09-21 12:31 ` [PATCH 0/4][RFC] " Tvrtko Ursulin
  2018-09-27 14:02 ` Joonas Lahtinen
  6 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2018-09-21 11:16 UTC (permalink / raw)
  To: kedar.j.karanje; +Cc: intel-gfx

== Series Details ==

Series: Dynamic EU configuration of Slice/Subslice/EU.
URL   : https://patchwork.freedesktop.org/series/50006/
State : failure

== Summary ==

CALL    scripts/checksyscalls.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/i915_debugfs.o
drivers/gpu/drm/i915/i915_debugfs.c: In function ‘predictive_load_cb’:
drivers/gpu/drm/i915/i915_debugfs.c:4813:26: error: unused variable ‘engine’ [-Werror=unused-variable]
  struct intel_engine_cs *engine;
                          ^~~~~~
drivers/gpu/drm/i915/i915_debugfs.c:4812:23: error: unused variable ‘id’ [-Werror=unused-variable]
  enum intel_engine_id id;
                       ^~
drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_predictive_load_get’:
drivers/gpu/drm/i915/i915_debugfs.c:4848:27: error: unused variable ‘dev_priv’ [-Werror=unused-variable]
  struct drm_i915_private *dev_priv = data;
                           ^~~~~~~~
cc1: all warnings being treated as errors
scripts/Makefile.build:305: recipe for target 'drivers/gpu/drm/i915/i915_debugfs.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_debugfs.o] Error 1
scripts/Makefile.build:546: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:546: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:546: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1062: recipe for target 'drivers' failed
make: *** [drivers] Error 2


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU.
  2018-09-21  9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
                   ` (4 preceding siblings ...)
  2018-09-21 11:16 ` ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU Patchwork
@ 2018-09-21 12:31 ` Tvrtko Ursulin
  2018-09-27 14:02 ` Joonas Lahtinen
  6 siblings, 0 replies; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 12:31 UTC (permalink / raw)
  To: kedar.j.karanje, intel-gfx


Hi,

On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: "Kedar J. Karanje" <kedar.j.karanje@intel.com>
> 
> drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
> 
> The current GPU configuration code for i915 does not allow us to change the
> EU/Slice/Sub-slice configuration dynamically; it is done only once, while the
> context is created.
> 
> While a particular graphics application is running, if we examine the command
> requests coming from user space, we observe that the command density is not
> consistent. This means there is scope to change the graphics configuration
> dynamically even while a context is actively running. This patch series
> proposes a solution: determine the pending load for all active contexts at a
> given time and, based on that, dynamically reconfigure the graphics hardware
> for each context.
> 
> We use an hr (high resolution) timer in the i915 kernel driver to get a
> callback every few milliseconds (the timer value can be configured through
> debugfs; the default is '0', indicating the timer is disabled, i.e. the
> original system without any intervention). In the timer callback, we examine
> the pending commands for a context in the queue; essentially, we intercept
> them before they are executed by the GPU and update the context with the
> required number of EUs.
> 
> Two questions arise: how did we arrive at the right timer value, and what is
> the right number of EUs? For the former, we relied on empirical data for the
> best performance at the least power. For the latter, we roughly categorized
> the number of EUs logically based on the platform. We then compare the number
> of pending commands against a particular threshold and set the number of EUs
> accordingly via an updated context. That threshold is also based on
> experiments and findings. If the GPU is able to keep up with the CPU, there
> are typically no pending commands and the EU configuration remains unchanged.
> If there are more pending commands, we reprogram the context with a higher
> number of EUs. Note that we are changing EUs even while the context is
> running, by examining pending commands every 'x' milliseconds.
> 
> With this solution in place, on KBL-GT3 + Android we saw the following power
> benefits without any performance degradation. The percentages below are
> relative to total system power.
> 
> App / KPI              | % Power Benefit |
> -----------------------|-----------------|
> 3D Mark (Ice Storm)    | 2.30%           |
> TRex On screen         | 2.49%           |
> TRex Off screen        | 1.32%           |
> Manhattan On screen    | 3.11%           |
> Manhattan Off screen   | 0.89%           |
> AnTuTu 6.1.4           | 3.42%           |

Good numbers! It is hard to argue against them. :) Even though obviously 
the heuristics like the one you implemented can be easily fooled by 
different workloads... Since you make no distinction between different 
engines and EUs vs fixed functions, external fences vs GPU 
over-subscription and similar.

But then again you have a control knob and it is off by default. So if 
it genuinely always only helps typical use cases, and people can stomach 
a control knob, perhaps we can have it.

In any case I think you would need to test against a lot more benchmarks 
to pass the threshold of whether upstream could consider this. And 
definitely not only under Android. It is just that these days I don't 
know with whom to put you in touch in order to recommend a list of 
benchmarks, or even provide some automated system to run them. Anybody?

Regards,

Tvrtko

> 
> Note - For KBL (GEN9) we cannot control at sub-slice level, it was always  a
> constraint.
> We always controlled number of EUs rather than sub-slices/slices.
> 
> 
> Praveen Diwakar (4):
>    drm/i915: Get active pending request for given context
>    drm/i915: Update render power clock state configuration for given
>      context
>    drm/i915: set optimum eu/slice/sub-slice configuration based on load
>      type
>    drm/i915: Predictive governor to control eu/slice/subslice based on
>      workload
> 
>   drivers/gpu/drm/i915/i915_debugfs.c        | 94 +++++++++++++++++++++++++++++-
>   drivers/gpu/drm/i915/i915_drv.c            |  1 +
>   drivers/gpu/drm/i915/i915_drv.h            |  6 ++
>   drivers/gpu/drm/i915/i915_gem_context.c    | 52 +++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_context.h    | 52 +++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  5 ++
>   drivers/gpu/drm/i915/intel_lrc.c           | 47 +++++++++++++++
>   7 files changed, 256 insertions(+), 1 deletion(-)
> 
> --
> 2.7.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 


* Re: [PATCH 1/4] drm/i915: Get active pending request for given context
  2018-09-21  9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
@ 2018-09-21 12:39   ` Tvrtko Ursulin
  2018-11-06  4:18     ` Navik, Ankit P
  2018-09-25  8:04   ` Jani Nikula
  1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 12:39 UTC (permalink / raw)
  To: kedar.j.karanje, intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar


On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
> 
> This patch gives us the active pending request count which is yet
> to be submitted to the GPU
> 
> Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.c            | 1 +
>   drivers/gpu/drm/i915/i915_drv.h            | 5 +++++
>   drivers/gpu/drm/i915/i915_gem_context.c    | 1 +
>   drivers/gpu/drm/i915/i915_gem_context.h    | 6 ++++++
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
>   drivers/gpu/drm/i915/intel_lrc.c           | 6 ++++++
>   6 files changed, 24 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index f8cfd16..d37c46e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
>   	mutex_init(&dev_priv->av_mutex);
>   	mutex_init(&dev_priv->wm.wm_mutex);
>   	mutex_init(&dev_priv->pps_mutex);
> +	mutex_init(&dev_priv->pred_mutex);
>   
>   	i915_memcpy_init_early(dev_priv);
>   
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4aca534..137ec33 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1609,6 +1609,11 @@ struct drm_i915_private {
>   	 * controller on different i2c buses. */
>   	struct mutex gmbus_mutex;
>   
> +	/** pred_mutex protects against councurrent usage of pending
> +	 * request counter for multiple contexts
> +	 */
> +	struct mutex pred_mutex;
> +
>   	/**
>   	 * Base address of the gmbus and gpio block.
>   	 */
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index b10770c..30932d9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>   	}
>   
>   	trace_i915_context_create(ctx);
> +	ctx->req_cnt = 0;
>   
>   	return ctx;
>   }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index b116e49..243ea22 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -194,6 +194,12 @@ struct i915_gem_context {
>   	 * context close.
>   	 */
>   	struct list_head handles_list;
> +
> +	/** req_cnt: tracks the pending commands, based on which we decide to
> +	 * go for low/medium/high load configuration of the GPU, this is
> +	 * controlled via a mutex
> +	 */
> +	u64 req_cnt;

64-bit is too wide and mutex causes you problems later in the series. 
I'd suggest atomic_t.
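To illustrate, a minimal userspace sketch of the suggested atomic counter pattern, with C11 stdatomic standing in for the kernel's atomic_t (in i915 itself this would be atomic_inc()/atomic_dec_if_positive() on ctx->req_cnt; the names below are illustrative, not the actual i915 structures):

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for i915_gem_context: the mutex-protected u64 req_cnt is
 * replaced by an atomic counter, so neither the execbuffer path nor
 * execlists_dequeue() needs to take a sleeping lock. */
struct ctx {
	atomic_int req_cnt;
};

/* Submission path: unconditional increment. */
static void ctx_queue_request(struct ctx *c)
{
	atomic_fetch_add(&c->req_cnt, 1);
}

/* Dequeue path: decrement, but never below zero (the kernel has
 * atomic_dec_if_positive() for exactly this). */
static void ctx_retire_request(struct ctx *c)
{
	int old = atomic_load(&c->req_cnt);

	while (old > 0 &&
	       !atomic_compare_exchange_weak(&c->req_cnt, &old, old - 1))
		; /* a failed CAS refreshed 'old'; retry */
}
```

This is underflow-proof and the counter can be read from the hrtimer callback with a plain atomic_load(), with no sleeping lock involved.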

>   };
>   
>   static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 3f0c612..f799ff9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   		       struct drm_syncobj **fences)
>   {
>   	struct i915_execbuffer eb;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
>   	struct dma_fence *in_fence = NULL;
>   	struct sync_file *out_fence = NULL;
>   	int out_fence_fd = -1;
> @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>   	 */
>   	eb.request->batch = eb.batch;
>   
> +	mutex_lock(&dev_priv->pred_mutex);
> +	eb.ctx->req_cnt++;
> +	mutex_unlock(&dev_priv->pred_mutex);
> +
>   	trace_i915_request_queue(eb.request, eb.batch_flags);
>   	err = eb_submit(&eb);
>   err_request:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 1744792..039fbdb 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>   			trace_i915_request_in(rq, port_index(port, execlists));
>   			last = rq;
>   			submit = true;
> +
> +			mutex_lock(&rq->i915->pred_mutex);
> +			if (rq->gem_context->req_cnt > 0) {

Presumably you hit underflow here and then added the protection. This 
was a hint the accounting does not work as expected. Preemption for 
instance would wreak havoc with it. :)

Have a look at my per engine queued/runnable counters for how to do it 
correctly, only that you would need to make it per context and merge 
into one aggregated queued+runnable instead of two separate counters, if 
you wanted to preserve your heuristics.

https://patchwork.freedesktop.org/patch/227981/
https://patchwork.freedesktop.org/patch/227976/

Regards,

Tvrtko

> +				rq->gem_context->req_cnt--;
> +			}
> +			mutex_unlock(&rq->i915->pred_mutex);
>   		}
>   
>   		rb_erase_cached(&p->node, &execlists->queue);
> 


* Re: [PATCH 2/4] drm/i915: Update render power clock state configuration for given context
  2018-09-21  9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
@ 2018-09-21 12:51   ` Tvrtko Ursulin
  2018-11-06  4:23     ` Navik, Ankit P
  2018-09-22 19:13   ` kbuild test robot
  1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 12:51 UTC (permalink / raw)
  To: kedar.j.karanje, intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar


On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
> 
> This patch will update power clock state register at runtime base on the
> flag update_render_config which can set by any governor which computes load
> and want to update rpcs register.
> subsequent patches will have a timer based governor which computes pending
> load/request.
> 
> Change-Id: I4e7d2f484b957d5bd496e1decc59a69e3bc6d186
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c |  5 ++++
>   drivers/gpu/drm/i915/i915_gem_context.h | 14 +++++++++++
>   drivers/gpu/drm/i915/intel_lrc.c        | 41 +++++++++++++++++++++++++++++++++
>   3 files changed, 60 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 30932d9..2838c1d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -388,6 +388,11 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>   
>   	trace_i915_context_create(ctx);
>   	ctx->req_cnt = 0;
> +	ctx->update_render_config = 0;
> +	ctx->slice_cnt = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
> +	ctx->subslice_cnt = hweight8(
> +			INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
> +	ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
>   
>   	return ctx;
>   }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index 243ea22..52e341c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -200,6 +200,20 @@ struct i915_gem_context {
>   	 * controlled via a mutex
>   	 */
>   	u64 req_cnt;
> +
> +	/** slice_cnt: used to set the # of slices to be enabled. */
> +	u8 slice_cnt;
> +
> +	/** subslice_cnt: used to set the # of subslices to be enabled. */
> +	u8 subslice_cnt;
> +
> +	/** eu_cnt: used to set the # of eu to be enabled. */
> +	u8 eu_cnt;
> +
> +	/** update_render_config: to track the updates to the render
> +	 * configuration (S/SS/EU Configuration on the GPU)
> +	 */
> +	bool update_render_config;
>   };
>   
>   static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 039fbdb..d2d0e7d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -364,6 +364,36 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
>   	spin_unlock_irqrestore(&engine->timeline.lock, flags);
>   }
>   
> +static u32
> +get_context_rpcs_config(struct i915_gem_context *ctx)
> +{
> +	struct drm_i915_private *dev_priv = ctx->i915;
> +	u32 rpcs = 0;
> +
> +	if (INTEL_GEN(dev_priv) < 8)
> +		return 0;
> +
> +	if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
> +		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
> +		rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
> +		rpcs |= GEN8_RPCS_ENABLE;
> +	}
> +
> +	if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
> +		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
> +		rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
> +		rpcs |= GEN8_RPCS_ENABLE;
> +	}
> +
> +	if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) {
> +		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
> +		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
> +		rpcs |= GEN8_RPCS_ENABLE;
> +	}
> +
> +	return rpcs;
> +}

This function is very similar to make_rpcs so I'd suggest to extract 
some commonality and share the code.

In any case you are likely to get overtaken by 
https://patchwork.freedesktop.org/series/48194/ after which you will be 
able to use struct intel_sseu to package the data you need in one ctx 
member, etc.
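As a sketch of the commonality extraction, in userspace form: the real bit positions are the GEN8_RPCS_* definitions in i915_reg.h, so the values below are placeholders and only the shared-helper shape matters. Both make_rpcs() and the dynamic path could call one function that takes the counts as parameters instead of reading them from two different places:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions; the real ones are GEN8_RPCS_* in
 * i915_reg.h. */
#define RPCS_ENABLE		(1u << 31)
#define RPCS_S_CNT_ENABLE	(1u << 18)
#define RPCS_S_CNT_SHIFT	15
#define RPCS_SS_CNT_ENABLE	(1u << 11)
#define RPCS_SS_CNT_SHIFT	8
#define RPCS_EU_MAX_SHIFT	4
#define RPCS_EU_MIN_SHIFT	0

/* One helper shared by the static and the dynamic RPCS paths. */
static uint32_t make_rpcs_config(bool has_slice_pg, bool has_subslice_pg,
				 bool has_eu_pg, uint8_t slices,
				 uint8_t subslices, uint8_t eus)
{
	uint32_t rpcs = 0;

	if (has_slice_pg)
		rpcs |= RPCS_ENABLE | RPCS_S_CNT_ENABLE |
			((uint32_t)slices << RPCS_S_CNT_SHIFT);

	if (has_subslice_pg)
		rpcs |= RPCS_ENABLE | RPCS_SS_CNT_ENABLE |
			((uint32_t)subslices << RPCS_SS_CNT_SHIFT);

	if (has_eu_pg)
		rpcs |= RPCS_ENABLE |
			((uint32_t)eus << RPCS_EU_MIN_SHIFT) |
			((uint32_t)eus << RPCS_EU_MAX_SHIFT);

	return rpcs;
}
```

The callers then differ only in where the counts come from (device sseu info vs the per-context fields), not in how the register value is built.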

> +
>   static inline void
>   execlists_context_status_change(struct i915_request *rq, unsigned long status)
>   {
> @@ -418,11 +448,22 @@ execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
>   static u64 execlists_update_context(struct i915_request *rq)
>   {
>   	struct intel_context *ce = rq->hw_context;
> +	struct i915_gem_context *ctx = rq->gem_context;
> +	struct intel_engine_cs *engine = rq->engine;
>   	struct i915_hw_ppgtt *ppgtt =
>   		rq->gem_context->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
>   	u32 *reg_state = ce->lrc_reg_state;
> +	u32 rpcs_config = 0;
>   
>   	reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
> +	if (ctx->pid && ctx->name && (rq->engine->id == RCS) &&
> +			ctx->update_render_config) {
> +		rpcs_config = get_context_rpcs_config(ctx);
> +		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> +		CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
> +				rpcs_config);

For a first submission this works, but when appending to a running 
context the RPCS value you write here will probably get overwritten by 
context save. At least I would be surprised if lite-restore would write 
to the register.

So if I am correct the problem you have is that your configuration 
changes will not always become live when you expect it to. And then if 
you hit lite-restore you don't try to re-configure it later unless there 
is a config change.

But I don't think there is a cheap/reasonable way to fix this fully. So 
maybe you just give up on this chunk and rely on context image to be 
updated on context pin like in 
https://patchwork.freedesktop.org/patch/250007/.

Regards,

Tvrtko

> +		ctx->update_render_config = 0;
> +	}
>   
>   	/* True 32b PPGTT with dynamic page allocation: update PDP
>   	 * registers and point the unallocated PDPs to scratch page.
> 


* Re: [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2018-09-21  9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
@ 2018-09-21 13:05   ` Tvrtko Ursulin
  2018-11-06  4:25     ` Navik, Ankit P
  2018-09-22 18:09   ` kbuild test robot
  1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 13:05 UTC (permalink / raw)
  To: kedar.j.karanje, intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar


On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
> 
> This patch will select optimum eu/slice/sub-slice configuration based on type
> of load (low, medium, high) as input.
> Based on our readings and experiments we have predefined set of optimum
> configuration for each platform(CHT, KBL).
> i915_set_optimum_config will select optimum configuration from pre-defined
> optimum configuration table(opt_config).
> 
> Change-Id: I3a6a2a6bdddd01b3d3c97995f5403aef3c6fa989
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_context.c | 46 +++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/i915_gem_context.h | 32 +++++++++++++++++++++++
>   2 files changed, 78 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 2838c1d..1b76410 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -94,6 +94,32 @@
>   
>   #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
>   
> +static struct optimum_config opt_config[TIER_VERSION_MAX][LOAD_TYPE_MAX] = {
> +	{
> +		/* Cherry trail low */
> +		{ 1, 1, 4},
> +		/* Cherry trail medium */
> +		{ 1, 1, 6},
> +		/* Cherry trail high */
> +		{ 1, 2, 6}
> +	},
> +	{
> +		/* kbl gt2 low */
> +		{ 2, 3, 4},
> +		/* kbl gt2 medium */
> +		{ 2, 3, 6},
> +		/* kbl gt2 high */
> +		{ 2, 3, 8}
> +	},
> +	{
> +		/* kbl gt3 low */
> +		{ 2, 3, 4},
> +		/* kbl gt3 medium */
> +		{ 2, 3, 6},
> +		/* kbl gt3 high */
> +		{ 2, 3, 8}
> +	}
> +};
>   static void lut_close(struct i915_gem_context *ctx)
>   {
>   	struct i915_lut_handle *lut, *ln;
> @@ -393,10 +419,30 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>   	ctx->subslice_cnt = hweight8(
>   			INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
>   	ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
> +	ctx->load_type = 0;
> +	ctx->prev_load_type = 0;
>   
>   	return ctx;
>   }
>   
> +
> +void i915_set_optimum_config(int type, struct i915_gem_context *ctx,

Type for type should be enum.

Name the function as i915_gem_context_set_load_type(ctx, type).

> +		enum gem_tier_versions version)
> +{
> +	struct intel_context *ce = &ctx->__engine[RCS];
> +	u32 *reg_state = ce->lrc_reg_state;
> +	u32 rpcs_config = 0;

Unused locals?

> +	/* Call opt_config to get correct configuration for eu,slice,subslice */
> +	ctx->slice_cnt = (u8)opt_config[version][type].slice;
> +	ctx->subslice_cnt = (u8)opt_config[version][type].subslice;
> +	ctx->eu_cnt = (u8)opt_config[version][type].eu;

You could store the correct table for the device in dev_priv and use the 
static table to assign/populate on device init time.
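Something along these lines, as a userspace sketch (structure names are illustrative, not the real i915 ones): pick the platform's row set once at device init and cache a pointer in dev_priv, so the per-context path never indexes the static table by platform again.

```c
#include <assert.h>

enum load_type { LOAD_LOW, LOAD_MEDIUM, LOAD_HIGH, LOAD_MAX };
enum tier { TIER_CHV, TIER_KBL_GT2, TIER_KBL_GT3, TIER_MAX };

struct optimum_config {
	unsigned char slice, subslice, eu;
};

/* The static per-platform table from the patch, kept file-local. */
static const struct optimum_config opt_config[TIER_MAX][LOAD_MAX] = {
	{ {1, 1, 4}, {1, 1, 6}, {1, 2, 6} },	/* Cherry Trail */
	{ {2, 3, 4}, {2, 3, 6}, {2, 3, 8} },	/* KBL GT2 */
	{ {2, 3, 4}, {2, 3, 6}, {2, 3, 8} },	/* KBL GT3 */
};

/* Stand-in for drm_i915_private. */
struct fake_dev_priv {
	const struct optimum_config *cfg;
};

/* Done once at device init time. */
static void device_init(struct fake_dev_priv *i915, enum tier t)
{
	i915->cfg = opt_config[t];
}

/* Hot path: a single index, no platform check. */
static const struct optimum_config *
lookup_config(const struct fake_dev_priv *i915, enum load_type lt)
{
	return &i915->cfg[lt];
}
```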

> +
> +	/* Enabling this to update the rpcs */
> +	if (ctx->prev_load_type != type)
> +		ctx->update_render_config = 1;

ctx->update_render_config was unused in last patch. So patch ordering is 
a bit suboptimal but I don't know.

> +
> +	ctx->prev_load_type = type;
> +}
>   /**
>    * i915_gem_context_create_gvt - create a GVT GEM context
>    * @dev: drm device *
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index 52e341c..50183e6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -53,6 +53,26 @@ struct intel_context_ops {
>   	void (*destroy)(struct intel_context *ce);
>   };
>   
> +enum gem_load_type {
> +	LOAD_TYPE_LOW,
> +	LOAD_TYPE_MEDIUM,
> +	LOAD_TYPE_HIGH,
> +	LOAD_TYPE_MAX
> +};
> +
> +enum gem_tier_versions {
> +	CHERRYVIEW = 0,
> +	KABYLAKE_GT2,
> +	KABYLAKE_GT3,
> +	TIER_VERSION_MAX
> +};
> +
> +struct optimum_config {
> +	int slice;
> +	int subslice;
> +	int eu;

u8 would do global table definitely.

> +};
> +
>   /**
>    * struct i915_gem_context - client state
>    *
> @@ -210,6 +230,16 @@ struct i915_gem_context {
>   	/** eu_cnt: used to set the # of eu to be enabled. */
>   	u8 eu_cnt;
>   
> +	/** load_type: The designated load_type (high/medium/low) for a given
> +	 * number of pending commands in the command queue.
> +	 */
> +	u8 load_type;

Unused?

> +
> +	/** prev_load_type: The earlier load type that the GPU was configured
> +	 * for (high/medium/low).
> +	 */
> +	u8 prev_load_type;

Would pending_load_type sound more obvious?

Types should be enums for both.

Regards,

Tvrtko

> +
>   	/** update_render_config: to track the updates to the render
>   	 * configuration (S/SS/EU Configuration on the GPU)
>   	 */
> @@ -342,6 +372,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
>   				    struct drm_file *file_priv);
>   int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
>   				       struct drm_file *file);
> +void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
> +			     enum gem_tier_versions version);
>   
>   struct i915_gem_context *
>   i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
> 


* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
  2018-09-21  9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
@ 2018-09-21 13:24   ` Tvrtko Ursulin
  2018-09-25  6:12     ` Navik, Ankit P
  2018-09-22 18:31   ` kbuild test robot
  1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 13:24 UTC (permalink / raw)
  To: kedar.j.karanje, intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar


On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
> 
> High resoluton timer is used for this purpose.
> 
> Debugfs is provided to enable/disable/update timer configuration
> 
> Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c | 94 ++++++++++++++++++++++++++++++++++++-
>   drivers/gpu/drm/i915/i915_drv.h     |  1 +
>   2 files changed, 94 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index f9ce35d..81ba509 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4740,6 +4740,97 @@ static const struct drm_info_list i915_debugfs_list[] = {
>   	{"i915_drrs_status", i915_drrs_status, 0},
>   	{"i915_rps_boost_info", i915_rps_boost_info, 0},
>   };
> +
> +#define POLL_PERIOD_MS	(1000 * 1000)
> +#define PENDING_REQ_0	0 /* No active request pending*/
> +#define PENDING_REQ_3	3 /* Threshold value of 3 active request pending*/
> +			  /* Anything above this is considered as HIGH load
> +			   * context
> +			   */
> +			  /* And less is considered as LOW load*/
> +			  /* And equal is considered as mediaum load */

Wonky comments and some typos up to here.

> +
> +static int predictive_load_enable;
> +static int predictive_load_timer_init;
> +
> +static enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer)
> +{
> +	struct drm_i915_private *dev_priv =
> +		container_of(hrtimer, typeof(*dev_priv),
> +				pred_timer);
> +	enum intel_engine_id id;
> +	struct intel_engine_cs *engine;

Some unused's.

> +	struct i915_gem_context *ctx;
> +	u64 req_pending;

unsigned long, and also please try to order the declarations so the 
right edge of the text moves in one direction only.

> +
> +	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> +
> +		if (!ctx->name)
> +			continue;

What is this?

> +
> +		mutex_lock(&dev_priv->pred_mutex);

Here the mutex bites you since you cannot sleep in the timer callback. 
atomic_t would solve it. Or a native unsigned int/long would, since no 
lock is needed to read a native word on x86.

> +		req_pending = ctx->req_cnt;
> +		mutex_unlock(&dev_priv->pred_mutex);
> +
> +		if (req_pending == PENDING_REQ_0)
> +			continue;
> +
> +		if (req_pending > PENDING_REQ_3)
> +			ctx->load_type = LOAD_TYPE_HIGH;
> +		else if (req_pending == PENDING_REQ_3)
> +			ctx->load_type = LOAD_TYPE_MEDIUM;
> +		else if (req_pending < PENDING_REQ_3)

Must be smaller if not greater or equal, but maybe the compiler does 
that for you.
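I.e. the final comparison is redundant; as a compilable sketch with the classification pulled into a pure helper (userspace, names illustrative — the real caller would still `continue` on idle contexts before classifying):

```c
#include <assert.h>

#define PENDING_REQ_THRESHOLD	3

enum load_type { LOAD_TYPE_LOW, LOAD_TYPE_MEDIUM, LOAD_TYPE_HIGH };

/* If the count is neither greater than nor equal to the threshold it
 * can only be smaller, so the last branch is a plain else. */
static enum load_type classify_load(unsigned long pending)
{
	if (pending > PENDING_REQ_THRESHOLD)
		return LOAD_TYPE_HIGH;
	else if (pending == PENDING_REQ_THRESHOLD)
		return LOAD_TYPE_MEDIUM;
	else
		return LOAD_TYPE_LOW;
}
```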

> +			ctx->load_type = LOAD_TYPE_LOW;
> +
> +		i915_set_optimum_config(ctx->load_type, ctx, KABYLAKE_GT3);

Only KBL? Idea to put the table in dev_priv FTW! :)

ctx->load_type used only as a temporary uncovered here? :)

> +	}
> +
> +	hrtimer_forward_now(hrtimer,
> +			ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));

Or HRTIMER_NORESTART if disabled? Hard to call it, details..

> +
> +	return HRTIMER_RESTART;
> +}
> +
> +static int
> +i915_predictive_load_get(void *data, u64 *val)
> +{
> +	struct drm_i915_private *dev_priv = data;
> +
> +	*val = predictive_load_enable;
> +	return 0;
> +}
> +
> +static int
> +i915_predictive_load_set(void *data, u64 val)
> +{
> +	struct drm_i915_private *dev_priv = data;
> +	struct intel_device_info *info;
> +
> +	info = mkwrite_device_info(dev_priv);

Unused, why?

> +
> +	predictive_load_enable = val;
> +
> +	if (predictive_load_enable) {
> +		if (!predictive_load_timer_init) {
> +			hrtimer_init(&dev_priv->pred_timer, CLOCK_MONOTONIC,
> +					HRTIMER_MODE_REL);
> +			dev_priv->pred_timer.function = predictive_load_cb;
> +			predictive_load_timer_init = 1;

Move timer init to dev_priv setup.

> +		}
> +		hrtimer_start(&dev_priv->pred_timer,
> +			ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
> +			HRTIMER_MODE_REL_PINNED);

Two threads can race to here.

Also you can give the timer some slack, since precision is not critical 
for you, and save some CPU power.

> +	} else {
> +		hrtimer_cancel(&dev_priv->pred_timer);
> +	}
> +
> +	return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
> +			i915_predictive_load_get, i915_predictive_load_set,
> +			"%llu\n");
> +
>   #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
>   
>   static const struct i915_debugfs_files {
> @@ -4769,7 +4860,8 @@ static const struct i915_debugfs_files {
>   	{"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
>   	{"i915_ipc_status", &i915_ipc_status_fops},
>   	{"i915_drrs_ctl", &i915_drrs_ctl_fops},
> -	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
> +	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
> +	{"i915_predictive_load_ctl", &i915_predictive_load_ctl}

And if the feature is to become real one day, given its nature, the 
knob would probably have to go to sysfs, not debugfs.

>   };
>   
>   int i915_debugfs_register(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 137ec33..0505c47 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
>   };
>   
>   struct drm_i915_private {
> +	struct hrtimer pred_timer;

Surely not the first member! :)

Regards,

Tvrtko

>   	struct drm_device drm;
>   
>   	struct kmem_cache *objects;
> 


* Re: [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2018-09-21  9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
  2018-09-21 13:05   ` Tvrtko Ursulin
@ 2018-09-22 18:09   ` kbuild test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2018-09-22 18:09 UTC (permalink / raw)
  To: kedar.j.karanje
  Cc: Aravindan Muthukumar, intel-gfx, Yogesh Marathe, kbuild-all,
	Ankit Navik, Praveen Diwakar


Hi Praveen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.19-rc4 next-20180921]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/kedar-j-karanje-intel-com/drm-i915-Get-active-pending-request-for-given-context/20180923-012250
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x014-201838 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_gem_context.c: In function 'i915_set_optimum_config':
>> drivers/gpu/drm/i915/i915_gem_context.c:434:6: error: unused variable 'rpcs_config' [-Werror=unused-variable]
     u32 rpcs_config = 0;
         ^~~~~~~~~~~
>> drivers/gpu/drm/i915/i915_gem_context.c:433:7: error: unused variable 'reg_state' [-Werror=unused-variable]
     u32 *reg_state = ce->lrc_reg_state;
          ^~~~~~~~~
   cc1: all warnings being treated as errors

vim +/rpcs_config +434 drivers/gpu/drm/i915/i915_gem_context.c

   427	
   428	
   429	void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
   430			enum gem_tier_versions version)
   431	{
   432		struct intel_context *ce = &ctx->__engine[RCS];
 > 433		u32 *reg_state = ce->lrc_reg_state;
 > 434		u32 rpcs_config = 0;
   435		/* Call opt_config to get correct configuration for eu,slice,subslice */
   436		ctx->slice_cnt = (u8)opt_config[version][type].slice;
   437		ctx->subslice_cnt = (u8)opt_config[version][type].subslice;
   438		ctx->eu_cnt = (u8)opt_config[version][type].eu;
   439	
   440		/* Enabling this to update the rpcs */
   441		if (ctx->prev_load_type != type)
   442			ctx->update_render_config = 1;
   443	
   444		ctx->prev_load_type = type;
   445	}
   446	/**
   447	 * i915_gem_context_create_gvt - create a GVT GEM context
   448	 * @dev: drm device *
   449	 *
   450	 * This function is used to create a GVT specific GEM context.
   451	 *
   452	 * Returns:
   453	 * pointer to i915_gem_context on success, error pointer if failed
   454	 *
   455	 */
   456	struct i915_gem_context *
   457	i915_gem_context_create_gvt(struct drm_device *dev)
   458	{
   459		struct i915_gem_context *ctx;
   460		int ret;
   461	
   462		if (!IS_ENABLED(CONFIG_DRM_I915_GVT))
   463			return ERR_PTR(-ENODEV);
   464	
   465		ret = i915_mutex_lock_interruptible(dev);
   466		if (ret)
   467			return ERR_PTR(ret);
   468	
   469		ctx = __create_hw_context(to_i915(dev), NULL);
   470		if (IS_ERR(ctx))
   471			goto out;
   472	
   473		ctx->file_priv = ERR_PTR(-EBADF);
   474		i915_gem_context_set_closed(ctx); /* not user accessible */
   475		i915_gem_context_clear_bannable(ctx);
   476		i915_gem_context_set_force_single_submission(ctx);
   477		if (!USES_GUC_SUBMISSION(to_i915(dev)))
   478			ctx->ring_size = 512 * PAGE_SIZE; /* Max ring buffer size */
   479	
   480		GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
   481	out:
   482		mutex_unlock(&dev->struct_mutex);
   483		return ctx;
   484	}
   485	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation




* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
  2018-09-21  9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
  2018-09-21 13:24   ` Tvrtko Ursulin
@ 2018-09-22 18:31   ` kbuild test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2018-09-22 18:31 UTC (permalink / raw)
  To: kedar.j.karanje
  Cc: Aravindan Muthukumar, intel-gfx, Yogesh Marathe, kbuild-all,
	Ankit Navik, Praveen Diwakar


Hi Praveen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.19-rc4 next-20180921]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/kedar-j-karanje-intel-com/drm-i915-Get-active-pending-request-for-given-context/20180923-012250
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x014-201838 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   drivers/gpu//drm/i915/i915_debugfs.c: In function 'predictive_load_cb':
>> drivers/gpu//drm/i915/i915_debugfs.c:4762:26: error: unused variable 'engine' [-Werror=unused-variable]
     struct intel_engine_cs *engine;
                             ^~~~~~
>> drivers/gpu//drm/i915/i915_debugfs.c:4761:23: error: unused variable 'id' [-Werror=unused-variable]
     enum intel_engine_id id;
                          ^~
   drivers/gpu//drm/i915/i915_debugfs.c: In function 'i915_predictive_load_get':
>> drivers/gpu//drm/i915/i915_debugfs.c:4797:27: error: unused variable 'dev_priv' [-Werror=unused-variable]
     struct drm_i915_private *dev_priv = data;
                              ^~~~~~~~
   cc1: all warnings being treated as errors

vim +/dev_priv +4797 drivers/gpu//drm/i915/i915_debugfs.c

  4755	
  4756	static enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer)
  4757	{
  4758		struct drm_i915_private *dev_priv =
  4759			container_of(hrtimer, typeof(*dev_priv),
  4760					pred_timer);
> 4761		enum intel_engine_id id;
> 4762		struct intel_engine_cs *engine;
  4763		struct i915_gem_context *ctx;
  4764		u64 req_pending;
  4765	
  4766		list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
  4767	
  4768			if (!ctx->name)
  4769				continue;
  4770	
  4771			mutex_lock(&dev_priv->pred_mutex);
  4772			req_pending = ctx->req_cnt;
  4773			mutex_unlock(&dev_priv->pred_mutex);
  4774	
  4775			if (req_pending == PENDING_REQ_0)
  4776				continue;
  4777	
  4778			if (req_pending > PENDING_REQ_3)
  4779				ctx->load_type = LOAD_TYPE_HIGH;
  4780			else if (req_pending == PENDING_REQ_3)
  4781				ctx->load_type = LOAD_TYPE_MEDIUM;
  4782			else if (req_pending < PENDING_REQ_3)
  4783				ctx->load_type = LOAD_TYPE_LOW;
  4784	
  4785			i915_set_optimum_config(ctx->load_type, ctx, KABYLAKE_GT3);
  4786		}
  4787	
  4788		hrtimer_forward_now(hrtimer,
  4789				ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
  4790	
  4791		return HRTIMER_RESTART;
  4792	}
  4793	
  4794	static int
  4795	i915_predictive_load_get(void *data, u64 *val)
  4796	{
> 4797		struct drm_i915_private *dev_priv = data;
  4798	
  4799		*val = predictive_load_enable;
  4800		return 0;
  4801	}
  4802	


[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35992 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]



* Re: [PATCH 2/4] drm/i915: Update render power clock state configuration for given context
  2018-09-21  9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
  2018-09-21 12:51   ` Tvrtko Ursulin
@ 2018-09-22 19:13   ` kbuild test robot
  1 sibling, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2018-09-22 19:13 UTC (permalink / raw)
  To: kedar.j.karanje
  Cc: Aravindan Muthukumar, intel-gfx, Yogesh Marathe, kbuild-all,
	Ankit Navik, Praveen Diwakar

[-- Attachment #1: Type: text/plain, Size: 9434 bytes --]

Hi Praveen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.19-rc4 next-20180921]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/kedar-j-karanje-intel-com/drm-i915-Get-active-pending-request-for-given-context/20180923-012250
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x014-201838 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/gpu//drm/i915/intel_lrc.c: In function 'execlists_update_context':
>> drivers/gpu//drm/i915/intel_lrc.c:452:26: error: unused variable 'engine' [-Werror=unused-variable]
     struct intel_engine_cs *engine = rq->engine;
                             ^~~~~~

vim +/engine +452 drivers/gpu//drm/i915/intel_lrc.c

   447	
   448	static u64 execlists_update_context(struct i915_request *rq)
   449	{
   450		struct intel_context *ce = rq->hw_context;
   451		struct i915_gem_context *ctx = rq->gem_context;
 > 452		struct intel_engine_cs *engine = rq->engine;
   453		struct i915_hw_ppgtt *ppgtt =
   454			rq->gem_context->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
   455		u32 *reg_state = ce->lrc_reg_state;
   456		u32 rpcs_config = 0;
   457	
   458		reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
   459		if (ctx->pid && ctx->name && (rq->engine->id == RCS) &&
   460				ctx->update_render_config) {
   461			rpcs_config = get_context_rpcs_config(ctx);
   462			reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
   463			CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
   464					rpcs_config);
   465			ctx->update_render_config = 0;
   466		}
   467	
   468		/* True 32b PPGTT with dynamic page allocation: update PDP
   469		 * registers and point the unallocated PDPs to scratch page.
   470		 * PML4 is allocated during ppgtt init, so this is not needed
   471		 * in 48-bit mode.
   472		 */
   473		if (ppgtt && !i915_vm_is_48bit(&ppgtt->vm))
   474			execlists_update_context_pdps(ppgtt, reg_state);
   475	
   476		return ce->lrc_desc;
   477	}
   478	


[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35871 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]



* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
  2018-09-21 13:24   ` Tvrtko Ursulin
@ 2018-09-25  6:12     ` Navik, Ankit P
  2018-09-25  8:25       ` Tvrtko Ursulin
  0 siblings, 1 reply; 25+ messages in thread
From: Navik, Ankit P @ 2018-09-25  6:12 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx
  Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan

Hi Tvrtko, 

Thank you for your valuable comments. We have gone through them.
I'll be submitting revised patch sets after incorporating all your review comments.

> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > High resoluton timer is used for this purpose.
> >
> > Debugfs is provided to enable/disable/update timer configuration
> >
> > Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com
> > ---
> >   drivers/gpu/drm/i915/i915_debugfs.c | 94
> ++++++++++++++++++++++++++++++++++++-
> >   drivers/gpu/drm/i915/i915_drv.h     |  1 +
> >   2 files changed, 94 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > b/drivers/gpu/drm/i915/i915_debugfs.c
> > index f9ce35d..81ba509 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -4740,6 +4740,97 @@ static const struct drm_info_list
> i915_debugfs_list[] = {
> >   	{"i915_drrs_status", i915_drrs_status, 0},
> >   	{"i915_rps_boost_info", i915_rps_boost_info, 0},
> >   };
> > +
> > +#define POLL_PERIOD_MS	(1000 * 1000)
> > +#define PENDING_REQ_0	0 /* No active request pending*/
> > +#define PENDING_REQ_3	3 /* Threshold value of 3 active request
> pending*/
> > +			  /* Anything above this is considered as HIGH load
> > +			   * context
> > +			   */
> > +			  /* And less is considered as LOW load*/
> > +			  /* And equal is considered as mediaum load */
> 
> Wonky comments and some typos up to here.
> 
> > +
> > +static int predictive_load_enable;
> > +static int predictive_load_timer_init;
> > +
> > +static enum hrtimer_restart predictive_load_cb(struct hrtimer
> > +*hrtimer) {
> > +	struct drm_i915_private *dev_priv =
> > +		container_of(hrtimer, typeof(*dev_priv),
> > +				pred_timer);
> > +	enum intel_engine_id id;
> > +	struct intel_engine_cs *engine;
> 
> Some unused's.
> 
> > +	struct i915_gem_context *ctx;
> > +	u64 req_pending;
> 
> unsigned long, and also please try to order declaration so the right edge of text
> is moving in one direction only.
> 
> > +
> > +	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> > +
> > +		if (!ctx->name)
> > +			continue;
> 
> What is this?
> 
> > +
> > +		mutex_lock(&dev_priv->pred_mutex);
> 
> Here the mutex bites you since you cannot sleep in the timer callback.
> atomic_t would solve it. Or even a native unsigned int/long would do, since no
> lock is needed to read a native word on x86.
> 
> > +		req_pending = ctx->req_cnt;
> > +		mutex_unlock(&dev_priv->pred_mutex);
> > +
> > +		if (req_pending == PENDING_REQ_0)
> > +			continue;
> > +
> > +		if (req_pending > PENDING_REQ_3)
> > +			ctx->load_type = LOAD_TYPE_HIGH;
> > +		else if (req_pending == PENDING_REQ_3)
> > +			ctx->load_type = LOAD_TYPE_MEDIUM;
> > +		else if (req_pending < PENDING_REQ_3)
> 
> Must be smaller if not greater or equal, but maybe the compiler does that for
> you.
> 
> > +			ctx->load_type = LOAD_TYPE_LOW;
> > +
> > +		i915_set_optimum_config(ctx->load_type, ctx,
> KABYLAKE_GT3);
> 
> Only KBL? Idea to put the table in dev_priv FTW! :)
> 
> ctx->load_type used only as a temporary uncovered here? :)
> 
> > +	}
> > +
> > +	hrtimer_forward_now(hrtimer,
> > +
> 	ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
> 
> Or HRTIMER_NORESTART if disabled? Hard to call it, details..
> 
> > +
> > +	return HRTIMER_RESTART;
> > +}
> > +
> > +static int
> > +i915_predictive_load_get(void *data, u64 *val) {
> > +	struct drm_i915_private *dev_priv = data;
> > +
> > +	*val = predictive_load_enable;
> > +	return 0;
> > +}
> > +
> > +static int
> > +i915_predictive_load_set(void *data, u64 val) {
> > +	struct drm_i915_private *dev_priv = data;
> > +	struct intel_device_info *info;
> > +
> > +	info = mkwrite_device_info(dev_priv);
> 
> Unused, why?
> 
> > +
> > +	predictive_load_enable = val;
> > +
> > +	if (predictive_load_enable) {
> > +		if (!predictive_load_timer_init) {
> > +			hrtimer_init(&dev_priv->pred_timer,
> CLOCK_MONOTONIC,
> > +					HRTIMER_MODE_REL);
> > +			dev_priv->pred_timer.function = predictive_load_cb;
> > +			predictive_load_timer_init = 1;
> 
> Move timer init to dev_priv setup.
> 
> > +		}
> > +		hrtimer_start(&dev_priv->pred_timer,
> > +
> 	ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
> > +			HRTIMER_MODE_REL_PINNED);
> 
> Two threads can race to here.
> 
> Also you can give some slack to the timer since precision is not critical to you
> and can save you some CPU power.
> 
> > +	} else {
> > +		hrtimer_cancel(&dev_priv->pred_timer);
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
> > +			i915_predictive_load_get, i915_predictive_load_set,
> > +			"%llu\n");
> > +
> >   #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
> >
> >   static const struct i915_debugfs_files { @@ -4769,7 +4860,8 @@
> > static const struct i915_debugfs_files {
> >   	{"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
> >   	{"i915_ipc_status", &i915_ipc_status_fops},
> >   	{"i915_drrs_ctl", &i915_drrs_ctl_fops},
> > -	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
> > +	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
> > +	{"i915_predictive_load_ctl", &i915_predictive_load_ctl}
> 
> And if the feature is to become real one day, given it's nature, the knob would
> probably have to go to sysfs, not debugfs.
> 
> >   };
> >
> >   int i915_debugfs_register(struct drm_i915_private *dev_priv) diff
> > --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h index 137ec33..0505c47 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
> >   };
> >
> >   struct drm_i915_private {
> > +	struct hrtimer pred_timer;
> 
> Surely not the first member! :)
> 
> Regards,
> 
> Tvrtko
> 
> >   	struct drm_device drm;
> >
> >   	struct kmem_cache *objects;
> >

Regards, 
Ankit


* Re: [PATCH 1/4] drm/i915: Get active pending request for given context
  2018-09-25  8:04   ` Jani Nikula
@ 2018-09-25  7:41     ` Kedar J. Karanje
  0 siblings, 0 replies; 25+ messages in thread
From: Kedar J. Karanje @ 2018-09-25  7:41 UTC (permalink / raw)
  To: Jani Nikula, intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar

Hello Jani,


On Tuesday 25 September 2018 01:34 PM, Jani Nikula wrote:
> On Fri, 21 Sep 2018, kedar.j.karanje@intel.com wrote:
>> From: Praveen Diwakar <praveen.diwakar@intel.com>
>>
>> This patch gives us the active pending request count which is yet
>> to be submitted to the GPU
>>
>> Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
> Please remove these.
We will remove the Change-Ids in subsequent patches, thanks.
>
>> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
>> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
>> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
>> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
>> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> What are you trying to convey with the five signoffs?
The engineers who originally worked on this have moved to other teams. Since 
we picked the work up where they left off, we kept the original authors' 
sign-offs and added the new folks who will continue the work; I am not sure 
what the protocol is in such situations.
I referred to the guidelines at 
https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html - 
please let me know if I have missed something there.

Thanks,
Kedar
>
> BR,
> Jani.
>
>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.c            | 1 +
>>   drivers/gpu/drm/i915/i915_drv.h            | 5 +++++
>>   drivers/gpu/drm/i915/i915_gem_context.c    | 1 +
>>   drivers/gpu/drm/i915/i915_gem_context.h    | 6 ++++++
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
>>   drivers/gpu/drm/i915/intel_lrc.c           | 6 ++++++
>>   6 files changed, 24 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index f8cfd16..d37c46e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
>>   	mutex_init(&dev_priv->av_mutex);
>>   	mutex_init(&dev_priv->wm.wm_mutex);
>>   	mutex_init(&dev_priv->pps_mutex);
>> +	mutex_init(&dev_priv->pred_mutex);
>>   
>>   	i915_memcpy_init_early(dev_priv);
>>   
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 4aca534..137ec33 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1609,6 +1609,11 @@ struct drm_i915_private {
>>   	 * controller on different i2c buses. */
>>   	struct mutex gmbus_mutex;
>>   
>> +	/** pred_mutex protects against councurrent usage of pending
>> +	 * request counter for multiple contexts
>> +	 */
>> +	struct mutex pred_mutex;
>> +
>>   	/**
>>   	 * Base address of the gmbus and gpio block.
>>   	 */
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index b10770c..30932d9 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>>   	}
>>   
>>   	trace_i915_context_create(ctx);
>> +	ctx->req_cnt = 0;
>>   
>>   	return ctx;
>>   }
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
>> index b116e49..243ea22 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
>> @@ -194,6 +194,12 @@ struct i915_gem_context {
>>   	 * context close.
>>   	 */
>>   	struct list_head handles_list;
>> +
>> +	/** req_cnt: tracks the pending commands, based on which we decide to
>> +	 * go for low/medium/high load configuration of the GPU, this is
>> +	 * controlled via a mutex
>> +	 */
>> +	u64 req_cnt;
>>   };
>>   
>>   static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 3f0c612..f799ff9 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>>   		       struct drm_syncobj **fences)
>>   {
>>   	struct i915_execbuffer eb;
>> +	struct drm_i915_private *dev_priv = to_i915(dev);
>>   	struct dma_fence *in_fence = NULL;
>>   	struct sync_file *out_fence = NULL;
>>   	int out_fence_fd = -1;
>> @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>>   	 */
>>   	eb.request->batch = eb.batch;
>>   
>> +	mutex_lock(&dev_priv->pred_mutex);
>> +	eb.ctx->req_cnt++;
>> +	mutex_unlock(&dev_priv->pred_mutex);
>> +
>>   	trace_i915_request_queue(eb.request, eb.batch_flags);
>>   	err = eb_submit(&eb);
>>   err_request:
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 1744792..039fbdb 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>>   			trace_i915_request_in(rq, port_index(port, execlists));
>>   			last = rq;
>>   			submit = true;
>> +
>> +			mutex_lock(&rq->i915->pred_mutex);
>> +			if (rq->gem_context->req_cnt > 0) {
>> +				rq->gem_context->req_cnt--;
>> +			}
>> +			mutex_unlock(&rq->i915->pred_mutex);
>>   		}
>>   
>>   		rb_erase_cached(&p->node, &execlists->queue);



* Re: [PATCH 1/4] drm/i915: Get active pending request for given context
  2018-09-21  9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
  2018-09-21 12:39   ` Tvrtko Ursulin
@ 2018-09-25  8:04   ` Jani Nikula
  2018-09-25  7:41     ` Kedar J. Karanje
  1 sibling, 1 reply; 25+ messages in thread
From: Jani Nikula @ 2018-09-25  8:04 UTC (permalink / raw)
  To: kedar.j.karanje, intel-gfx
  Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar

On Fri, 21 Sep 2018, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
>
> This patch gives us the active pending request count which is yet
> to be submitted to the GPU
>
> Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d

Please remove these.

> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>

What are you trying to convey with the five signoffs?

BR,
Jani.


> ---
>  drivers/gpu/drm/i915/i915_drv.c            | 1 +
>  drivers/gpu/drm/i915/i915_drv.h            | 5 +++++
>  drivers/gpu/drm/i915/i915_gem_context.c    | 1 +
>  drivers/gpu/drm/i915/i915_gem_context.h    | 6 ++++++
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
>  drivers/gpu/drm/i915/intel_lrc.c           | 6 ++++++
>  6 files changed, 24 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index f8cfd16..d37c46e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
>  	mutex_init(&dev_priv->av_mutex);
>  	mutex_init(&dev_priv->wm.wm_mutex);
>  	mutex_init(&dev_priv->pps_mutex);
> +	mutex_init(&dev_priv->pred_mutex);
>  
>  	i915_memcpy_init_early(dev_priv);
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4aca534..137ec33 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1609,6 +1609,11 @@ struct drm_i915_private {
>  	 * controller on different i2c buses. */
>  	struct mutex gmbus_mutex;
>  
> +	/** pred_mutex protects against councurrent usage of pending
> +	 * request counter for multiple contexts
> +	 */
> +	struct mutex pred_mutex;
> +
>  	/**
>  	 * Base address of the gmbus and gpio block.
>  	 */
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index b10770c..30932d9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>  	}
>  
>  	trace_i915_context_create(ctx);
> +	ctx->req_cnt = 0;
>  
>  	return ctx;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index b116e49..243ea22 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -194,6 +194,12 @@ struct i915_gem_context {
>  	 * context close.
>  	 */
>  	struct list_head handles_list;
> +
> +	/** req_cnt: tracks the pending commands, based on which we decide to
> +	 * go for low/medium/high load configuration of the GPU, this is
> +	 * controlled via a mutex
> +	 */
> +	u64 req_cnt;
>  };
>  
>  static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 3f0c612..f799ff9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  		       struct drm_syncobj **fences)
>  {
>  	struct i915_execbuffer eb;
> +	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct dma_fence *in_fence = NULL;
>  	struct sync_file *out_fence = NULL;
>  	int out_fence_fd = -1;
> @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  	 */
>  	eb.request->batch = eb.batch;
>  
> +	mutex_lock(&dev_priv->pred_mutex);
> +	eb.ctx->req_cnt++;
> +	mutex_unlock(&dev_priv->pred_mutex);
> +
>  	trace_i915_request_queue(eb.request, eb.batch_flags);
>  	err = eb_submit(&eb);
>  err_request:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 1744792..039fbdb 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  			trace_i915_request_in(rq, port_index(port, execlists));
>  			last = rq;
>  			submit = true;
> +
> +			mutex_lock(&rq->i915->pred_mutex);
> +			if (rq->gem_context->req_cnt > 0) {
> +				rq->gem_context->req_cnt--;
> +			}
> +			mutex_unlock(&rq->i915->pred_mutex);
>  		}
>  
>  		rb_erase_cached(&p->node, &execlists->queue);

-- 
Jani Nikula, Intel Open Source Graphics Center


* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
  2018-09-25  6:12     ` Navik, Ankit P
@ 2018-09-25  8:25       ` Tvrtko Ursulin
  2018-11-06  4:46         ` Navik, Ankit P
  0 siblings, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-25  8:25 UTC (permalink / raw)
  To: Navik, Ankit P, intel-gfx
  Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan


On 25/09/2018 07:12, Navik, Ankit P wrote:
> Hi Tvrtko,
> 
> Thank you for your valuable comments. We have gone through it.
> I'll be submitting revised patch-sets after incorporating all your review comments.

You're welcome!

Btw one more thing - you can suspend your timer when GPU goes idle and 
restart it when active. See for instance i915_pmu_gt_parked/unparked 
where to put the hooks.

Regards,

Tvrtko
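
[Editor's note: the parked/unparked gating Tvrtko suggests above can be sketched at the logic level. This is a hypothetical userspace analogue, not driver code; the names (governor_park/unpark, timer_armed) are invented for illustration. In the driver the hooks would sit next to i915_pmu_gt_parked/unparked and would call hrtimer_cancel()/hrtimer_start() on dev_priv->pred_timer.]

```c
#include <stdbool.h>

/* Hypothetical sketch of gating the predictive-load timer on GT
 * park/unpark.  "enabled" mirrors the debugfs knob (non-zero value),
 * "timer_armed" mirrors whether the hrtimer is currently running. */
struct governor {
	bool enabled;     /* user knob: predictive_load_enable != 0 */
	bool timer_armed; /* poll timer running */
};

/* Called when the GT becomes busy (cf. i915_pmu_gt_unparked). */
static void governor_unpark(struct governor *g)
{
	if (g->enabled && !g->timer_armed)
		g->timer_armed = true; /* hrtimer_start() in the driver */
}

/* Called when the GT goes idle (cf. i915_pmu_gt_parked). */
static void governor_park(struct governor *g)
{
	if (g->timer_armed)
		g->timer_armed = false; /* hrtimer_cancel() in the driver */
}
```

This keeps the timer from waking the CPU while the GPU is idle, which is the power concern behind the suggestion.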

>> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
>>> From: Praveen Diwakar <praveen.diwakar@intel.com>
>>>
>>> High resoluton timer is used for this purpose.
>>>
>>> Debugfs is provided to enable/disable/update timer configuration
>>>
>>> Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
>>> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
>>> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
>>> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
>>> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
>>> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com
>>> ---
>>>    drivers/gpu/drm/i915/i915_debugfs.c | 94
>> ++++++++++++++++++++++++++++++++++++-
>>>    drivers/gpu/drm/i915/i915_drv.h     |  1 +
>>>    2 files changed, 94 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>>> b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index f9ce35d..81ba509 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -4740,6 +4740,97 @@ static const struct drm_info_list
>> i915_debugfs_list[] = {
>>>    	{"i915_drrs_status", i915_drrs_status, 0},
>>>    	{"i915_rps_boost_info", i915_rps_boost_info, 0},
>>>    };
>>> +
>>> +#define POLL_PERIOD_MS	(1000 * 1000)
>>> +#define PENDING_REQ_0	0 /* No active request pending*/
>>> +#define PENDING_REQ_3	3 /* Threshold value of 3 active request
>> pending*/
>>> +			  /* Anything above this is considered as HIGH load
>>> +			   * context
>>> +			   */
>>> +			  /* And less is considered as LOW load*/
>>> +			  /* And equal is considered as mediaum load */
>>
>> Wonky comments and some typos up to here.
>>
>>> +
>>> +static int predictive_load_enable;
>>> +static int predictive_load_timer_init;
>>> +
>>> +static enum hrtimer_restart predictive_load_cb(struct hrtimer
>>> +*hrtimer) {
>>> +	struct drm_i915_private *dev_priv =
>>> +		container_of(hrtimer, typeof(*dev_priv),
>>> +				pred_timer);
>>> +	enum intel_engine_id id;
>>> +	struct intel_engine_cs *engine;
>>
>> Some unused's.
>>
>>> +	struct i915_gem_context *ctx;
>>> +	u64 req_pending;
>>
>> unsigned long, and also please try to order declaration so the right edge of text
>> is moving in one direction only.
>>
>>> +
>>> +	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
>>> +
>>> +		if (!ctx->name)
>>> +			continue;
>>
>> What is this?
>>
>>> +
>>> +		mutex_lock(&dev_priv->pred_mutex);
>>
>> Here the mutex bites you since you cannot sleep in the timer callback.
>> atomic_t would solve it. Or would a native unsigned int/long since lock to read a
>> native word on x86 is not needed.
>>
>>> +		req_pending = ctx->req_cnt;
>>> +		mutex_unlock(&dev_priv->pred_mutex);
>>> +
>>> +		if (req_pending == PENDING_REQ_0)
>>> +			continue;
>>> +
>>> +		if (req_pending > PENDING_REQ_3)
>>> +			ctx->load_type = LOAD_TYPE_HIGH;
>>> +		else if (req_pending == PENDING_REQ_3)
>>> +			ctx->load_type = LOAD_TYPE_MEDIUM;
>>> +		else if (req_pending < PENDING_REQ_3)
>>
>> Must be smaller if not greater or equal, but maybe the compiler does that for
>> you.
>>
>>> +			ctx->load_type = LOAD_TYPE_LOW;
>>> +
>>> +		i915_set_optimum_config(ctx->load_type, ctx,
>> KABYLAKE_GT3);
>>
>> Only KBL? Idea to put the table in dev_priv FTW! :)
>>
>> ctx->load_type used only as a temporary uncovered here? :)
>>
>>> +	}
>>> +
>>> +	hrtimer_forward_now(hrtimer,
>>> +
>> 	ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
>>
>> Or HRTIMER_NORESTART if disabled? Hard to call it, details..
>>
>>> +
>>> +	return HRTIMER_RESTART;
>>> +}
>>> +
>>> +static int
>>> +i915_predictive_load_get(void *data, u64 *val) {
>>> +	struct drm_i915_private *dev_priv = data;
>>> +
>>> +	*val = predictive_load_enable;
>>> +	return 0;
>>> +}
>>> +
>>> +static int
>>> +i915_predictive_load_set(void *data, u64 val) {
>>> +	struct drm_i915_private *dev_priv = data;
>>> +	struct intel_device_info *info;
>>> +
>>> +	info = mkwrite_device_info(dev_priv);
>>
>> Unused, why?
>>
>>> +
>>> +	predictive_load_enable = val;
>>> +
>>> +	if (predictive_load_enable) {
>>> +		if (!predictive_load_timer_init) {
>>> +			hrtimer_init(&dev_priv->pred_timer,
>> CLOCK_MONOTONIC,
>>> +					HRTIMER_MODE_REL);
>>> +			dev_priv->pred_timer.function = predictive_load_cb;
>>> +			predictive_load_timer_init = 1;
>>
>> Move timer init to dev_priv setup.
>>
>>> +		}
>>> +		hrtimer_start(&dev_priv->pred_timer,
>>> +
>> 	ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
>>> +			HRTIMER_MODE_REL_PINNED);
>>
>> Two threads can race to here.
>>
>> Also you can give some slack to the timer since precision is not critical to you
>> and can save you some CPU power.
>>
>>> +	} else {
>>> +		hrtimer_cancel(&dev_priv->pred_timer);
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
>>> +			i915_predictive_load_get, i915_predictive_load_set,
>>> +			"%llu\n");
>>> +
>>>    #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
>>>
>>>    static const struct i915_debugfs_files { @@ -4769,7 +4860,8 @@
>>> static const struct i915_debugfs_files {
>>>    	{"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
>>>    	{"i915_ipc_status", &i915_ipc_status_fops},
>>>    	{"i915_drrs_ctl", &i915_drrs_ctl_fops},
>>> -	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
>>> +	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
>>> +	{"i915_predictive_load_ctl", &i915_predictive_load_ctl}
>>
>> And if the feature is to become real one day, given it's nature, the knob would
>> probably have to go to sysfs, not debugfs.
>>
>>>    };
>>>
>>>    int i915_debugfs_register(struct drm_i915_private *dev_priv) diff
>>> --git a/drivers/gpu/drm/i915/i915_drv.h
>>> b/drivers/gpu/drm/i915/i915_drv.h index 137ec33..0505c47 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
>>>    };
>>>
>>>    struct drm_i915_private {
>>> +	struct hrtimer pred_timer;
>>
>> Surely not the first member! :)
>>
>> Regards,
>>
>> Tvrtko
>>
>>>    	struct drm_device drm;
>>>
>>>    	struct kmem_cache *objects;
>>>
> 
> Regards,
> Ankit
> 
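
[Editor's note: the load heuristic reviewed above (predictive_load_cb) can be illustrated with a small standalone sketch. The thresholds are the ones from the patch; the function name and enum are illustrative, not the actual driver code.]

```c
/* Sketch of the patch's classification: more than 3 pending requests
 * is HIGH load, exactly 3 is MEDIUM, fewer is LOW, and 0 pending
 * leaves the context untouched (no reconfiguration). */
enum load_type { LOAD_NONE, LOAD_LOW, LOAD_MEDIUM, LOAD_HIGH };

#define PENDING_REQ_THRESHOLD 3

static enum load_type classify_load(unsigned long req_pending)
{
	if (req_pending == 0)
		return LOAD_NONE; /* skip this context entirely */
	if (req_pending > PENDING_REQ_THRESHOLD)
		return LOAD_HIGH;
	if (req_pending == PENDING_REQ_THRESHOLD)
		return LOAD_MEDIUM;
	return LOAD_LOW;
}
```

Note this also makes Tvrtko's point visible: the final branch can only be "less than", so a plain `else` suffices in the driver.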


* Re: [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU.
  2018-09-21  9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
                   ` (5 preceding siblings ...)
  2018-09-21 12:31 ` [PATCH 0/4][RFC] " Tvrtko Ursulin
@ 2018-09-27 14:02 ` Joonas Lahtinen
  6 siblings, 0 replies; 25+ messages in thread
From: Joonas Lahtinen @ 2018-09-27 14:02 UTC (permalink / raw)
  To: intel-gfx, kedar.j.karanje

+ Tvrtko for adding the right media contacts

Quoting kedar.j.karanje@intel.com (2018-09-21 12:13:46)
> From: "Kedar J. Karanje" <kedar.j.karanje@intel.com>
> 
> drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
> 
> Current GPU configuration code for i915 does not allow us to change
> EU/Slice/Sub-slice configuration dynamically. Its done only once while context
> is created.
> 
> While particular graphics application is running, if we examine the command
> requests from user space, we observe that command density is not consistent.
> It means there is scope to change the graphics configuration dynamically even
> while context is running actively. This patch series proposes the solution to
> find the active pending load for all active context at given time and based on
> that, dynamically perform graphics configuration for each context.
> 
> We use a hr (high resolution) timer with i915 driver in kernel to get a
> callback every few milliseconds (this timer value can be configured through
> debugfs, default is '0' indicating timer is in disabled state i.e. original
> system without any intervention).In the timer callback, we examine pending
> commands for a context in the queue, essentially, we intercept them before
> they are executed by GPU and we update context with required number of EUs.

Being off by default and enabled through debugfs would make this
effectively dead code for upstream, as debugfs is off limits in many
production systems due to security concerns.

So the algorithm should really be generic enough to be on by default
without regressing existing workloads. Otherwise there would need to
be some tuning tool to control the tables outside of debugfs. cgroups
could be one place, but the effort to bring in cgroups support doesn't
seem to be advancing; please see the mailing list archive.

There's also an ongoing series about allowing userspace to control the
said SSEU register freely for Media, so I'd hope we could consolidate on
one control mechanism.

Regards, Joonas

> 
> Two questions, how did we arrive at right timer value? and what's the right
> number of EUs? For the prior one, empirical data to achieve best performance
> in least power was considered. For the later one, we roughly categorized number 
> of EUs logically based on platform. Now we compare number of pending commands
> with a particular threshold and then set number of EUs accordingly with update
> context. That threshold is also based on experiments & findings. If GPU is able
> to catch up with CPU, typically there are no pending commands, the EU config
> would remain unchanged there. In case there are more pending commands we
> reprogram context with higher number of EUs. Please note, here we are changing
> EUs even while context is running by examining pending commands every 'x'
> milliseconds.
> 
> With this solution in place, on KBL-GT3 + Android we saw following pnp
> benefits without any performance degradation, power numbers mentioned here
> are system power.
> 
> App /KPI               | % Power |
>                        | Benefit |
>                        |  (mW)   |
> ---------------------------------|
> 3D Mark (Ice storm)    | 2.30%   |
> TRex On screen         | 2.49%   |
> TRex Off screen        | 1.32%   |
> ManhattanOn screen     | 3.11%   |
> Manhattan Off screen   | 0.89%   |
> AnTuTu  6.1.4          | 3.42%   |
> 
> Note - For KBL (GEN9) we cannot control at sub-slice level, it was always  a
> constraint.
> We always controlled number of EUs rather than sub-slices/slices.
> 
> 
> Praveen Diwakar (4):
>   drm/i915: Get active pending request for given context
>   drm/i915: Update render power clock state configuration for given
>     context
>   drm/i915: set optimum eu/slice/sub-slice configuration based on load
>     type
>   drm/i915: Predictive governor to control eu/slice/subslice based on
>     workload
> 
>  drivers/gpu/drm/i915/i915_debugfs.c        | 94 +++++++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_drv.c            |  1 +
>  drivers/gpu/drm/i915/i915_drv.h            |  6 ++
>  drivers/gpu/drm/i915/i915_gem_context.c    | 52 +++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_context.h    | 52 +++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  5 ++
>  drivers/gpu/drm/i915/intel_lrc.c           | 47 +++++++++++++++
>  7 files changed, 256 insertions(+), 1 deletion(-)
> 
> --
> 2.7.4
> 


* Re: [PATCH 1/4] drm/i915: Get active pending request for given context
  2018-09-21 12:39   ` Tvrtko Ursulin
@ 2018-11-06  4:18     ` Navik, Ankit P
  0 siblings, 0 replies; 25+ messages in thread
From: Navik, Ankit P @ 2018-11-06  4:18 UTC (permalink / raw)
  To: Tvrtko Ursulin, J Karanje, Kedar, intel-gfx
  Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan

Hi Tvrtko,

> -----Original Message-----
> From: Tvrtko Ursulin [mailto:tvrtko.ursulin@linux.intel.com]
> Sent: Friday, September 21, 2018 6:10 PM
> To: J Karanje, Kedar <kedar.j.karanje@intel.com>; intel-
> gfx@lists.freedesktop.org
> Cc: Diwakar, Praveen <praveen.diwakar@intel.com>; Marathe, Yogesh
> <yogesh.marathe@intel.com>; Navik, Ankit P <ankit.p.navik@intel.com>;
> Muthukumar, Aravindan <aravindan.muthukumar@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 1/4] drm/i915: Get active pending request for
> given context
> 
> 
> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > This patch gives us the active pending request count which is yet to
> > be submitted to the GPU
> >
> > Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.c            | 1 +
> >   drivers/gpu/drm/i915/i915_drv.h            | 5 +++++
> >   drivers/gpu/drm/i915/i915_gem_context.c    | 1 +
> >   drivers/gpu/drm/i915/i915_gem_context.h    | 6 ++++++
> >   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
> >   drivers/gpu/drm/i915/intel_lrc.c           | 6 ++++++
> >   6 files changed, 24 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c index f8cfd16..d37c46e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct
> drm_i915_private *dev_priv,
> >   	mutex_init(&dev_priv->av_mutex);
> >   	mutex_init(&dev_priv->wm.wm_mutex);
> >   	mutex_init(&dev_priv->pps_mutex);
> > +	mutex_init(&dev_priv->pred_mutex);
> >
> >   	i915_memcpy_init_early(dev_priv);
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h index 4aca534..137ec33 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1609,6 +1609,11 @@ struct drm_i915_private {
> >   	 * controller on different i2c buses. */
> >   	struct mutex gmbus_mutex;
> >
> > +	/** pred_mutex protects against councurrent usage of pending
> > +	 * request counter for multiple contexts
> > +	 */
> > +	struct mutex pred_mutex;
> > +
> >   	/**
> >   	 * Base address of the gmbus and gpio block.
> >   	 */
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> > b/drivers/gpu/drm/i915/i915_gem_context.c
> > index b10770c..30932d9 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private
> *dev_priv,
> >   	}
> >
> >   	trace_i915_context_create(ctx);
> > +	ctx->req_cnt = 0;
> >
> >   	return ctx;
> >   }
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h
> > b/drivers/gpu/drm/i915/i915_gem_context.h
> > index b116e49..243ea22 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -194,6 +194,12 @@ struct i915_gem_context {
> >   	 * context close.
> >   	 */
> >   	struct list_head handles_list;
> > +
> > +	/** req_cnt: tracks the pending commands, based on which we decide
> to
> > +	 * go for low/medium/high load configuration of the GPU, this is
> > +	 * controlled via a mutex
> > +	 */
> > +	u64 req_cnt;
> 
> 64-bit is too wide and mutex causes you problems later in the series.
> I'd suggest atomic_t.
Changes done in v2. 
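
[Editor's note: the atomic_t approach suggested here can be sketched with a userspace analogue using C11 atomics. This is an illustration of the pattern, not the v2 code; in the kernel it would be atomic_inc() on submission and atomic_dec_if_positive() on dequeue, with no mutex and no underflow.]

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace analogue of a lock-free per-context pending-request
 * counter (ctx->req_cnt in the patch). */
static atomic_uint req_cnt;

/* Execbuffer path: count a newly queued request. */
static void ctx_submit(void)
{
	atomic_fetch_add_explicit(&req_cnt, 1, memory_order_relaxed);
}

/* Dequeue path: decrement only if positive, mirroring the kernel's
 * atomic_dec_if_positive().  Returns false on underflow attempt. */
static bool ctx_dequeue(void)
{
	unsigned int old = atomic_load_explicit(&req_cnt,
						memory_order_relaxed);

	while (old > 0) {
		/* on CAS failure, "old" is reloaded automatically */
		if (atomic_compare_exchange_weak_explicit(
				&req_cnt, &old, old - 1,
				memory_order_relaxed,
				memory_order_relaxed))
			return true;
	}
	return false;
}
```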
> 
> >   };
> >
> >   static inline bool i915_gem_context_is_closed(const struct
> > i915_gem_context *ctx) diff --git
> > a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 3f0c612..f799ff9 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >   		       struct drm_syncobj **fences)
> >   {
> >   	struct i915_execbuffer eb;
> > +	struct drm_i915_private *dev_priv = to_i915(dev);
> >   	struct dma_fence *in_fence = NULL;
> >   	struct sync_file *out_fence = NULL;
> >   	int out_fence_fd = -1;
> > @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> >   	 */
> >   	eb.request->batch = eb.batch;
> >
> > +	mutex_lock(&dev_priv->pred_mutex);
> > +	eb.ctx->req_cnt++;
> > +	mutex_unlock(&dev_priv->pred_mutex);
> > +
> >   	trace_i915_request_queue(eb.request, eb.batch_flags);
> >   	err = eb_submit(&eb);
> >   err_request:
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > b/drivers/gpu/drm/i915/intel_lrc.c
> > index 1744792..039fbdb 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs
> *engine)
> >   			trace_i915_request_in(rq, port_index(port, execlists));
> >   			last = rq;
> >   			submit = true;
> > +
> > +			mutex_lock(&rq->i915->pred_mutex);
> > +			if (rq->gem_context->req_cnt > 0) {
> 
> Presumably you hit underflow here and then added the protection. This was a
> hint the accounting does not work as expected. Preemption for instance would
> wreak havoc with it. :)
> 
> Have a look at my per engine queued/runnable counters for how to do it
> correctly, only that you would need to make it per context and merge into one
> aggregated queued+runnable instead of two separate counters, if you wanted
> to preserve your heuristics.
> 
> https://patchwork.freedesktop.org/patch/227981/
> https://patchwork.freedesktop.org/patch/227976/

We will wait for these patches to get merged and continue to investigate
how to incorporate them into our changes.

Regards, Ankit
> 
> Regards,
> 
> Tvrtko
> 
> > +				rq->gem_context->req_cnt--;
> > +			}
> > +			mutex_unlock(&rq->i915->pred_mutex);
> >   		}
> >
> >   		rb_erase_cached(&p->node, &execlists->queue);
> >


* Re: [PATCH 2/4] drm/i915: Update render power clock state configuration for given context
  2018-09-21 12:51   ` Tvrtko Ursulin
@ 2018-11-06  4:23     ` Navik, Ankit P
  0 siblings, 0 replies; 25+ messages in thread
From: Navik, Ankit P @ 2018-11-06  4:23 UTC (permalink / raw)
  To: Tvrtko Ursulin, J Karanje, Kedar, intel-gfx
  Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan

Hi Tvrtko, 

> -----Original Message-----
> From: Tvrtko Ursulin [mailto:tvrtko.ursulin@linux.intel.com]
> Sent: Friday, September 21, 2018 6:22 PM
> To: J Karanje, Kedar <kedar.j.karanje@intel.com>; intel-
> gfx@lists.freedesktop.org
> Cc: Diwakar, Praveen <praveen.diwakar@intel.com>; Marathe, Yogesh
> <yogesh.marathe@intel.com>; Navik, Ankit P <ankit.p.navik@intel.com>;
> Muthukumar, Aravindan <aravindan.muthukumar@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 2/4] drm/i915: Update render power clock state
> configuration for given context
> 
> 
> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > This patch will update power clock state register at runtime base on
> > the flag update_render_config which can set by any governor which
> > computes load and want to update rpcs register.
> > subsequent patches will have a timer based governor which computes
> > pending load/request.
> >
> > Change-Id: I4e7d2f484b957d5bd496e1decc59a69e3bc6d186
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem_context.c |  5 ++++
> >   drivers/gpu/drm/i915/i915_gem_context.h | 14 +++++++++++
> >   drivers/gpu/drm/i915/intel_lrc.c        | 41
> +++++++++++++++++++++++++++++++++
> >   3 files changed, 60 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> > b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 30932d9..2838c1d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -388,6 +388,11 @@ i915_gem_create_context(struct drm_i915_private
> > *dev_priv,
> >
> >   	trace_i915_context_create(ctx);
> >   	ctx->req_cnt = 0;
> > +	ctx->update_render_config = 0;
> > +	ctx->slice_cnt = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
> > +	ctx->subslice_cnt = hweight8(
> > +			INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
> > +	ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
> >
> >   	return ctx;
> >   }
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h
> > b/drivers/gpu/drm/i915/i915_gem_context.h
> > index 243ea22..52e341c 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -200,6 +200,20 @@ struct i915_gem_context {
> >   	 * controlled via a mutex
> >   	 */
> >   	u64 req_cnt;
> > +
> > +	/** slice_cnt: used to set the # of slices to be enabled. */
> > +	u8 slice_cnt;
> > +
> > +	/** subslice_cnt: used to set the # of subslices to be enabled. */
> > +	u8 subslice_cnt;
> > +
> > +	/** eu_cnt: used to set the # of eu to be enabled. */
> > +	u8 eu_cnt;
> > +
> > +	/** update_render_config: to track the updates to the render
> > +	 * configuration (S/SS/EU Configuration on the GPU)
> > +	 */
> > +	bool update_render_config;
> >   };
> >
> >   static inline bool i915_gem_context_is_closed(const struct
> > i915_gem_context *ctx) diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > b/drivers/gpu/drm/i915/intel_lrc.c
> > index 039fbdb..d2d0e7d 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -364,6 +364,36 @@ execlists_unwind_incomplete_requests(struct
> intel_engine_execlists *execlists)
> >   	spin_unlock_irqrestore(&engine->timeline.lock, flags);
> >   }
> >
> > +static u32
> > +get_context_rpcs_config(struct i915_gem_context *ctx) {
> > +	struct drm_i915_private *dev_priv = ctx->i915;
> > +	u32 rpcs = 0;
> > +
> > +	if (INTEL_GEN(dev_priv) < 8)
> > +		return 0;
> > +
> > +	if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
> > +		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
> > +		rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
> > +		rpcs |= GEN8_RPCS_ENABLE;
> > +	}
> > +
> > +	if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
> > +		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
> > +		rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
> > +		rpcs |= GEN8_RPCS_ENABLE;
> > +	}
> > +
> > +	if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) {
> > +		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
> > +		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
> > +		rpcs |= GEN8_RPCS_ENABLE;
> > +	}
> > +
> > +	return rpcs;
> > +}
> 
> This function is very similar to make_rpcs so I'd suggest to extract some
> commonality and share the code.
Incorporated Changes in v2.
> 
> In any case you are likely to get overtaken by
> https://patchwork.freedesktop.org/series/48194/ after which you will be able
> to use struct intel_sseu to package the data you need in one ctx member, etc.
> 
> > +
> >   static inline void
> >   execlists_context_status_change(struct i915_request *rq, unsigned long
> status)
> >   {
> > @@ -418,11 +448,22 @@ execlists_update_context_pdps(struct
> i915_hw_ppgtt *ppgtt, u32 *reg_state)
> >   static u64 execlists_update_context(struct i915_request *rq)
> >   {
> >   	struct intel_context *ce = rq->hw_context;
> > +	struct i915_gem_context *ctx = rq->gem_context;
> > +	struct intel_engine_cs *engine = rq->engine;
> >   	struct i915_hw_ppgtt *ppgtt =
> >   		rq->gem_context->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
> >   	u32 *reg_state = ce->lrc_reg_state;
> > +	u32 rpcs_config = 0;
> >
> >   	reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring,
> > rq->tail);
> > +	if (ctx->pid && ctx->name && (rq->engine->id == RCS) &&
> > +			ctx->update_render_config) {
> > +		rpcs_config = get_context_rpcs_config(ctx);
> > +		reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> > +		CTX_REG(reg_state, CTX_R_PWR_CLK_STATE,
> GEN8_R_PWR_CLK_STATE,
> > +				rpcs_config);
> 
> For a first submission this works, but when appending to a running context the
> RPCS value you write here will probably get overwritten by context save. At least
> I would be surprised if lite-restore would write to the register.
> 
> So if I am correct the problem you have is that your configuration changes will
> not always become live when you expect it to. And then if you hit lite-restore
> you don't try to re-configure it later unless there is a config change.
> 
> But I don't think there is a cheap/reasonable way to fix this fully. So maybe you
> just give up on this chunk and rely on context image to be updated on context
> pin like in https://patchwork.freedesktop.org/patch/250007/.

We will wait for this patch to get merged, since we see differences
between the two patches in how the register is accessed. We will
analyse the MI_LOAD_REGISTER_IMM approach further.

Regards, Ankit
> 
> Regards,
> 
> Tvrtko
> 
> > +		ctx->update_render_config = 0;
> > +	}
> >
> >   	/* True 32b PPGTT with dynamic page allocation: update PDP
> >   	 * registers and point the unallocated PDPs to scratch page.
> >


* Re: [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2018-09-21 13:05   ` Tvrtko Ursulin
@ 2018-11-06  4:25     ` Navik, Ankit P
  0 siblings, 0 replies; 25+ messages in thread
From: Navik, Ankit P @ 2018-11-06  4:25 UTC (permalink / raw)
  To: Tvrtko Ursulin, J Karanje, Kedar, intel-gfx
  Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan

Hi Tvrtko, 
Your review comments are incorporated in v2.
Regards, Ankit
> -----Original Message-----
> From: Tvrtko Ursulin [mailto:tvrtko.ursulin@linux.intel.com]
> Sent: Friday, September 21, 2018 6:36 PM
> To: J Karanje, Kedar <kedar.j.karanje@intel.com>; intel-
> gfx@lists.freedesktop.org
> Cc: Diwakar, Praveen <praveen.diwakar@intel.com>; Marathe, Yogesh
> <yogesh.marathe@intel.com>; Navik, Ankit P <ankit.p.navik@intel.com>;
> Muthukumar, Aravindan <aravindan.muthukumar@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice
> configuration based on load type
> 
> 
> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > This patch will select optimum eu/slice/sub-slice configuration based
> > on type of load (low, medium, high) as input.
> > Based on our readings and experiments we have predefined set of
> > optimum configuration for each platform(CHT, KBL).
> > i915_set_optimum_config will select optimum configuration from
> > pre-defined optimum configuration table(opt_config).
> >
> > Change-Id: I3a6a2a6bdddd01b3d3c97995f5403aef3c6fa989
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_gem_context.c | 46
> +++++++++++++++++++++++++++++++++
> >   drivers/gpu/drm/i915/i915_gem_context.h | 32 +++++++++++++++++++++++
> >   2 files changed, 78 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> > b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 2838c1d..1b76410 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -94,6 +94,32 @@
> >
> >   #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
> >
> > +static struct optimum_config
> opt_config[TIER_VERSION_MAX][LOAD_TYPE_MAX] = {
> > +	{
> > +		/* Cherry trail low */
> > +		{ 1, 1, 4},
> > +		/* Cherry trail medium */
> > +		{ 1, 1, 6},
> > +		/* Cherry trail high */
> > +		{ 1, 2, 6}
> > +	},
> > +	{
> > +		/* kbl gt2 low */
> > +		{ 2, 3, 4},
> > +		/* kbl gt2 medium */
> > +		{ 2, 3, 6},
> > +		/* kbl gt2 high */
> > +		{ 2, 3, 8}
> > +	},
> > +	{
> > +		/* kbl gt3 low */
> > +		{ 2, 3, 4},
> > +		/* kbl gt3 medium */
> > +		{ 2, 3, 6},
> > +		/* kbl gt3 high */
> > +		{ 2, 3, 8}
> > +	}
> > +};
> >   static void lut_close(struct i915_gem_context *ctx)
> >   {
> >   	struct i915_lut_handle *lut, *ln;
> > @@ -393,10 +419,30 @@ i915_gem_create_context(struct drm_i915_private
> *dev_priv,
> >   	ctx->subslice_cnt = hweight8(
> >   			INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
> >   	ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
> > +	ctx->load_type = 0;
> > +	ctx->prev_load_type = 0;
> >
> >   	return ctx;
> >   }
> >
> > +
> > +void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
> 
> Type for type should be enum.
> 
> Name the function as i915_gem_context_set_load_type(ctx, type).
> 
> > +		enum gem_tier_versions version)
> > +{
> > +	struct intel_context *ce = &ctx->__engine[RCS];
> > +	u32 *reg_state = ce->lrc_reg_state;
> > +	u32 rpcs_config = 0;
> 
> Unused locals?
> 
> > +	/* Call opt_config to get correct configuration for eu,slice,subslice */
> > +	ctx->slice_cnt = (u8)opt_config[version][type].slice;
> > +	ctx->subslice_cnt = (u8)opt_config[version][type].subslice;
> > +	ctx->eu_cnt = (u8)opt_config[version][type].eu;
> 
> You could store the correct table for the device in dev_priv and use the static
> table to assign/populate on device init time.
> 
> > +
> > +	/* Enabling this to update the rpcs */
> > +	if (ctx->prev_load_type != type)
> > +		ctx->update_render_config = 1;
> 
> ctx->update_render_config was unused in last patch. So patch ordering is
> a bit suboptimal but I don't know.
> 
> > +
> > +	ctx->prev_load_type = type;
> > +}
> >   /**
> >    * i915_gem_context_create_gvt - create a GVT GEM context
> >    * @dev: drm device *
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h
> b/drivers/gpu/drm/i915/i915_gem_context.h
> > index 52e341c..50183e6 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -53,6 +53,26 @@ struct intel_context_ops {
> >   	void (*destroy)(struct intel_context *ce);
> >   };
> >
> > +enum gem_load_type {
> > +	LOAD_TYPE_LOW,
> > +	LOAD_TYPE_MEDIUM,
> > +	LOAD_TYPE_HIGH,
> > +	LOAD_TYPE_MAX
> > +};
> > +
> > +enum gem_tier_versions {
> > +	CHERRYVIEW = 0,
> > +	KABYLAKE_GT2,
> > +	KABYLAKE_GT3,
> > +	TIER_VERSION_MAX
> > +};
> > +
> > +struct optimum_config {
> > +	int slice;
> > +	int subslice;
> > +	int eu;
> 
> u8 would do global table definitely.
> 
> > +};
> > +
> >   /**
> >    * struct i915_gem_context - client state
> >    *
> > @@ -210,6 +230,16 @@ struct i915_gem_context {
> >   	/** eu_cnt: used to set the # of eu to be enabled. */
> >   	u8 eu_cnt;
> >
> > +	/** load_type: The designated load_type (high/medium/low) for a given
> > +	 * number of pending commands in the command queue.
> > +	 */
> > +	u8 load_type;
> 
> Unused?
> 
> > +
> > +	/** prev_load_type: The earlier load type that the GPU was configured
> > +	 * for (high/medium/low).
> > +	 */
> > +	u8 prev_load_type;
> 
> Would pending_load_type sound more obvious?
> 
> Types should be enums for both.
> 
> Regards,
> 
> Tvrtko
> 
> > +
> >   	/** update_render_config: to track the updates to the render
> >   	 * configuration (S/SS/EU Configuration on the GPU)
> >   	 */
> > @@ -342,6 +372,8 @@ int i915_gem_context_setparam_ioctl(struct
> drm_device *dev, void *data,
> >   				    struct drm_file *file_priv);
> >   int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
> >   				       struct drm_file *file);
> > +void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
> > +			     enum gem_tier_versions version);
> >
> >   struct i915_gem_context *
> >   i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
> >
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
  2018-09-25  8:25       ` Tvrtko Ursulin
@ 2018-11-06  4:46         ` Navik, Ankit P
  0 siblings, 0 replies; 25+ messages in thread
From: Navik, Ankit P @ 2018-11-06  4:46 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx
  Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan

Hi Tvrtko, 

> -----Original Message-----
> From: Tvrtko Ursulin [mailto:tvrtko.ursulin@linux.intel.com]
> Sent: Tuesday, September 25, 2018 1:56 PM
> To: Navik, Ankit P <ankit.p.navik@intel.com>; intel-gfx@lists.freedesktop.org
> Cc: J Karanje, Kedar <kedar.j.karanje@intel.com>; Diwakar, Praveen
> <praveen.diwakar@intel.com>; Marathe, Yogesh
> <yogesh.marathe@intel.com>; Muthukumar, Aravindan
> <aravindan.muthukumar@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 4/4] drm/i915: Predictive governor to control
> eu/slice/subslice based on workload
> 
> 
> On 25/09/2018 07:12, Navik, Ankit P wrote:
> > Hi Tvrtko,
> >
> > Thank you for your valuable comments. We have gone through it.
> > I'll be submitting revised patch-sets after incorporating all your review
> comments.
> 
> You're welcome!
> 
> Btw one more thing - you can suspend your timer when GPU goes idle and
> restart it when active. See for instance i915_pmu_gt_parked/unparked where to
> put the hooks.

We are working on this and will send it as a new patch on top of the series.
> 
> Regards,
> 
> Tvrtko
> 
> >> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> >>> From: Praveen Diwakar <praveen.diwakar@intel.com>
> >>>
> >>> High resoluton timer is used for this purpose.
> >>>
> >>> Debugfs is provided to enable/disable/update timer configuration
> >>>
> >>> Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
> >>> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> >>> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> >>> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> >>> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> >>> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> >>> ---
> >>>    drivers/gpu/drm/i915/i915_debugfs.c | 94
> >> ++++++++++++++++++++++++++++++++++++-
> >>>    drivers/gpu/drm/i915/i915_drv.h     |  1 +
> >>>    2 files changed, 94 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> index f9ce35d..81ba509 100644
> >>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> @@ -4740,6 +4740,97 @@ static const struct drm_info_list
> >> i915_debugfs_list[] = {
> >>>    	{"i915_drrs_status", i915_drrs_status, 0},
> >>>    	{"i915_rps_boost_info", i915_rps_boost_info, 0},
> >>>    };
> >>> +
> >>> +#define POLL_PERIOD_MS	(1000 * 1000)
> >>> +#define PENDING_REQ_0	0 /* No active request pending*/
> >>> +#define PENDING_REQ_3	3 /* Threshold value of 3 active request
> >> pending*/
> >>> +			  /* Anything above this is considered as HIGH load
> >>> +			   * context
> >>> +			   */
> >>> +			  /* And less is considered as LOW load*/
> >>> +			  /* And equal is considered as mediaum load */
> >>
> >> Wonky comments and some typos up to here.
Changes done in v2.
> >>
> >>> +
> >>> +static int predictive_load_enable;
> >>> +static int predictive_load_timer_init;
> >>> +
> >>> +static enum hrtimer_restart predictive_load_cb(struct hrtimer
> >>> +*hrtimer) {
> >>> +	struct drm_i915_private *dev_priv =
> >>> +		container_of(hrtimer, typeof(*dev_priv),
> >>> +				pred_timer);
> >>> +	enum intel_engine_id id;
> >>> +	struct intel_engine_cs *engine;
> >>
> >> Some unused's.
Changes done in v2. 
> >>
> >>> +	struct i915_gem_context *ctx;
> >>> +	u64 req_pending;
> >>
> >> unsigned long, and also please try to order declaration so the right
> >> edge of text is moving in one direction only.
Changes done in v2.
> >>
> >>> +
> >>> +	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> >>> +
> >>> +		if (!ctx->name)
> >>> +			continue;
> >>
> >> What is this?
It is just a sanity check to skip invalid (unnamed) contexts.
> >>
> >>> +
> >>> +		mutex_lock(&dev_priv->pred_mutex);
> >>
> >> Here the mutex bites you since you cannot sleep in the timer callback.
> >> atomic_t would solve it. Or would a native unsigned int/long since
> >> lock to read a native word on x86 is not needed.
Changes done in v2.
> >>
> >>> +		req_pending = ctx->req_cnt;
> >>> +		mutex_unlock(&dev_priv->pred_mutex);
> >>> +
> >>> +		if (req_pending == PENDING_REQ_0)
> >>> +			continue;
> >>> +
> >>> +		if (req_pending > PENDING_REQ_3)
> >>> +			ctx->load_type = LOAD_TYPE_HIGH;
> >>> +		else if (req_pending == PENDING_REQ_3)
> >>> +			ctx->load_type = LOAD_TYPE_MEDIUM;
> >>> +		else if (req_pending < PENDING_REQ_3)
> >>
> >> Must be smaller if not greater or equal, but maybe the compiler does
> >> that for you.
Changes done in v2. 
> >>
> >>> +			ctx->load_type = LOAD_TYPE_LOW;
> >>> +
> >>> +		i915_set_optimum_config(ctx->load_type, ctx,
> >> KABYLAKE_GT3);
> >>
> >> Only KBL? Idea to put the table in dev_priv FTW! :)
Changes done in v2.
> >>
> >> ctx->load_type used only as a temporary uncovered here? :)
> >>
> >>> +	}
> >>> +
> >>> +	hrtimer_forward_now(hrtimer,
> >>> +
> >> 	ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
> >>
> >> Or HRTIMER_NORESTART if disabled? Hard to call it, details..
We read the timer value from debugfs and do not start the timer if the value is 0.
> >>
> >>> +
> >>> +	return HRTIMER_RESTART;
> >>> +}
> >>> +
> >>> +static int
> >>> +i915_predictive_load_get(void *data, u64 *val) {
> >>> +	struct drm_i915_private *dev_priv = data;
> >>> +
> >>> +	*val = predictive_load_enable;
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int
> >>> +i915_predictive_load_set(void *data, u64 val) {
> >>> +	struct drm_i915_private *dev_priv = data;
> >>> +	struct intel_device_info *info;
> >>> +
> >>> +	info = mkwrite_device_info(dev_priv);
> >>
> >> Unused, why?
Changes done in v2.
> >>
> >>> +
> >>> +	predictive_load_enable = val;
> >>> +
> >>> +	if (predictive_load_enable) {
> >>> +		if (!predictive_load_timer_init) {
> >>> +			hrtimer_init(&dev_priv->pred_timer,
> >> CLOCK_MONOTONIC,
> >>> +					HRTIMER_MODE_REL);
> >>> +			dev_priv->pred_timer.function = predictive_load_cb;
> >>> +			predictive_load_timer_init = 1;
> >>
> >> Move timer init to dev_priv setup.
Changes done in v2.
> >>
> >>> +		}
> >>> +		hrtimer_start(&dev_priv->pred_timer,
> >>> +
> >> 	ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
> >>> +			HRTIMER_MODE_REL_PINNED);
> >>
> >> Two threads can race to here.
This is done under a mutex; please let me know if there is any corner case that
could cause a race condition.
> >>
> >> Also you can give some slack to the timer since precision is not
> >> critical to you and can save you some CPU power.
Changes done in v2.
> >>
> >>> +	} else {
> >>> +		hrtimer_cancel(&dev_priv->pred_timer);
> >>> +	}
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
> >>> +			i915_predictive_load_get, i915_predictive_load_set,
> >>> +			"%llu\n");
> >>> +
> >>>    #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
> >>>
> >>>    static const struct i915_debugfs_files { @@ -4769,7 +4860,8 @@
> >>> static const struct i915_debugfs_files {
> >>>    	{"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
> >>>    	{"i915_ipc_status", &i915_ipc_status_fops},
> >>>    	{"i915_drrs_ctl", &i915_drrs_ctl_fops},
> >>> -	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
> >>> +	{"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
> >>> +	{"i915_predictive_load_ctl", &i915_predictive_load_ctl}
> >>
> >> And if the feature is to become real one day, given it's nature, the
> >> knob would probably have to go to sysfs, not debugfs.
Added a FIXME comment to migrate this to sysfs in a subsequent version, along with the other fixes.
> >>
> >>>    };
> >>>
> >>>    int i915_debugfs_register(struct drm_i915_private *dev_priv) diff
> >>> --git a/drivers/gpu/drm/i915/i915_drv.h
> >>> b/drivers/gpu/drm/i915/i915_drv.h index 137ec33..0505c47 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
> >>>    };
> >>>
> >>>    struct drm_i915_private {
> >>> +	struct hrtimer pred_timer;
> >>
> >> Surely not the first member! :)
Changes done in v2. 

Regards, Ankit
> >>
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>>    	struct drm_device drm;
> >>>
> >>>    	struct kmem_cache *objects;
> >>>
> >
> > Regards,
> > Ankit
> >

* Re: ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU.
  2018-12-11 11:12 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2018-12-11 13:02   ` Tvrtko Ursulin
  0 siblings, 0 replies; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-12-11 13:02 UTC (permalink / raw)
  To: intel-gfx, Patchwork, Navik, Ankit P


On 11/12/2018 11:12, Patchwork wrote:
> == Series Details ==
> 
> Series: Dynamic EU configuration of Slice/Subslice/EU.
> URL   : https://patchwork.freedesktop.org/series/53876/
> State : failure
> 
> == Summary ==
> 
> Applying: drm/i915: Get active pending request for given context
> Using index info to reconstruct a base tree...
> M	drivers/gpu/drm/i915/i915_gem_context.c
> M	drivers/gpu/drm/i915/i915_gem_context.h
> M	drivers/gpu/drm/i915/i915_request.c
> M	drivers/gpu/drm/i915/intel_lrc.c
> Falling back to patching base and 3-way merge...
> Auto-merging drivers/gpu/drm/i915/intel_lrc.c
> Auto-merging drivers/gpu/drm/i915/i915_request.c
> Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
> Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
> Applying: drm/i915: Update render power clock state configuration for given context
> Using index info to reconstruct a base tree...
> M	drivers/gpu/drm/i915/i915_gem_context.c
> M	drivers/gpu/drm/i915/i915_gem_context.h
> M	drivers/gpu/drm/i915/intel_lrc.c
> Falling back to patching base and 3-way merge...
> Auto-merging drivers/gpu/drm/i915/intel_lrc.c
> CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/intel_lrc.c
> Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
> Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
> error: Failed to merge in the changes.
> Patch failed at 0002 drm/i915: Update render power clock state configuration for given context
> Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".

Always rebase before sending if you are not sure since drm-tip can be 
pretty dynamic.

Regards,

Tvrtko

* ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU.
  2018-12-11 10:14 [PATCH v3 0/4] " Ankit Navik
@ 2018-12-11 11:12 ` Patchwork
  2018-12-11 13:02   ` Tvrtko Ursulin
  0 siblings, 1 reply; 25+ messages in thread
From: Patchwork @ 2018-12-11 11:12 UTC (permalink / raw)
  To: Navik, Ankit P; +Cc: intel-gfx

== Series Details ==

Series: Dynamic EU configuration of Slice/Subslice/EU.
URL   : https://patchwork.freedesktop.org/series/53876/
State : failure

== Summary ==

Applying: drm/i915: Get active pending request for given context
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/i915_gem_context.c
M	drivers/gpu/drm/i915/i915_gem_context.h
M	drivers/gpu/drm/i915/i915_request.c
M	drivers/gpu/drm/i915/intel_lrc.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_lrc.c
Auto-merging drivers/gpu/drm/i915/i915_request.c
Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
Applying: drm/i915: Update render power clock state configuration for given context
Using index info to reconstruct a base tree...
M	drivers/gpu/drm/i915/i915_gem_context.c
M	drivers/gpu/drm/i915/i915_gem_context.h
M	drivers/gpu/drm/i915/intel_lrc.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_lrc.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/intel_lrc.c
Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
error: Failed to merge in the changes.
Patch failed at 0002 drm/i915: Update render power clock state configuration for given context
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


end of thread, other threads:[~2018-12-11 13:02 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-21  9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
2018-09-21  9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
2018-09-21 12:39   ` Tvrtko Ursulin
2018-11-06  4:18     ` Navik, Ankit P
2018-09-25  8:04   ` Jani Nikula
2018-09-25  7:41     ` Kedar J. Karanje
2018-09-21  9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
2018-09-21 12:51   ` Tvrtko Ursulin
2018-11-06  4:23     ` Navik, Ankit P
2018-09-22 19:13   ` kbuild test robot
2018-09-21  9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
2018-09-21 13:05   ` Tvrtko Ursulin
2018-11-06  4:25     ` Navik, Ankit P
2018-09-22 18:09   ` kbuild test robot
2018-09-21  9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
2018-09-21 13:24   ` Tvrtko Ursulin
2018-09-25  6:12     ` Navik, Ankit P
2018-09-25  8:25       ` Tvrtko Ursulin
2018-11-06  4:46         ` Navik, Ankit P
2018-09-22 18:31   ` kbuild test robot
2018-09-21 11:16 ` ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU Patchwork
2018-09-21 12:31 ` [PATCH 0/4][RFC] " Tvrtko Ursulin
2018-09-27 14:02 ` Joonas Lahtinen
2018-12-11 10:14 [PATCH v3 0/4] " Ankit Navik
2018-12-11 11:12 ` ✗ Fi.CI.BAT: failure for " Patchwork
2018-12-11 13:02   ` Tvrtko Ursulin
