* [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU.
@ 2018-09-21 9:13 kedar.j.karanje
2018-09-21 9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
` (6 more replies)
0 siblings, 7 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21 9:13 UTC (permalink / raw)
To: intel-gfx
From: "Kedar J. Karanje" <kedar.j.karanje@intel.com>
drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
The current GPU configuration code for i915 does not allow us to change the
EU/Slice/Sub-slice configuration dynamically. It is done only once, while
the context is created.
While a particular graphics application is running, if we examine the
command requests from user space, we observe that the command density is
not consistent. This means there is scope to change the graphics
configuration dynamically even while the context is actively running. This
patch series proposes a solution: find the pending load for all active
contexts at a given time and, based on that, dynamically perform the
graphics configuration for each context.
We use a high resolution (hr) timer in the i915 kernel driver to get a
callback every few milliseconds (this timer value can be configured through
debugfs; the default is '0', indicating the timer is disabled, i.e. the
original system without any intervention). In the timer callback, we examine
the pending commands for a context in the queue; essentially, we intercept
them before they are executed by the GPU and update the context with the
required number of EUs.
Two questions: how did we arrive at the right timer value, and what is the
right number of EUs? For the former, we considered empirical data achieving
the best performance at the least power. For the latter, we roughly
categorized the number of EUs logically, based on the platform. We then
compare the number of pending commands against a particular threshold and
set the number of EUs accordingly via an update of the context. That
threshold is also based on experiments and findings. If the GPU is able to
keep up with the CPU, there are typically no pending commands and the EU
configuration remains unchanged. If there are more pending commands, we
reprogram the context with a higher number of EUs. Please note that we are
changing the number of EUs even while the context is running, by examining
the pending commands every 'x' milliseconds.
With this solution in place, on KBL-GT3 + Android we saw the following PnP
(power and performance) benefits without any performance degradation; the
power numbers mentioned here are system power.
App / KPI            | % Power Benefit (mW)
---------------------|---------------------
3D Mark (Ice storm)  |  2.30%
TRex On screen       |  2.49%
TRex Off screen      |  1.32%
Manhattan On screen  |  3.11%
Manhattan Off screen |  0.89%
AnTuTu 6.1.4         |  3.42%
Note: for KBL (GEN9) we cannot control at the sub-slice level; this was
always a constraint. We always controlled the number of EUs rather than
sub-slices/slices.
Praveen Diwakar (4):
drm/i915: Get active pending request for given context
drm/i915: Update render power clock state configuration for given
context
drm/i915: set optimum eu/slice/sub-slice configuration based on load
type
drm/i915: Predictive governor to control eu/slice/subslice based on
workload
drivers/gpu/drm/i915/i915_debugfs.c | 94 +++++++++++++++++++++++++++++-
drivers/gpu/drm/i915/i915_drv.c | 1 +
drivers/gpu/drm/i915/i915_drv.h | 6 ++
drivers/gpu/drm/i915/i915_gem_context.c | 52 +++++++++++++++++
drivers/gpu/drm/i915/i915_gem_context.h | 52 +++++++++++++++++
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 ++
drivers/gpu/drm/i915/intel_lrc.c | 47 +++++++++++++++
7 files changed, 256 insertions(+), 1 deletion(-)
--
2.7.4
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [PATCH 1/4] drm/i915: Get active pending request for given context
2018-09-21 9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
@ 2018-09-21 9:13 ` kedar.j.karanje
2018-09-21 12:39 ` Tvrtko Ursulin
2018-09-25 8:04 ` Jani Nikula
2018-09-21 9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
` (5 subsequent siblings)
6 siblings, 2 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21 9:13 UTC (permalink / raw)
To: intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
From: Praveen Diwakar <praveen.diwakar@intel.com>
This patch gives us the count of active pending requests which are yet
to be submitted to the GPU.
Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
drivers/gpu/drm/i915/i915_drv.c | 1 +
drivers/gpu/drm/i915/i915_drv.h | 5 +++++
drivers/gpu/drm/i915/i915_gem_context.c | 1 +
drivers/gpu/drm/i915/i915_gem_context.h | 6 ++++++
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
drivers/gpu/drm/i915/intel_lrc.c | 6 ++++++
6 files changed, 24 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f8cfd16..d37c46e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
mutex_init(&dev_priv->av_mutex);
mutex_init(&dev_priv->wm.wm_mutex);
mutex_init(&dev_priv->pps_mutex);
+ mutex_init(&dev_priv->pred_mutex);
i915_memcpy_init_early(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4aca534..137ec33 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1609,6 +1609,11 @@ struct drm_i915_private {
* controller on different i2c buses. */
struct mutex gmbus_mutex;
+ /** pred_mutex protects against concurrent usage of the pending
+ * request counter for multiple contexts
+ */
+ struct mutex pred_mutex;
+
/**
* Base address of the gmbus and gpio block.
*/
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index b10770c..30932d9 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
}
trace_i915_context_create(ctx);
+ ctx->req_cnt = 0;
return ctx;
}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index b116e49..243ea22 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -194,6 +194,12 @@ struct i915_gem_context {
* context close.
*/
struct list_head handles_list;
+
+ /** req_cnt: tracks the pending commands, based on which we decide to
* go for low/medium/high load configuration of the GPU; this is
+ * controlled via a mutex
+ */
+ u64 req_cnt;
};
static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3f0c612..f799ff9 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
struct drm_syncobj **fences)
{
struct i915_execbuffer eb;
+ struct drm_i915_private *dev_priv = to_i915(dev);
struct dma_fence *in_fence = NULL;
struct sync_file *out_fence = NULL;
int out_fence_fd = -1;
@@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
*/
eb.request->batch = eb.batch;
+ mutex_lock(&dev_priv->pred_mutex);
+ eb.ctx->req_cnt++;
+ mutex_unlock(&dev_priv->pred_mutex);
+
trace_i915_request_queue(eb.request, eb.batch_flags);
err = eb_submit(&eb);
err_request:
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 1744792..039fbdb 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
trace_i915_request_in(rq, port_index(port, execlists));
last = rq;
submit = true;
+
+ mutex_lock(&rq->i915->pred_mutex);
+ if (rq->gem_context->req_cnt > 0) {
+ rq->gem_context->req_cnt--;
+ }
+ mutex_unlock(&rq->i915->pred_mutex);
}
rb_erase_cached(&p->node, &execlists->queue);
--
2.7.4
* [PATCH 2/4] drm/i915: Update render power clock state configuration for given context
2018-09-21 9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
2018-09-21 9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
@ 2018-09-21 9:13 ` kedar.j.karanje
2018-09-21 12:51 ` Tvrtko Ursulin
2018-09-22 19:13 ` kbuild test robot
2018-09-21 9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
` (4 subsequent siblings)
6 siblings, 2 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21 9:13 UTC (permalink / raw)
To: intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
From: Praveen Diwakar <praveen.diwakar@intel.com>
This patch updates the render power clock state (RPCS) register at runtime,
based on the update_render_config flag, which can be set by any governor
that computes the load and wants to update the RPCS register.
Subsequent patches add a timer-based governor which computes the pending
load/requests.
Change-Id: I4e7d2f484b957d5bd496e1decc59a69e3bc6d186
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
drivers/gpu/drm/i915/i915_gem_context.c | 5 ++++
drivers/gpu/drm/i915/i915_gem_context.h | 14 +++++++++++
drivers/gpu/drm/i915/intel_lrc.c | 41 +++++++++++++++++++++++++++++++++
3 files changed, 60 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 30932d9..2838c1d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -388,6 +388,11 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
trace_i915_context_create(ctx);
ctx->req_cnt = 0;
+ ctx->update_render_config = 0;
+ ctx->slice_cnt = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
+ ctx->subslice_cnt = hweight8(
+ INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
+ ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
return ctx;
}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 243ea22..52e341c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -200,6 +200,20 @@ struct i915_gem_context {
* controlled via a mutex
*/
u64 req_cnt;
+
+ /** slice_cnt: used to set the # of slices to be enabled. */
+ u8 slice_cnt;
+
+ /** subslice_cnt: used to set the # of subslices to be enabled. */
+ u8 subslice_cnt;
+
+ /** eu_cnt: used to set the # of eu to be enabled. */
+ u8 eu_cnt;
+
+ /** update_render_config: to track the updates to the render
+ * configuration (S/SS/EU Configuration on the GPU)
+ */
+ bool update_render_config;
};
static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 039fbdb..d2d0e7d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -364,6 +364,36 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
spin_unlock_irqrestore(&engine->timeline.lock, flags);
}
+static u32
+get_context_rpcs_config(struct i915_gem_context *ctx)
+{
+ struct drm_i915_private *dev_priv = ctx->i915;
+ u32 rpcs = 0;
+
+ if (INTEL_GEN(dev_priv) < 8)
+ return 0;
+
+ if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
+ rpcs |= GEN8_RPCS_S_CNT_ENABLE;
+ rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
+ rpcs |= GEN8_RPCS_ENABLE;
+ }
+
+ if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
+ rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
+ rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
+ rpcs |= GEN8_RPCS_ENABLE;
+ }
+
+ if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) {
+ rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
+ rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
+ rpcs |= GEN8_RPCS_ENABLE;
+ }
+
+ return rpcs;
+}
+
static inline void
execlists_context_status_change(struct i915_request *rq, unsigned long status)
{
@@ -418,11 +448,22 @@ execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
static u64 execlists_update_context(struct i915_request *rq)
{
struct intel_context *ce = rq->hw_context;
+ struct i915_gem_context *ctx = rq->gem_context;
+ struct intel_engine_cs *engine = rq->engine;
struct i915_hw_ppgtt *ppgtt =
rq->gem_context->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
u32 *reg_state = ce->lrc_reg_state;
+ u32 rpcs_config = 0;
reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
+ if (ctx->pid && ctx->name && (rq->engine->id == RCS) &&
+ ctx->update_render_config) {
+ rpcs_config = get_context_rpcs_config(ctx);
+ reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
+ CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
+ rpcs_config);
+ ctx->update_render_config = 0;
+ }
/* True 32b PPGTT with dynamic page allocation: update PDP
* registers and point the unallocated PDPs to scratch page.
--
2.7.4
* [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
2018-09-21 9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
2018-09-21 9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
2018-09-21 9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
@ 2018-09-21 9:13 ` kedar.j.karanje
2018-09-21 13:05 ` Tvrtko Ursulin
2018-09-22 18:09 ` kbuild test robot
2018-09-21 9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
` (3 subsequent siblings)
6 siblings, 2 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21 9:13 UTC (permalink / raw)
To: intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
From: Praveen Diwakar <praveen.diwakar@intel.com>
This patch selects the optimum eu/slice/sub-slice configuration based on
the type of load (low, medium, high) given as input.
Based on our readings and experiments, we have a predefined set of optimum
configurations for each platform (CHT, KBL).
i915_set_optimum_config selects the optimum configuration from the
pre-defined optimum configuration table (opt_config).
Change-Id: I3a6a2a6bdddd01b3d3c97995f5403aef3c6fa989
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
drivers/gpu/drm/i915/i915_gem_context.c | 46 +++++++++++++++++++++++++++++++++
drivers/gpu/drm/i915/i915_gem_context.h | 32 +++++++++++++++++++++++
2 files changed, 78 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2838c1d..1b76410 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -94,6 +94,32 @@
#define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
+static struct optimum_config opt_config[TIER_VERSION_MAX][LOAD_TYPE_MAX] = {
+ {
+ /* Cherry trail low */
+ { 1, 1, 4},
+ /* Cherry trail medium */
+ { 1, 1, 6},
+ /* Cherry trail high */
+ { 1, 2, 6}
+ },
+ {
+ /* kbl gt2 low */
+ { 2, 3, 4},
+ /* kbl gt2 medium */
+ { 2, 3, 6},
+ /* kbl gt2 high */
+ { 2, 3, 8}
+ },
+ {
+ /* kbl gt3 low */
+ { 2, 3, 4},
+ /* kbl gt3 medium */
+ { 2, 3, 6},
+ /* kbl gt3 high */
+ { 2, 3, 8}
+ }
+};
static void lut_close(struct i915_gem_context *ctx)
{
struct i915_lut_handle *lut, *ln;
@@ -393,10 +419,30 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
ctx->subslice_cnt = hweight8(
INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
+ ctx->load_type = 0;
+ ctx->prev_load_type = 0;
return ctx;
}
+
+void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
+ enum gem_tier_versions version)
+{
+ struct intel_context *ce = &ctx->__engine[RCS];
+ u32 *reg_state = ce->lrc_reg_state;
+ u32 rpcs_config = 0;
+ /* Look up opt_config for the right eu/slice/subslice configuration */
+ ctx->slice_cnt = (u8)opt_config[version][type].slice;
+ ctx->subslice_cnt = (u8)opt_config[version][type].subslice;
+ ctx->eu_cnt = (u8)opt_config[version][type].eu;
+
+ /* Enabling this to update the rpcs */
+ if (ctx->prev_load_type != type)
+ ctx->update_render_config = 1;
+
+ ctx->prev_load_type = type;
+}
/**
* i915_gem_context_create_gvt - create a GVT GEM context
* @dev: drm device *
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 52e341c..50183e6 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -53,6 +53,26 @@ struct intel_context_ops {
void (*destroy)(struct intel_context *ce);
};
+enum gem_load_type {
+ LOAD_TYPE_LOW,
+ LOAD_TYPE_MEDIUM,
+ LOAD_TYPE_HIGH,
+ LOAD_TYPE_MAX
+};
+
+enum gem_tier_versions {
+ CHERRYVIEW = 0,
+ KABYLAKE_GT2,
+ KABYLAKE_GT3,
+ TIER_VERSION_MAX
+};
+
+struct optimum_config {
+ int slice;
+ int subslice;
+ int eu;
+};
+
/**
* struct i915_gem_context - client state
*
@@ -210,6 +230,16 @@ struct i915_gem_context {
/** eu_cnt: used to set the # of eu to be enabled. */
u8 eu_cnt;
+ /** load_type: The designated load_type (high/medium/low) for a given
+ * number of pending commands in the command queue.
+ */
+ u8 load_type;
+
+ /** prev_load_type: The earlier load type that the GPU was configured
+ * for (high/medium/low).
+ */
+ u8 prev_load_type;
+
/** update_render_config: to track the updates to the render
* configuration (S/SS/EU Configuration on the GPU)
*/
@@ -342,6 +372,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv);
int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
struct drm_file *file);
+void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
+ enum gem_tier_versions version);
struct i915_gem_context *
i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
--
2.7.4
* [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
2018-09-21 9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
` (2 preceding siblings ...)
2018-09-21 9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
@ 2018-09-21 9:13 ` kedar.j.karanje
2018-09-21 13:24 ` Tvrtko Ursulin
2018-09-22 18:31 ` kbuild test robot
2018-09-21 11:16 ` ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU Patchwork
` (2 subsequent siblings)
6 siblings, 2 replies; 25+ messages in thread
From: kedar.j.karanje @ 2018-09-21 9:13 UTC (permalink / raw)
To: intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
From: Praveen Diwakar <praveen.diwakar@intel.com>
A high resolution timer is used for this purpose.
A debugfs entry is provided to enable/disable/update the timer
configuration.
Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
drivers/gpu/drm/i915/i915_debugfs.c | 94 ++++++++++++++++++++++++++++++++++++-
drivers/gpu/drm/i915/i915_drv.h | 1 +
2 files changed, 94 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f9ce35d..81ba509 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4740,6 +4740,97 @@ static const struct drm_info_list i915_debugfs_list[] = {
{"i915_drrs_status", i915_drrs_status, 0},
{"i915_rps_boost_info", i915_rps_boost_info, 0},
};
+
+#define POLL_PERIOD_MS (1000 * 1000)
+#define PENDING_REQ_0 0 /* No active request pending */
+#define PENDING_REQ_3 3 /* Threshold value of 3 active requests pending */
+ /* Anything above this is considered as a HIGH load
+ * context
+ */
+ /* And fewer is considered as LOW load */
+ /* And equal is considered as medium load */
+
+static int predictive_load_enable;
+static int predictive_load_timer_init;
+
+static enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer)
+{
+ struct drm_i915_private *dev_priv =
+ container_of(hrtimer, typeof(*dev_priv),
+ pred_timer);
+ enum intel_engine_id id;
+ struct intel_engine_cs *engine;
+ struct i915_gem_context *ctx;
+ u64 req_pending;
+
+ list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
+
+ if (!ctx->name)
+ continue;
+
+ mutex_lock(&dev_priv->pred_mutex);
+ req_pending = ctx->req_cnt;
+ mutex_unlock(&dev_priv->pred_mutex);
+
+ if (req_pending == PENDING_REQ_0)
+ continue;
+
+ if (req_pending > PENDING_REQ_3)
+ ctx->load_type = LOAD_TYPE_HIGH;
+ else if (req_pending == PENDING_REQ_3)
+ ctx->load_type = LOAD_TYPE_MEDIUM;
+ else if (req_pending < PENDING_REQ_3)
+ ctx->load_type = LOAD_TYPE_LOW;
+
+ i915_set_optimum_config(ctx->load_type, ctx, KABYLAKE_GT3);
+ }
+
+ hrtimer_forward_now(hrtimer,
+ ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
+
+ return HRTIMER_RESTART;
+}
+
+static int
+i915_predictive_load_get(void *data, u64 *val)
+{
+ struct drm_i915_private *dev_priv = data;
+
+ *val = predictive_load_enable;
+ return 0;
+}
+
+static int
+i915_predictive_load_set(void *data, u64 val)
+{
+ struct drm_i915_private *dev_priv = data;
+ struct intel_device_info *info;
+
+ info = mkwrite_device_info(dev_priv);
+
+ predictive_load_enable = val;
+
+ if (predictive_load_enable) {
+ if (!predictive_load_timer_init) {
+ hrtimer_init(&dev_priv->pred_timer, CLOCK_MONOTONIC,
+ HRTIMER_MODE_REL);
+ dev_priv->pred_timer.function = predictive_load_cb;
+ predictive_load_timer_init = 1;
+ }
+ hrtimer_start(&dev_priv->pred_timer,
+ ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
+ HRTIMER_MODE_REL_PINNED);
+ } else {
+ hrtimer_cancel(&dev_priv->pred_timer);
+ }
+
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
+ i915_predictive_load_get, i915_predictive_load_set,
+ "%llu\n");
+
#define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
static const struct i915_debugfs_files {
@@ -4769,7 +4860,8 @@ static const struct i915_debugfs_files {
{"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
{"i915_ipc_status", &i915_ipc_status_fops},
{"i915_drrs_ctl", &i915_drrs_ctl_fops},
- {"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
+ {"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
+ {"i915_predictive_load_ctl", &i915_predictive_load_ctl}
};
int i915_debugfs_register(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 137ec33..0505c47 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
};
struct drm_i915_private {
+ struct hrtimer pred_timer;
struct drm_device drm;
struct kmem_cache *objects;
--
2.7.4
* ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU.
2018-09-21 9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
` (3 preceding siblings ...)
2018-09-21 9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
@ 2018-09-21 11:16 ` Patchwork
2018-09-21 12:31 ` [PATCH 0/4][RFC] " Tvrtko Ursulin
2018-09-27 14:02 ` Joonas Lahtinen
6 siblings, 0 replies; 25+ messages in thread
From: Patchwork @ 2018-09-21 11:16 UTC (permalink / raw)
To: kedar.j.karanje; +Cc: intel-gfx
== Series Details ==
Series: Dynamic EU configuration of Slice/Subslice/EU.
URL : https://patchwork.freedesktop.org/series/50006/
State : failure
== Summary ==
CALL scripts/checksyscalls.sh
DESCEND objtool
CHK include/generated/compile.h
CC [M] drivers/gpu/drm/i915/i915_debugfs.o
drivers/gpu/drm/i915/i915_debugfs.c: In function ‘predictive_load_cb’:
drivers/gpu/drm/i915/i915_debugfs.c:4813:26: error: unused variable ‘engine’ [-Werror=unused-variable]
struct intel_engine_cs *engine;
^~~~~~
drivers/gpu/drm/i915/i915_debugfs.c:4812:23: error: unused variable ‘id’ [-Werror=unused-variable]
enum intel_engine_id id;
^~
drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_predictive_load_get’:
drivers/gpu/drm/i915/i915_debugfs.c:4848:27: error: unused variable ‘dev_priv’ [-Werror=unused-variable]
struct drm_i915_private *dev_priv = data;
^~~~~~~~
cc1: all warnings being treated as errors
scripts/Makefile.build:305: recipe for target 'drivers/gpu/drm/i915/i915_debugfs.o' failed
make[4]: *** [drivers/gpu/drm/i915/i915_debugfs.o] Error 1
scripts/Makefile.build:546: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:546: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:546: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1062: recipe for target 'drivers' failed
make: *** [drivers] Error 2
* Re: [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU.
2018-09-21 9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
` (4 preceding siblings ...)
2018-09-21 11:16 ` ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU Patchwork
@ 2018-09-21 12:31 ` Tvrtko Ursulin
2018-09-27 14:02 ` Joonas Lahtinen
6 siblings, 0 replies; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 12:31 UTC (permalink / raw)
To: kedar.j.karanje, intel-gfx
Hi,
On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: "Kedar J. Karanje" <kedar.j.karanje@intel.com>
>
> drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
>
> The current GPU configuration code for i915 does not allow us to change the
> EU/Slice/Sub-slice configuration dynamically. It is done only once, while
> the context is created.
>
> While a particular graphics application is running, if we examine the
> command requests from user space, we observe that the command density is
> not consistent. This means there is scope to change the graphics
> configuration dynamically even while the context is actively running. This
> patch series proposes a solution: find the pending load for all active
> contexts at a given time and, based on that, dynamically perform the
> graphics configuration for each context.
>
> We use a high resolution (hr) timer in the i915 kernel driver to get a
> callback every few milliseconds (this timer value can be configured through
> debugfs; the default is '0', indicating the timer is disabled, i.e. the
> original system without any intervention). In the timer callback, we
> examine the pending commands for a context in the queue; essentially, we
> intercept them before they are executed by the GPU and update the context
> with the required number of EUs.
>
> Two questions: how did we arrive at the right timer value, and what is the
> right number of EUs? For the former, we considered empirical data achieving
> the best performance at the least power. For the latter, we roughly
> categorized the number of EUs logically, based on the platform. We then
> compare the number of pending commands against a particular threshold and
> set the number of EUs accordingly via an update of the context. That
> threshold is also based on experiments and findings. If the GPU is able to
> keep up with the CPU, there are typically no pending commands and the EU
> configuration remains unchanged. If there are more pending commands, we
> reprogram the context with a higher number of EUs. Please note that we are
> changing the number of EUs even while the context is running, by examining
> the pending commands every 'x' milliseconds.
>
> With this solution in place, on KBL-GT3 + Android we saw the following PnP
> (power and performance) benefits without any performance degradation; the
> power numbers mentioned here are system power.
>
> App / KPI            | % Power Benefit (mW)
> ---------------------|---------------------
> 3D Mark (Ice storm)  |  2.30%
> TRex On screen       |  2.49%
> TRex Off screen      |  1.32%
> Manhattan On screen  |  3.11%
> Manhattan Off screen |  0.89%
> AnTuTu 6.1.4         |  3.42%
Good numbers! It is hard to argue against them. :) Even though obviously
the heuristics like the one you implemented can be easily fooled by
different workloads... Since you make no distinction between different
engines and EUs vs fixed functions, external fences vs GPU
over-subscription and similar.
But then again you have a control knob and it is off by default. So if
it genuinely always only helps typical use cases, and people can stomach
a control knob, perhaps we can have it.
In any case I think you would need to test against a lot more benchmarks
to pass the threshold of whether upstream could consider this. And
definitely not only under Android. It is just that these days I don't
know with whom to put you in touch in order to recommend a list of
benchmarks, or even provide some automated system to run them. Anybody?
Regards,
Tvrtko
>
> Note: for KBL (GEN9) we cannot control at the sub-slice level; this was
> always a constraint. We always controlled the number of EUs rather than
> sub-slices/slices.
>
>
> Praveen Diwakar (4):
> drm/i915: Get active pending request for given context
> drm/i915: Update render power clock state configuration for given
> context
> drm/i915: set optimum eu/slice/sub-slice configuration based on load
> type
> drm/i915: Predictive governor to control eu/slice/subslice based on
> workload
>
> drivers/gpu/drm/i915/i915_debugfs.c | 94 +++++++++++++++++++++++++++++-
> drivers/gpu/drm/i915/i915_drv.c | 1 +
> drivers/gpu/drm/i915/i915_drv.h | 6 ++
> drivers/gpu/drm/i915/i915_gem_context.c | 52 +++++++++++++++++
> drivers/gpu/drm/i915/i915_gem_context.h | 52 +++++++++++++++++
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 ++
> drivers/gpu/drm/i915/intel_lrc.c | 47 +++++++++++++++
> 7 files changed, 256 insertions(+), 1 deletion(-)
>
> --
> 2.7.4
>
* Re: [PATCH 1/4] drm/i915: Get active pending request for given context
2018-09-21 9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
@ 2018-09-21 12:39 ` Tvrtko Ursulin
2018-11-06 4:18 ` Navik, Ankit P
2018-09-25 8:04 ` Jani Nikula
1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 12:39 UTC (permalink / raw)
To: kedar.j.karanje, intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
>
> This patch gives us the count of active pending requests which are yet
> to be submitted to the GPU.
>
> Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
> drivers/gpu/drm/i915/i915_drv.c | 1 +
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem_context.c | 1 +
> drivers/gpu/drm/i915/i915_gem_context.h | 6 ++++++
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
> drivers/gpu/drm/i915/intel_lrc.c | 6 ++++++
> 6 files changed, 24 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index f8cfd16..d37c46e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
> mutex_init(&dev_priv->av_mutex);
> mutex_init(&dev_priv->wm.wm_mutex);
> mutex_init(&dev_priv->pps_mutex);
> + mutex_init(&dev_priv->pred_mutex);
>
> i915_memcpy_init_early(dev_priv);
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4aca534..137ec33 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1609,6 +1609,11 @@ struct drm_i915_private {
> * controller on different i2c buses. */
> struct mutex gmbus_mutex;
>
> + /** pred_mutex protects against concurrent usage of the pending
> + * request counter for multiple contexts
> + */
> + struct mutex pred_mutex;
> +
> /**
> * Base address of the gmbus and gpio block.
> */
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index b10770c..30932d9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
> }
>
> trace_i915_context_create(ctx);
> + ctx->req_cnt = 0;
>
> return ctx;
> }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index b116e49..243ea22 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -194,6 +194,12 @@ struct i915_gem_context {
> * context close.
> */
> struct list_head handles_list;
> +
> + /** req_cnt: tracks the pending commands, based on which we decide
> + * to go for the low/medium/high load configuration of the GPU;
> + * access is serialized via a mutex
> + */
> + u64 req_cnt;
64-bit is too wide and mutex causes you problems later in the series.
I'd suggest atomic_t.
> };
>
> static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 3f0c612..f799ff9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> struct drm_syncobj **fences)
> {
> struct i915_execbuffer eb;
> + struct drm_i915_private *dev_priv = to_i915(dev);
> struct dma_fence *in_fence = NULL;
> struct sync_file *out_fence = NULL;
> int out_fence_fd = -1;
> @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> */
> eb.request->batch = eb.batch;
>
> + mutex_lock(&dev_priv->pred_mutex);
> + eb.ctx->req_cnt++;
> + mutex_unlock(&dev_priv->pred_mutex);
> +
> trace_i915_request_queue(eb.request, eb.batch_flags);
> err = eb_submit(&eb);
> err_request:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 1744792..039fbdb 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> trace_i915_request_in(rq, port_index(port, execlists));
> last = rq;
> submit = true;
> +
> + mutex_lock(&rq->i915->pred_mutex);
> + if (rq->gem_context->req_cnt > 0) {
Presumably you hit underflow here and then added the protection. This
was a hint the accounting does not work as expected. Preemption for
instance would wreak havoc with it. :)
Have a look at my per engine queued/runnable counters for how to do it
correctly, only that you would need to make it per context and merge
into one aggregated queued+runnable instead of two separate counters, if
you wanted to preserve your heuristics.
https://patchwork.freedesktop.org/patch/227981/
https://patchwork.freedesktop.org/patch/227976/
Regards,
Tvrtko
> + rq->gem_context->req_cnt--;
> + }
> + mutex_unlock(&rq->i915->pred_mutex);
> }
>
> rb_erase_cached(&p->node, &execlists->queue);
>
* Re: [PATCH 2/4] drm/i915: Update render power clock state configuration for given context
2018-09-21 9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
@ 2018-09-21 12:51 ` Tvrtko Ursulin
2018-11-06 4:23 ` Navik, Ankit P
2018-09-22 19:13 ` kbuild test robot
1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 12:51 UTC (permalink / raw)
To: kedar.j.karanje, intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
>
> This patch updates the render power clock state register at runtime, based
> on the update_render_config flag, which can be set by any governor that
> computes load and wants to update the RPCS register.
> Subsequent patches add a timer-based governor which computes the pending
> load/requests.
>
> Change-Id: I4e7d2f484b957d5bd496e1decc59a69e3bc6d186
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem_context.c | 5 ++++
> drivers/gpu/drm/i915/i915_gem_context.h | 14 +++++++++++
> drivers/gpu/drm/i915/intel_lrc.c | 41 +++++++++++++++++++++++++++++++++
> 3 files changed, 60 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 30932d9..2838c1d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -388,6 +388,11 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>
> trace_i915_context_create(ctx);
> ctx->req_cnt = 0;
> + ctx->update_render_config = 0;
> + ctx->slice_cnt = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
> + ctx->subslice_cnt = hweight8(
> + INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
> + ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
>
> return ctx;
> }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index 243ea22..52e341c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -200,6 +200,20 @@ struct i915_gem_context {
> * controlled via a mutex
> */
> u64 req_cnt;
> +
> + /** slice_cnt: used to set the # of slices to be enabled. */
> + u8 slice_cnt;
> +
> + /** subslice_cnt: used to set the # of subslices to be enabled. */
> + u8 subslice_cnt;
> +
> + /** eu_cnt: used to set the # of eu to be enabled. */
> + u8 eu_cnt;
> +
> + /** update_render_config: to track the updates to the render
> + * configuration (S/SS/EU Configuration on the GPU)
> + */
> + bool update_render_config;
> };
>
> static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 039fbdb..d2d0e7d 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -364,6 +364,36 @@ execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists)
> spin_unlock_irqrestore(&engine->timeline.lock, flags);
> }
>
> +static u32
> +get_context_rpcs_config(struct i915_gem_context *ctx)
> +{
> + struct drm_i915_private *dev_priv = ctx->i915;
> + u32 rpcs = 0;
> +
> + if (INTEL_GEN(dev_priv) < 8)
> + return 0;
> +
> + if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
> + rpcs |= GEN8_RPCS_S_CNT_ENABLE;
> + rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
> + rpcs |= GEN8_RPCS_ENABLE;
> + }
> +
> + if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
> + rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
> + rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
> + rpcs |= GEN8_RPCS_ENABLE;
> + }
> +
> + if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) {
> + rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
> + rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
> + rpcs |= GEN8_RPCS_ENABLE;
> + }
> +
> + return rpcs;
> +}
This function is very similar to make_rpcs so I'd suggest to extract
some commonality and share the code.
In any case you are likely to get overtaken by
https://patchwork.freedesktop.org/series/48194/ after which you will be
able to use struct intel_sseu to package the data you need in one ctx
member, etc.
> +
> static inline void
> execlists_context_status_change(struct i915_request *rq, unsigned long status)
> {
> @@ -418,11 +448,22 @@ execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
> static u64 execlists_update_context(struct i915_request *rq)
> {
> struct intel_context *ce = rq->hw_context;
> + struct i915_gem_context *ctx = rq->gem_context;
> + struct intel_engine_cs *engine = rq->engine;
> struct i915_hw_ppgtt *ppgtt =
> rq->gem_context->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
> u32 *reg_state = ce->lrc_reg_state;
> + u32 rpcs_config = 0;
>
> reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
> + if (ctx->pid && ctx->name && (rq->engine->id == RCS) &&
> + ctx->update_render_config) {
> + rpcs_config = get_context_rpcs_config(ctx);
> + reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> + CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
> + rpcs_config);
For a first submission this works, but when appending to a running
context the RPCS value you write here will probably get overwritten by
context save. At least I would be surprised if lite-restore would write
to the register.
So if I am correct the problem you have is that your configuration
changes will not always become live when you expect it to. And then if
you hit lite-restore you don't try to re-configure it later unless there
is a config change.
But I don't think there is a cheap/reasonable way to fix this fully. So
maybe you just give up on this chunk and rely on context image to be
updated on context pin like in
https://patchwork.freedesktop.org/patch/250007/.
Regards,
Tvrtko
> + ctx->update_render_config = 0;
> + }
>
> /* True 32b PPGTT with dynamic page allocation: update PDP
> * registers and point the unallocated PDPs to scratch page.
>
* Re: [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
2018-09-21 9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
@ 2018-09-21 13:05 ` Tvrtko Ursulin
2018-11-06 4:25 ` Navik, Ankit P
2018-09-22 18:09 ` kbuild test robot
1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 13:05 UTC (permalink / raw)
To: kedar.j.karanje, intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
>
> This patch selects the optimum EU/slice/sub-slice configuration based on the
> type of load (low, medium, high) given as input.
> Based on our readings and experiments we have a predefined set of optimum
> configurations for each platform (CHT, KBL).
> i915_set_optimum_config selects the optimum configuration from the
> pre-defined configuration table (opt_config).
>
> Change-Id: I3a6a2a6bdddd01b3d3c97995f5403aef3c6fa989
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
> drivers/gpu/drm/i915/i915_gem_context.c | 46 +++++++++++++++++++++++++++++++++
> drivers/gpu/drm/i915/i915_gem_context.h | 32 +++++++++++++++++++++++
> 2 files changed, 78 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 2838c1d..1b76410 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -94,6 +94,32 @@
>
> #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
>
> +static struct optimum_config opt_config[TIER_VERSION_MAX][LOAD_TYPE_MAX] = {
> + {
> + /* Cherry trail low */
> + { 1, 1, 4},
> + /* Cherry trail medium */
> + { 1, 1, 6},
> + /* Cherry trail high */
> + { 1, 2, 6}
> + },
> + {
> + /* kbl gt2 low */
> + { 2, 3, 4},
> + /* kbl gt2 medium */
> + { 2, 3, 6},
> + /* kbl gt2 high */
> + { 2, 3, 8}
> + },
> + {
> + /* kbl gt3 low */
> + { 2, 3, 4},
> + /* kbl gt3 medium */
> + { 2, 3, 6},
> + /* kbl gt3 high */
> + { 2, 3, 8}
> + }
> +};
> static void lut_close(struct i915_gem_context *ctx)
> {
> struct i915_lut_handle *lut, *ln;
> @@ -393,10 +419,30 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
> ctx->subslice_cnt = hweight8(
> INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
> ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
> + ctx->load_type = 0;
> + ctx->prev_load_type = 0;
>
> return ctx;
> }
>
> +
> +void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
Type for type should be enum.
Name the function as i915_gem_context_set_load_type(ctx, type).
> + enum gem_tier_versions version)
> +{
> + struct intel_context *ce = &ctx->__engine[RCS];
> + u32 *reg_state = ce->lrc_reg_state;
> + u32 rpcs_config = 0;
Unused locals?
> + /* Call opt_config to get correct configuration for eu,slice,subslice */
> + ctx->slice_cnt = (u8)opt_config[version][type].slice;
> + ctx->subslice_cnt = (u8)opt_config[version][type].subslice;
> + ctx->eu_cnt = (u8)opt_config[version][type].eu;
You could store the correct table for the device in dev_priv and use the
static table to assign/populate on device init time.
> +
> + /* Enabling this to update the rpcs */
> + if (ctx->prev_load_type != type)
> + ctx->update_render_config = 1;
ctx->update_render_config was unused in last patch. So patch ordering is
a bit suboptimal but I don't know.
> +
> + ctx->prev_load_type = type;
> +}
> /**
> * i915_gem_context_create_gvt - create a GVT GEM context
> * @dev: drm device *
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index 52e341c..50183e6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -53,6 +53,26 @@ struct intel_context_ops {
> void (*destroy)(struct intel_context *ce);
> };
>
> +enum gem_load_type {
> + LOAD_TYPE_LOW,
> + LOAD_TYPE_MEDIUM,
> + LOAD_TYPE_HIGH,
> + LOAD_TYPE_MAX
> +};
> +
> +enum gem_tier_versions {
> + CHERRYVIEW = 0,
> + KABYLAKE_GT2,
> + KABYLAKE_GT3,
> + TIER_VERSION_MAX
> +};
> +
> +struct optimum_config {
> + int slice;
> + int subslice;
> + int eu;
u8 would do global table definitely.
> +};
> +
> /**
> * struct i915_gem_context - client state
> *
> @@ -210,6 +230,16 @@ struct i915_gem_context {
> /** eu_cnt: used to set the # of eu to be enabled. */
> u8 eu_cnt;
>
> + /** load_type: The designated load_type (high/medium/low) for a given
> + * number of pending commands in the command queue.
> + */
> + u8 load_type;
Unused?
> +
> + /** prev_load_type: The earlier load type that the GPU was configured
> + * for (high/medium/low).
> + */
> + u8 prev_load_type;
Would pending_load_type sound more obvious?
Types should be enums for both.
Regards,
Tvrtko
> +
> /** update_render_config: to track the updates to the render
> * configuration (S/SS/EU Configuration on the GPU)
> */
> @@ -342,6 +372,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file_priv);
> int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file);
> +void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
> + enum gem_tier_versions version);
>
> struct i915_gem_context *
> i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
>
* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
2018-09-21 9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
@ 2018-09-21 13:24 ` Tvrtko Ursulin
2018-09-25 6:12 ` Navik, Ankit P
2018-09-22 18:31 ` kbuild test robot
1 sibling, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-21 13:24 UTC (permalink / raw)
To: kedar.j.karanje, intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
>
> A high resolution timer is used for this purpose.
>
> Debugfs is provided to enable/disable/update timer configuration
>
> Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
> drivers/gpu/drm/i915/i915_debugfs.c | 94 ++++++++++++++++++++++++++++++++++++-
> drivers/gpu/drm/i915/i915_drv.h | 1 +
> 2 files changed, 94 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index f9ce35d..81ba509 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4740,6 +4740,97 @@ static const struct drm_info_list i915_debugfs_list[] = {
> {"i915_drrs_status", i915_drrs_status, 0},
> {"i915_rps_boost_info", i915_rps_boost_info, 0},
> };
> +
> +#define POLL_PERIOD_MS (1000 * 1000)
> +#define PENDING_REQ_0 0 /* No active request pending*/
> +#define PENDING_REQ_3 3 /* Threshold value of 3 active request pending*/
> + /* Anything above this is considered as HIGH load
> + * context
> + */
> + /* And less is considered as LOW load*/
> + /* And equal is considered as mediaum load */
Wonky comments and some typos up to here.
> +
> +static int predictive_load_enable;
> +static int predictive_load_timer_init;
> +
> +static enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer)
> +{
> + struct drm_i915_private *dev_priv =
> + container_of(hrtimer, typeof(*dev_priv),
> + pred_timer);
> + enum intel_engine_id id;
> + struct intel_engine_cs *engine;
Some unused's.
> + struct i915_gem_context *ctx;
> + u64 req_pending;
unsigned long, and also please try to order declarations so the right
edge of the text moves in one direction only.
> +
> + list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> +
> + if (!ctx->name)
> + continue;
What is this?
> +
> + mutex_lock(&dev_priv->pred_mutex);
Here the mutex bites you since you cannot sleep in the timer callback.
atomic_t would solve it. Or a native unsigned int/long would, since a
lock is not needed to read a native word on x86.
> + req_pending = ctx->req_cnt;
> + mutex_unlock(&dev_priv->pred_mutex);
> +
> + if (req_pending == PENDING_REQ_0)
> + continue;
> +
> + if (req_pending > PENDING_REQ_3)
> + ctx->load_type = LOAD_TYPE_HIGH;
> + else if (req_pending == PENDING_REQ_3)
> + ctx->load_type = LOAD_TYPE_MEDIUM;
> + else if (req_pending < PENDING_REQ_3)
Must be smaller if not greater or equal, but maybe the compiler does
that for you.
> + ctx->load_type = LOAD_TYPE_LOW;
> +
> + i915_set_optimum_config(ctx->load_type, ctx, KABYLAKE_GT3);
Only KBL? Idea to put the table in dev_priv FTW! :)
ctx->load_type used only as a temporary uncovered here? :)
> + }
> +
> + hrtimer_forward_now(hrtimer,
> + ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
Or HRTIMER_NORESTART if disabled? Hard to call it, details..
> +
> + return HRTIMER_RESTART;
> +}
> +
> +static int
> +i915_predictive_load_get(void *data, u64 *val)
> +{
> + struct drm_i915_private *dev_priv = data;
> +
> + *val = predictive_load_enable;
> + return 0;
> +}
> +
> +static int
> +i915_predictive_load_set(void *data, u64 val)
> +{
> + struct drm_i915_private *dev_priv = data;
> + struct intel_device_info *info;
> +
> + info = mkwrite_device_info(dev_priv);
Unused, why?
> +
> + predictive_load_enable = val;
> +
> + if (predictive_load_enable) {
> + if (!predictive_load_timer_init) {
> + hrtimer_init(&dev_priv->pred_timer, CLOCK_MONOTONIC,
> + HRTIMER_MODE_REL);
> + dev_priv->pred_timer.function = predictive_load_cb;
> + predictive_load_timer_init = 1;
Move timer init to dev_priv setup.
> + }
> + hrtimer_start(&dev_priv->pred_timer,
> + ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
> + HRTIMER_MODE_REL_PINNED);
Two threads can race to here.
Also you can give some slack to the timer, since precision is not
critical to you, and save some CPU power.
> + } else {
> + hrtimer_cancel(&dev_priv->pred_timer);
> + }
> +
> + return 0;
> +}
> +
> +DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
> + i915_predictive_load_get, i915_predictive_load_set,
> + "%llu\n");
> +
> #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
>
> static const struct i915_debugfs_files {
> @@ -4769,7 +4860,8 @@ static const struct i915_debugfs_files {
> {"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
> {"i915_ipc_status", &i915_ipc_status_fops},
> {"i915_drrs_ctl", &i915_drrs_ctl_fops},
> - {"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
> + {"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
> + {"i915_predictive_load_ctl", &i915_predictive_load_ctl}
And if the feature is to become real one day, given its nature, the
knob would probably have to go to sysfs, not debugfs.
> };
>
> int i915_debugfs_register(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 137ec33..0505c47 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
> };
>
> struct drm_i915_private {
> + struct hrtimer pred_timer;
Surely not the first member! :)
Regards,
Tvrtko
> struct drm_device drm;
>
> struct kmem_cache *objects;
>
* Re: [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
2018-09-21 9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
2018-09-21 13:05 ` Tvrtko Ursulin
@ 2018-09-22 18:09 ` kbuild test robot
1 sibling, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2018-09-22 18:09 UTC (permalink / raw)
To: kedar.j.karanje
Cc: Aravindan Muthukumar, intel-gfx, Yogesh Marathe, kbuild-all,
Ankit Navik, Praveen Diwakar
[-- Attachment #1: Type: text/plain, Size: 3375 bytes --]
Hi Praveen,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.19-rc4 next-20180921]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/kedar-j-karanje-intel-com/drm-i915-Get-active-pending-request-for-given-context/20180923-012250
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x014-201838 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
drivers/gpu/drm/i915/i915_gem_context.c: In function 'i915_set_optimum_config':
>> drivers/gpu/drm/i915/i915_gem_context.c:434:6: error: unused variable 'rpcs_config' [-Werror=unused-variable]
u32 rpcs_config = 0;
^~~~~~~~~~~
>> drivers/gpu/drm/i915/i915_gem_context.c:433:7: error: unused variable 'reg_state' [-Werror=unused-variable]
u32 *reg_state = ce->lrc_reg_state;
^~~~~~~~~
cc1: all warnings being treated as errors
vim +/rpcs_config +434 drivers/gpu/drm/i915/i915_gem_context.c
427
428
429 void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
430 enum gem_tier_versions version)
431 {
432 struct intel_context *ce = &ctx->__engine[RCS];
> 433 u32 *reg_state = ce->lrc_reg_state;
> 434 u32 rpcs_config = 0;
435 /* Call opt_config to get correct configuration for eu,slice,subslice */
436 ctx->slice_cnt = (u8)opt_config[version][type].slice;
437 ctx->subslice_cnt = (u8)opt_config[version][type].subslice;
438 ctx->eu_cnt = (u8)opt_config[version][type].eu;
439
440 /* Enabling this to update the rpcs */
441 if (ctx->prev_load_type != type)
442 ctx->update_render_config = 1;
443
444 ctx->prev_load_type = type;
445 }
446 /**
447 * i915_gem_context_create_gvt - create a GVT GEM context
448 * @dev: drm device *
449 *
450 * This function is used to create a GVT specific GEM context.
451 *
452 * Returns:
453 * pointer to i915_gem_context on success, error pointer if failed
454 *
455 */
456 struct i915_gem_context *
457 i915_gem_context_create_gvt(struct drm_device *dev)
458 {
459 struct i915_gem_context *ctx;
460 int ret;
461
462 if (!IS_ENABLED(CONFIG_DRM_I915_GVT))
463 return ERR_PTR(-ENODEV);
464
465 ret = i915_mutex_lock_interruptible(dev);
466 if (ret)
467 return ERR_PTR(ret);
468
469 ctx = __create_hw_context(to_i915(dev), NULL);
470 if (IS_ERR(ctx))
471 goto out;
472
473 ctx->file_priv = ERR_PTR(-EBADF);
474 i915_gem_context_set_closed(ctx); /* not user accessible */
475 i915_gem_context_clear_bannable(ctx);
476 i915_gem_context_set_force_single_submission(ctx);
477 if (!USES_GUC_SUBMISSION(to_i915(dev)))
478 ctx->ring_size = 512 * PAGE_SIZE; /* Max ring buffer size */
479
480 GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
481 out:
482 mutex_unlock(&dev->struct_mutex);
483 return ctx;
484 }
485
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35992 bytes --]
* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
2018-09-21 9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
2018-09-21 13:24 ` Tvrtko Ursulin
@ 2018-09-22 18:31 ` kbuild test robot
1 sibling, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2018-09-22 18:31 UTC (permalink / raw)
To: kedar.j.karanje
Cc: Aravindan Muthukumar, intel-gfx, Yogesh Marathe, kbuild-all,
Ankit Navik, Praveen Diwakar
[-- Attachment #1: Type: text/plain, Size: 3087 bytes --]
Hi Praveen,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.19-rc4 next-20180921]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/kedar-j-karanje-intel-com/drm-i915-Get-active-pending-request-for-given-context/20180923-012250
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: i386-randconfig-x014-201838 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
drivers/gpu//drm/i915/i915_debugfs.c: In function 'predictive_load_cb':
>> drivers/gpu//drm/i915/i915_debugfs.c:4762:26: error: unused variable 'engine' [-Werror=unused-variable]
struct intel_engine_cs *engine;
^~~~~~
>> drivers/gpu//drm/i915/i915_debugfs.c:4761:23: error: unused variable 'id' [-Werror=unused-variable]
enum intel_engine_id id;
^~
drivers/gpu//drm/i915/i915_debugfs.c: In function 'i915_predictive_load_get':
>> drivers/gpu//drm/i915/i915_debugfs.c:4797:27: error: unused variable 'dev_priv' [-Werror=unused-variable]
struct drm_i915_private *dev_priv = data;
^~~~~~~~
cc1: all warnings being treated as errors
vim +/dev_priv +4797 drivers/gpu//drm/i915/i915_debugfs.c
4755
4756 static enum hrtimer_restart predictive_load_cb(struct hrtimer *hrtimer)
4757 {
4758 struct drm_i915_private *dev_priv =
4759 container_of(hrtimer, typeof(*dev_priv),
4760 pred_timer);
> 4761 enum intel_engine_id id;
> 4762 struct intel_engine_cs *engine;
4763 struct i915_gem_context *ctx;
4764 u64 req_pending;
4765
4766 list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
4767
4768 if (!ctx->name)
4769 continue;
4770
4771 mutex_lock(&dev_priv->pred_mutex);
4772 req_pending = ctx->req_cnt;
4773 mutex_unlock(&dev_priv->pred_mutex);
4774
4775 if (req_pending == PENDING_REQ_0)
4776 continue;
4777
4778 if (req_pending > PENDING_REQ_3)
4779 ctx->load_type = LOAD_TYPE_HIGH;
4780 else if (req_pending == PENDING_REQ_3)
4781 ctx->load_type = LOAD_TYPE_MEDIUM;
4782 else if (req_pending < PENDING_REQ_3)
4783 ctx->load_type = LOAD_TYPE_LOW;
4784
4785 i915_set_optimum_config(ctx->load_type, ctx, KABYLAKE_GT3);
4786 }
4787
4788 hrtimer_forward_now(hrtimer,
4789 ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
4790
4791 return HRTIMER_RESTART;
4792 }
4793
4794 static int
4795 i915_predictive_load_get(void *data, u64 *val)
4796 {
> 4797 struct drm_i915_private *dev_priv = data;
4798
4799 *val = predictive_load_enable;
4800 return 0;
4801 }
4802
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 35992 bytes --]
* Re: [PATCH 2/4] drm/i915: Update render power clock state configuration for given context
2018-09-21 9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
2018-09-21 12:51 ` Tvrtko Ursulin
@ 2018-09-22 19:13 ` kbuild test robot
1 sibling, 0 replies; 25+ messages in thread
From: kbuild test robot @ 2018-09-22 19:13 UTC (permalink / raw)
To: kedar.j.karanje
Cc: Aravindan Muthukumar, intel-gfx, Yogesh Marathe, kbuild-all,
Ankit Navik, Praveen Diwakar
Hi Praveen,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on v4.19-rc4 next-20180921]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/kedar-j-karanje-intel-com/drm-i915-Get-active-pending-request-for-given-context/20180923-012250
base: git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x014-201838 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
drivers/gpu//drm/i915/intel_lrc.c: In function 'execlists_update_context':
>> drivers/gpu//drm/i915/intel_lrc.c:452:26: error: unused variable 'engine' [-Werror=unused-variable]
struct intel_engine_cs *engine = rq->engine;
^~~~~~
vim +/engine +452 drivers/gpu//drm/i915/intel_lrc.c
447
448 static u64 execlists_update_context(struct i915_request *rq)
449 {
450 struct intel_context *ce = rq->hw_context;
451 struct i915_gem_context *ctx = rq->gem_context;
> 452 struct intel_engine_cs *engine = rq->engine;
453 struct i915_hw_ppgtt *ppgtt =
454 rq->gem_context->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
455 u32 *reg_state = ce->lrc_reg_state;
456 u32 rpcs_config = 0;
457
458 reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
459 if (ctx->pid && ctx->name && (rq->engine->id == RCS) &&
460 ctx->update_render_config) {
461 rpcs_config = get_context_rpcs_config(ctx);
462 reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
463 CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
464 rpcs_config);
465 ctx->update_render_config = 0;
466 }
467
468 /* True 32b PPGTT with dynamic page allocation: update PDP
469 * registers and point the unallocated PDPs to scratch page.
470 * PML4 is allocated during ppgtt init, so this is not needed
471 * in 48-bit mode.
472 */
473 if (ppgtt && !i915_vm_is_48bit(&ppgtt->vm))
474 execlists_update_context_pdps(ppgtt, reg_state);
475
476 return ce->lrc_desc;
477 }
478
* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
2018-09-21 13:24 ` Tvrtko Ursulin
@ 2018-09-25 6:12 ` Navik, Ankit P
2018-09-25 8:25 ` Tvrtko Ursulin
0 siblings, 1 reply; 25+ messages in thread
From: Navik, Ankit P @ 2018-09-25 6:12 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan
Hi Tvrtko,
Thank you for your valuable comments; we have gone through them.
I will submit revised patch sets after incorporating all your review comments.
> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > High resoluton timer is used for this purpose.
> >
> > Debugfs is provided to enable/disable/update timer configuration
> >
> > Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com
> > ---
> > drivers/gpu/drm/i915/i915_debugfs.c | 94
> ++++++++++++++++++++++++++++++++++++-
> > drivers/gpu/drm/i915/i915_drv.h | 1 +
> > 2 files changed, 94 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > b/drivers/gpu/drm/i915/i915_debugfs.c
> > index f9ce35d..81ba509 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -4740,6 +4740,97 @@ static const struct drm_info_list
> i915_debugfs_list[] = {
> > {"i915_drrs_status", i915_drrs_status, 0},
> > {"i915_rps_boost_info", i915_rps_boost_info, 0},
> > };
> > +
> > +#define POLL_PERIOD_MS (1000 * 1000)
> > +#define PENDING_REQ_0 0 /* No active request pending*/
> > +#define PENDING_REQ_3 3 /* Threshold value of 3 active request
> pending*/
> > + /* Anything above this is considered as HIGH load
> > + * context
> > + */
> > + /* And less is considered as LOW load*/
> > + /* And equal is considered as mediaum load */
>
> Wonky comments and some typos up to here.
>
> > +
> > +static int predictive_load_enable;
> > +static int predictive_load_timer_init;
> > +
> > +static enum hrtimer_restart predictive_load_cb(struct hrtimer
> > +*hrtimer) {
> > + struct drm_i915_private *dev_priv =
> > + container_of(hrtimer, typeof(*dev_priv),
> > + pred_timer);
> > + enum intel_engine_id id;
> > + struct intel_engine_cs *engine;
>
> Some unused's.
>
> > + struct i915_gem_context *ctx;
> > + u64 req_pending;
>
> unsigned long, and also please try to order declaration so the right edge of text
> is moving in one direction only.
>
> > +
> > + list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> > +
> > + if (!ctx->name)
> > + continue;
>
> What is this?
>
> > +
> > + mutex_lock(&dev_priv->pred_mutex);
>
> Here the mutex bites you since you cannot sleep in the timer callback.
> atomic_t would solve it. Or would a native unsigned int/long since lock to read a
> native word on x86 is not needed.
>
> > + req_pending = ctx->req_cnt;
> > + mutex_unlock(&dev_priv->pred_mutex);
> > +
> > + if (req_pending == PENDING_REQ_0)
> > + continue;
> > +
> > + if (req_pending > PENDING_REQ_3)
> > + ctx->load_type = LOAD_TYPE_HIGH;
> > + else if (req_pending == PENDING_REQ_3)
> > + ctx->load_type = LOAD_TYPE_MEDIUM;
> > + else if (req_pending < PENDING_REQ_3)
>
> Must be smaller if not greater or equal, but maybe the compiler does that for
> you.
>
> > + ctx->load_type = LOAD_TYPE_LOW;
> > +
> > + i915_set_optimum_config(ctx->load_type, ctx,
> KABYLAKE_GT3);
>
> Only KBL? Idea to put the table in dev_priv FTW! :)
>
> ctx->load_type used only as a temporary uncovered here? :)
>
> > + }
> > +
> > + hrtimer_forward_now(hrtimer,
> > +
> ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
>
> Or HRTIMER_NORESTART if disabled? Hard to call it, details..
>
> > +
> > + return HRTIMER_RESTART;
> > +}
> > +
> > +static int
> > +i915_predictive_load_get(void *data, u64 *val) {
> > + struct drm_i915_private *dev_priv = data;
> > +
> > + *val = predictive_load_enable;
> > + return 0;
> > +}
> > +
> > +static int
> > +i915_predictive_load_set(void *data, u64 val) {
> > + struct drm_i915_private *dev_priv = data;
> > + struct intel_device_info *info;
> > +
> > + info = mkwrite_device_info(dev_priv);
>
> Unused, why?
>
> > +
> > + predictive_load_enable = val;
> > +
> > + if (predictive_load_enable) {
> > + if (!predictive_load_timer_init) {
> > + hrtimer_init(&dev_priv->pred_timer,
> CLOCK_MONOTONIC,
> > + HRTIMER_MODE_REL);
> > + dev_priv->pred_timer.function = predictive_load_cb;
> > + predictive_load_timer_init = 1;
>
> Move timer init to dev_priv setup.
>
> > + }
> > + hrtimer_start(&dev_priv->pred_timer,
> > +
> ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
> > + HRTIMER_MODE_REL_PINNED);
>
> Two threads can race to here.
>
> Also you can give some slack to the timer since precision is not critical to you
> and can save you some CPU power.
>
> > + } else {
> > + hrtimer_cancel(&dev_priv->pred_timer);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
> > + i915_predictive_load_get, i915_predictive_load_set,
> > + "%llu\n");
> > +
> > #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
> >
> > static const struct i915_debugfs_files { @@ -4769,7 +4860,8 @@
> > static const struct i915_debugfs_files {
> > {"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
> > {"i915_ipc_status", &i915_ipc_status_fops},
> > {"i915_drrs_ctl", &i915_drrs_ctl_fops},
> > - {"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
> > + {"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
> > + {"i915_predictive_load_ctl", &i915_predictive_load_ctl}
>
> And if the feature is to become real one day, given it's nature, the knob would
> probably have to go to sysfs, not debugfs.
>
> > };
> >
> > int i915_debugfs_register(struct drm_i915_private *dev_priv) diff
> > --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h index 137ec33..0505c47 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
> > };
> >
> > struct drm_i915_private {
> > + struct hrtimer pred_timer;
>
> Surely not the first member! :)
>
> Regards,
>
> Tvrtko
>
> > struct drm_device drm;
> >
> > struct kmem_cache *objects;
> >
Regards,
Ankit
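[Editor's sketch] Tvrtko's point above — that pred_mutex cannot be taken in the hrtimer callback because sleeping locks are forbidden there, and that an atomic (or even a plain word-sized read on x86) would do — can be illustrated with a userspace C11 analogue. The struct and function names are mine; they only mirror the three places the patch touches req_cnt (execbuffer increment, execlists decrement-if-nonzero, timer-callback read):

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-context counter; in the patch this
 * is i915_gem_context::req_cnt guarded by dev_priv->pred_mutex. */
struct ctx_counter {
	atomic_ulong req_cnt;
};

/* execbuffer path: one more request queued for this context. */
static void submit_request(struct ctx_counter *c)
{
	atomic_fetch_add_explicit(&c->req_cnt, 1, memory_order_relaxed);
}

/* execlists dequeue path: decrement, but never below zero.  A CAS loop
 * keeps the "only if > 0" guard race free without any lock. */
static void dequeue_request(struct ctx_counter *c)
{
	unsigned long old = atomic_load_explicit(&c->req_cnt,
						 memory_order_relaxed);

	while (old > 0 &&
	       !atomic_compare_exchange_weak_explicit(&c->req_cnt, &old,
						      old - 1,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;	/* 'old' is reloaded on failure; retry */
}

/* timer-callback path: lock-free snapshot of the pending count. */
static unsigned long pending(struct ctx_counter *c)
{
	return atomic_load_explicit(&c->req_cnt, memory_order_relaxed);
}
```

Relaxed ordering suffices here because the counter is a heuristic input to the governor, not a synchronization point; kernel code would use atomic_inc()/atomic_read() and a dec-if-positive helper to the same effect.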
* Re: [PATCH 1/4] drm/i915: Get active pending request for given context
2018-09-25 8:04 ` Jani Nikula
@ 2018-09-25 7:41 ` Kedar J. Karanje
0 siblings, 0 replies; 25+ messages in thread
From: Kedar J. Karanje @ 2018-09-25 7:41 UTC (permalink / raw)
To: Jani Nikula, intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
Hello Jani,
On Tuesday 25 September 2018 01:34 PM, Jani Nikula wrote:
> On Fri, 21 Sep 2018, kedar.j.karanje@intel.com wrote:
>> From: Praveen Diwakar <praveen.diwakar@intel.com>
>>
>> This patch gives us the active pending request count which is yet
>> to be submitted to the GPU
>>
>> Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
> Please remove these.
We will remove the Change-Ids in subsequent patches, thanks.
>
>> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
>> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
>> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
>> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
>> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> What are you trying to convey with the five signoffs?
The engineers who originally worked on this have moved on to other teams; since we
picked the work up where they left off, we kept the original authors' sign-offs and
added the new folks who will continue it. I am not sure what the protocol is in such
situations.
I referred to the guidelines at:
https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html
Please let me know if the guidance there has changed.
Thanks,
Kedar
>
> BR,
> Jani.
>
>
>> ---
>> drivers/gpu/drm/i915/i915_drv.c | 1 +
>> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
>> drivers/gpu/drm/i915/i915_gem_context.c | 1 +
>> drivers/gpu/drm/i915/i915_gem_context.h | 6 ++++++
>> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
>> drivers/gpu/drm/i915/intel_lrc.c | 6 ++++++
>> 6 files changed, 24 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>> index f8cfd16..d37c46e 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.c
>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>> @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
>> mutex_init(&dev_priv->av_mutex);
>> mutex_init(&dev_priv->wm.wm_mutex);
>> mutex_init(&dev_priv->pps_mutex);
>> + mutex_init(&dev_priv->pred_mutex);
>>
>> i915_memcpy_init_early(dev_priv);
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 4aca534..137ec33 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1609,6 +1609,11 @@ struct drm_i915_private {
>> * controller on different i2c buses. */
>> struct mutex gmbus_mutex;
>>
>> + /** pred_mutex protects against councurrent usage of pending
>> + * request counter for multiple contexts
>> + */
>> + struct mutex pred_mutex;
>> +
>> /**
>> * Base address of the gmbus and gpio block.
>> */
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index b10770c..30932d9 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
>> }
>>
>> trace_i915_context_create(ctx);
>> + ctx->req_cnt = 0;
>>
>> return ctx;
>> }
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
>> index b116e49..243ea22 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.h
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
>> @@ -194,6 +194,12 @@ struct i915_gem_context {
>> * context close.
>> */
>> struct list_head handles_list;
>> +
>> + /** req_cnt: tracks the pending commands, based on which we decide to
>> + * go for low/medium/high load configuration of the GPU, this is
>> + * controlled via a mutex
>> + */
>> + u64 req_cnt;
>> };
>>
>> static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 3f0c612..f799ff9 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>> struct drm_syncobj **fences)
>> {
>> struct i915_execbuffer eb;
>> + struct drm_i915_private *dev_priv = to_i915(dev);
>> struct dma_fence *in_fence = NULL;
>> struct sync_file *out_fence = NULL;
>> int out_fence_fd = -1;
>> @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>> */
>> eb.request->batch = eb.batch;
>>
>> + mutex_lock(&dev_priv->pred_mutex);
>> + eb.ctx->req_cnt++;
>> + mutex_unlock(&dev_priv->pred_mutex);
>> +
>> trace_i915_request_queue(eb.request, eb.batch_flags);
>> err = eb_submit(&eb);
>> err_request:
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index 1744792..039fbdb 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>> trace_i915_request_in(rq, port_index(port, execlists));
>> last = rq;
>> submit = true;
>> +
>> + mutex_lock(&rq->i915->pred_mutex);
>> + if (rq->gem_context->req_cnt > 0) {
>> + rq->gem_context->req_cnt--;
>> + }
>> + mutex_unlock(&rq->i915->pred_mutex);
>> }
>>
>> rb_erase_cached(&p->node, &execlists->queue);
* Re: [PATCH 1/4] drm/i915: Get active pending request for given context
2018-09-21 9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
2018-09-21 12:39 ` Tvrtko Ursulin
@ 2018-09-25 8:04 ` Jani Nikula
2018-09-25 7:41 ` Kedar J. Karanje
1 sibling, 1 reply; 25+ messages in thread
From: Jani Nikula @ 2018-09-25 8:04 UTC (permalink / raw)
To: kedar.j.karanje, intel-gfx
Cc: Praveen Diwakar, Yogesh Marathe, Ankit Navik, Aravindan Muthukumar
On Fri, 21 Sep 2018, kedar.j.karanje@intel.com wrote:
> From: Praveen Diwakar <praveen.diwakar@intel.com>
>
> This patch gives us the active pending request count which is yet
> to be submitted to the GPU
>
> Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
Please remove these.
> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
What are you trying to convey with the five signoffs?
BR,
Jani.
> ---
> drivers/gpu/drm/i915/i915_drv.c | 1 +
> drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> drivers/gpu/drm/i915/i915_gem_context.c | 1 +
> drivers/gpu/drm/i915/i915_gem_context.h | 6 ++++++
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
> drivers/gpu/drm/i915/intel_lrc.c | 6 ++++++
> 6 files changed, 24 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index f8cfd16..d37c46e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
> mutex_init(&dev_priv->av_mutex);
> mutex_init(&dev_priv->wm.wm_mutex);
> mutex_init(&dev_priv->pps_mutex);
> + mutex_init(&dev_priv->pred_mutex);
>
> i915_memcpy_init_early(dev_priv);
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 4aca534..137ec33 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1609,6 +1609,11 @@ struct drm_i915_private {
> * controller on different i2c buses. */
> struct mutex gmbus_mutex;
>
> + /** pred_mutex protects against councurrent usage of pending
> + * request counter for multiple contexts
> + */
> + struct mutex pred_mutex;
> +
> /**
> * Base address of the gmbus and gpio block.
> */
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index b10770c..30932d9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
> }
>
> trace_i915_context_create(ctx);
> + ctx->req_cnt = 0;
>
> return ctx;
> }
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
> index b116e49..243ea22 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.h
> +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> @@ -194,6 +194,12 @@ struct i915_gem_context {
> * context close.
> */
> struct list_head handles_list;
> +
> + /** req_cnt: tracks the pending commands, based on which we decide to
> + * go for low/medium/high load configuration of the GPU, this is
> + * controlled via a mutex
> + */
> + u64 req_cnt;
> };
>
> static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 3f0c612..f799ff9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> struct drm_syncobj **fences)
> {
> struct i915_execbuffer eb;
> + struct drm_i915_private *dev_priv = to_i915(dev);
> struct dma_fence *in_fence = NULL;
> struct sync_file *out_fence = NULL;
> int out_fence_fd = -1;
> @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> */
> eb.request->batch = eb.batch;
>
> + mutex_lock(&dev_priv->pred_mutex);
> + eb.ctx->req_cnt++;
> + mutex_unlock(&dev_priv->pred_mutex);
> +
> trace_i915_request_queue(eb.request, eb.batch_flags);
> err = eb_submit(&eb);
> err_request:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 1744792..039fbdb 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
> trace_i915_request_in(rq, port_index(port, execlists));
> last = rq;
> submit = true;
> +
> + mutex_lock(&rq->i915->pred_mutex);
> + if (rq->gem_context->req_cnt > 0) {
> + rq->gem_context->req_cnt--;
> + }
> + mutex_unlock(&rq->i915->pred_mutex);
> }
>
> rb_erase_cached(&p->node, &execlists->queue);
--
Jani Nikula, Intel Open Source Graphics Center
* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
2018-09-25 6:12 ` Navik, Ankit P
@ 2018-09-25 8:25 ` Tvrtko Ursulin
2018-11-06 4:46 ` Navik, Ankit P
0 siblings, 1 reply; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-09-25 8:25 UTC (permalink / raw)
To: Navik, Ankit P, intel-gfx
Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan
On 25/09/2018 07:12, Navik, Ankit P wrote:
> Hi Tvrtko,
>
> Thank you for your valuable comments. We have gone through it.
> I'll be submitting revised patch-sets after incorporating all your review comments.
You're welcome!
Btw one more thing - you can suspend your timer when GPU goes idle and
restart it when active. See for instance i915_pmu_gt_parked/unparked
where to put the hooks.
Regards,
Tvrtko
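[Editor's sketch] The park/unpark suggestion — suspend the prediction timer when the GPU idles, restart it when it becomes active again — is a small state machine. A userspace C analogue, with hook names following i915_pmu_gt_parked()/i915_pmu_gt_unparked(); the governor struct and the boolean standing in for hrtimer_start()/hrtimer_cancel() are mine:

```c
#include <assert.h>
#include <stdbool.h>

struct pred_governor {
	bool enabled;		/* the debugfs knob, predictive_load_enable */
	bool timer_running;	/* stands in for the hrtimer being armed */
};

/* GPU became busy: arm the timer, but only if the feature is enabled. */
static void gt_unparked(struct pred_governor *g)
{
	if (g->enabled)
		g->timer_running = true;	/* hrtimer_start(...) */
}

/* GPU went idle: no pending requests to sample, stop polling. */
static void gt_parked(struct pred_governor *g)
{
	g->timer_running = false;		/* hrtimer_cancel(...) */
}

/* debugfs write: toggle the feature, respecting current GPU state. */
static void set_enable(struct pred_governor *g, bool on, bool gpu_busy)
{
	g->enabled = on;
	if (!on)
		gt_parked(g);
	else if (gpu_busy)
		gt_unparked(g);
}
```

The payoff is that a disabled or idle GPU costs zero timer wakeups, which also sidesteps the HRTIMER_NORESTART-on-disable question raised earlier in the thread.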
>> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
>>> From: Praveen Diwakar <praveen.diwakar@intel.com>
>>>
>>> High resoluton timer is used for this purpose.
>>>
>>> Debugfs is provided to enable/disable/update timer configuration
>>>
>>> Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
>>> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
>>> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
>>> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
>>> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
>>> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com
>>> ---
>>> drivers/gpu/drm/i915/i915_debugfs.c | 94
>> ++++++++++++++++++++++++++++++++++++-
>>> drivers/gpu/drm/i915/i915_drv.h | 1 +
>>> 2 files changed, 94 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
>>> b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index f9ce35d..81ba509 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -4740,6 +4740,97 @@ static const struct drm_info_list
>> i915_debugfs_list[] = {
>>> {"i915_drrs_status", i915_drrs_status, 0},
>>> {"i915_rps_boost_info", i915_rps_boost_info, 0},
>>> };
>>> +
>>> +#define POLL_PERIOD_MS (1000 * 1000)
>>> +#define PENDING_REQ_0 0 /* No active request pending*/
>>> +#define PENDING_REQ_3 3 /* Threshold value of 3 active request
>> pending*/
>>> + /* Anything above this is considered as HIGH load
>>> + * context
>>> + */
>>> + /* And less is considered as LOW load*/
>>> + /* And equal is considered as mediaum load */
>>
>> Wonky comments and some typos up to here.
>>
>>> +
>>> +static int predictive_load_enable;
>>> +static int predictive_load_timer_init;
>>> +
>>> +static enum hrtimer_restart predictive_load_cb(struct hrtimer
>>> +*hrtimer) {
>>> + struct drm_i915_private *dev_priv =
>>> + container_of(hrtimer, typeof(*dev_priv),
>>> + pred_timer);
>>> + enum intel_engine_id id;
>>> + struct intel_engine_cs *engine;
>>
>> Some unused's.
>>
>>> + struct i915_gem_context *ctx;
>>> + u64 req_pending;
>>
>> unsigned long, and also please try to order declaration so the right edge of text
>> is moving in one direction only.
>>
>>> +
>>> + list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
>>> +
>>> + if (!ctx->name)
>>> + continue;
>>
>> What is this?
>>
>>> +
>>> + mutex_lock(&dev_priv->pred_mutex);
>>
>> Here the mutex bites you since you cannot sleep in the timer callback.
>> atomic_t would solve it. Or would a native unsigned int/long since lock to read a
>> native word on x86 is not needed.
>>
>>> + req_pending = ctx->req_cnt;
>>> + mutex_unlock(&dev_priv->pred_mutex);
>>> +
>>> + if (req_pending == PENDING_REQ_0)
>>> + continue;
>>> +
>>> + if (req_pending > PENDING_REQ_3)
>>> + ctx->load_type = LOAD_TYPE_HIGH;
>>> + else if (req_pending == PENDING_REQ_3)
>>> + ctx->load_type = LOAD_TYPE_MEDIUM;
>>> + else if (req_pending < PENDING_REQ_3)
>>
>> Must be smaller if not greater or equal, but maybe the compiler does that for
>> you.
>>
>>> + ctx->load_type = LOAD_TYPE_LOW;
>>> +
>>> + i915_set_optimum_config(ctx->load_type, ctx,
>> KABYLAKE_GT3);
>>
>> Only KBL? Idea to put the table in dev_priv FTW! :)
>>
>> ctx->load_type used only as a temporary uncovered here? :)
>>
>>> + }
>>> +
>>> + hrtimer_forward_now(hrtimer,
>>> +
>> ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
>>
>> Or HRTIMER_NORESTART if disabled? Hard to call it, details..
>>
>>> +
>>> + return HRTIMER_RESTART;
>>> +}
>>> +
>>> +static int
>>> +i915_predictive_load_get(void *data, u64 *val) {
>>> + struct drm_i915_private *dev_priv = data;
>>> +
>>> + *val = predictive_load_enable;
>>> + return 0;
>>> +}
>>> +
>>> +static int
>>> +i915_predictive_load_set(void *data, u64 val) {
>>> + struct drm_i915_private *dev_priv = data;
>>> + struct intel_device_info *info;
>>> +
>>> + info = mkwrite_device_info(dev_priv);
>>
>> Unused, why?
>>
>>> +
>>> + predictive_load_enable = val;
>>> +
>>> + if (predictive_load_enable) {
>>> + if (!predictive_load_timer_init) {
>>> + hrtimer_init(&dev_priv->pred_timer, CLOCK_MONOTONIC,
>>> + HRTIMER_MODE_REL);
>>> + dev_priv->pred_timer.function = predictive_load_cb;
>>> + predictive_load_timer_init = 1;
>>
>> Move timer init to dev_priv setup.
>>
>>> + }
>>> + hrtimer_start(&dev_priv->pred_timer,
>>> + ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
>>> + HRTIMER_MODE_REL_PINNED);
>>
>> Two threads can race to here.
>>
>> Also you can give some slack to the timer since precision is not critical to you
>> and can save you some CPU power.
>>
>>> + } else {
>>> + hrtimer_cancel(&dev_priv->pred_timer);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
>>> + i915_predictive_load_get, i915_predictive_load_set,
>>> + "%llu\n");
>>> +
>>> #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
>>>
>>> static const struct i915_debugfs_files {
>>> @@ -4769,7 +4860,8 @@ static const struct i915_debugfs_files {
>>> {"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
>>> {"i915_ipc_status", &i915_ipc_status_fops},
>>> {"i915_drrs_ctl", &i915_drrs_ctl_fops},
>>> - {"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
>>> + {"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
>>> + {"i915_predictive_load_ctl", &i915_predictive_load_ctl}
>>
>> And if the feature is to become real one day, given its nature, the knob would
>> probably have to go to sysfs, not debugfs.
>>
>>> };
>>>
>>> int i915_debugfs_register(struct drm_i915_private *dev_priv)
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>>> index 137ec33..0505c47 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>> @@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
>>> };
>>>
>>> struct drm_i915_private {
>>> + struct hrtimer pred_timer;
>>
>> Surely not the first member! :)
>>
>> Regards,
>>
>> Tvrtko
>>
>>> struct drm_device drm;
>>>
>>> struct kmem_cache *objects;
>>>
>
> Regards,
> Ankit
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU.
2018-09-21 9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
` (5 preceding siblings ...)
2018-09-21 12:31 ` [PATCH 0/4][RFC] " Tvrtko Ursulin
@ 2018-09-27 14:02 ` Joonas Lahtinen
6 siblings, 0 replies; 25+ messages in thread
From: Joonas Lahtinen @ 2018-09-27 14:02 UTC (permalink / raw)
To: intel-gfx, kedar.j.karanje
+ Tvrtko for adding the right media contacts
Quoting kedar.j.karanje@intel.com (2018-09-21 12:13:46)
> From: "Kedar J. Karanje" <kedar.j.karanje@intel.com>
>
> drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
>
> Current GPU configuration code for i915 does not allow us to change the
> EU/Slice/Sub-slice configuration dynamically. It is done only once, while the
> context is created.
>
> While a particular graphics application is running, if we examine the command
> requests from user space, we observe that the command density is not
> consistent. This means there is scope to change the graphics configuration
> dynamically even while the context is actively running. This patch series
> proposes a solution: find the active pending load for all active contexts at a
> given time and, based on that, dynamically perform graphics configuration for
> each context.
>
> We use a high resolution (hr) timer in the i915 kernel driver to get a
> callback every few milliseconds (this timer value can be configured through
> debugfs; the default is '0', indicating the timer is disabled, i.e. the
> original system without any intervention). In the timer callback, we examine
> the pending commands for a context in the queue; essentially, we intercept
> them before they are executed by the GPU and update the context with the
> required number of EUs.
The off-by-default and enabled-through-debugfs nature would make this
effectively dead code for upstream, as debugfs is off limits in many
production systems due to security concerns.
So the algorithm should really be generic enough to be on-by-default
without regressing existing workloads. Otherwise there would need to
be some tuning tool to control the tables outside of debugfs. cgroups
could be one place, but the effort to bring cgroups doesn't seem to be
advancing, please see mailing list archive.
There's also an ongoing series about allowing userspace to control the
said SSEU register freely for Media, so I'd hope we could consolidate on
one control mechanism.
Regards, Joonas
>
> Two questions: how did we arrive at the right timer value, and what is the
> right number of EUs? For the former, empirical data to achieve the best
> performance at the least power was considered. For the latter, we roughly
> categorized the number of EUs logically, based on platform. We then compare
> the number of pending commands against a particular threshold and set the
> number of EUs accordingly via a context update. That threshold is also based
> on experiments and findings. If the GPU is able to catch up with the CPU,
> there are typically no pending commands, and the EU config remains unchanged.
> In case there are more pending commands, we reprogram the context with a
> higher number of EUs. Please note, here we are changing EUs even while the
> context is running, by examining pending commands every 'x'
> milliseconds.
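The thresholding described in the paragraph above reduces to a small classifier. This sketch keeps the threshold of 3 pending requests from the series; the function and enum names are illustrative, not from the patches:

```c
#include <assert.h>

enum load_type { LOAD_NONE, LOAD_LOW, LOAD_MEDIUM, LOAD_HIGH };

#define PENDING_REQ_THRESHOLD 3	/* empirical value from the series */

/* Map a context's pending-request count to a load bucket, as the
 * cover letter describes.  LOAD_NONE stands in for "no pending
 * work, leave the EU configuration untouched". */
static enum load_type classify_load(unsigned long pending)
{
	if (pending == 0)
		return LOAD_NONE;
	if (pending > PENDING_REQ_THRESHOLD)
		return LOAD_HIGH;
	if (pending == PENDING_REQ_THRESHOLD)
		return LOAD_MEDIUM;
	return LOAD_LOW;
}
```

The governor would then feed the resulting bucket into the per-platform configuration table discussed later in the thread.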
>
> With this solution in place, on KBL-GT3 + Android we saw following pnp
> benefits without any performance degradation, power numbers mentioned here
> are system power.
>
> App /KPI | % Power |
> | Benefit |
> | (mW) |
> ---------------------------------|
> 3D Mark (Ice storm) | 2.30% |
> TRex On screen | 2.49% |
> TRex Off screen | 1.32% |
> Manhattan On screen | 3.11% |
> Manhattan Off screen | 0.89% |
> AnTuTu 6.1.4 | 3.42% |
>
> Note - For KBL (GEN9) we cannot control at the sub-slice level; this was
> always a constraint. We have always controlled the number of EUs rather than
> sub-slices/slices.
>
>
> Praveen Diwakar (4):
> drm/i915: Get active pending request for given context
> drm/i915: Update render power clock state configuration for given
> context
> drm/i915: set optimum eu/slice/sub-slice configuration based on load
> type
> drm/i915: Predictive governor to control eu/slice/subslice based on
> workload
>
> drivers/gpu/drm/i915/i915_debugfs.c | 94 +++++++++++++++++++++++++++++-
> drivers/gpu/drm/i915/i915_drv.c | 1 +
> drivers/gpu/drm/i915/i915_drv.h | 6 ++
> drivers/gpu/drm/i915/i915_gem_context.c | 52 +++++++++++++++++
> drivers/gpu/drm/i915/i915_gem_context.h | 52 +++++++++++++++++
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 ++
> drivers/gpu/drm/i915/intel_lrc.c | 47 +++++++++++++++
> 7 files changed, 256 insertions(+), 1 deletion(-)
>
> --
> 2.7.4
>
* Re: [PATCH 1/4] drm/i915: Get active pending request for given context
2018-09-21 12:39 ` Tvrtko Ursulin
@ 2018-11-06 4:18 ` Navik, Ankit P
0 siblings, 0 replies; 25+ messages in thread
From: Navik, Ankit P @ 2018-11-06 4:18 UTC (permalink / raw)
To: Tvrtko Ursulin, J Karanje, Kedar, intel-gfx
Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan
Hi Tvrtko,
> -----Original Message-----
> From: Tvrtko Ursulin [mailto:tvrtko.ursulin@linux.intel.com]
> Sent: Friday, September 21, 2018 6:10 PM
> To: J Karanje, Kedar <kedar.j.karanje@intel.com>; intel-
> gfx@lists.freedesktop.org
> Cc: Diwakar, Praveen <praveen.diwakar@intel.com>; Marathe, Yogesh
> <yogesh.marathe@intel.com>; Navik, Ankit P <ankit.p.navik@intel.com>;
> Muthukumar, Aravindan <aravindan.muthukumar@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 1/4] drm/i915: Get active pending request for
> given context
>
>
> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > This patch gives us the active pending request count which is yet to
> > be submitted to the GPU
> >
> > Change-Id: I10c2828ad0f1a0b7af147835737134e07a2d5b6d
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_drv.c | 1 +
> > drivers/gpu/drm/i915/i915_drv.h | 5 +++++
> > drivers/gpu/drm/i915/i915_gem_context.c | 1 +
> > drivers/gpu/drm/i915/i915_gem_context.h | 6 ++++++
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 5 +++++
> > drivers/gpu/drm/i915/intel_lrc.c | 6 ++++++
> > 6 files changed, 24 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > b/drivers/gpu/drm/i915/i915_drv.c index f8cfd16..d37c46e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -903,6 +903,7 @@ static int i915_driver_init_early(struct
> drm_i915_private *dev_priv,
> > mutex_init(&dev_priv->av_mutex);
> > mutex_init(&dev_priv->wm.wm_mutex);
> > mutex_init(&dev_priv->pps_mutex);
> > + mutex_init(&dev_priv->pred_mutex);
> >
> > i915_memcpy_init_early(dev_priv);
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > b/drivers/gpu/drm/i915/i915_drv.h index 4aca534..137ec33 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1609,6 +1609,11 @@ struct drm_i915_private {
> > * controller on different i2c buses. */
> > struct mutex gmbus_mutex;
> >
> > + /** pred_mutex protects against councurrent usage of pending
> > + * request counter for multiple contexts
> > + */
> > + struct mutex pred_mutex;
> > +
> > /**
> > * Base address of the gmbus and gpio block.
> > */
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> > b/drivers/gpu/drm/i915/i915_gem_context.c
> > index b10770c..30932d9 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -387,6 +387,7 @@ i915_gem_create_context(struct drm_i915_private
> *dev_priv,
> > }
> >
> > trace_i915_context_create(ctx);
> > + ctx->req_cnt = 0;
> >
> > return ctx;
> > }
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h
> > b/drivers/gpu/drm/i915/i915_gem_context.h
> > index b116e49..243ea22 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -194,6 +194,12 @@ struct i915_gem_context {
> > * context close.
> > */
> > struct list_head handles_list;
> > +
> > + /** req_cnt: tracks the pending commands, based on which we decide
> to
> > + * go for low/medium/high load configuration of the GPU, this is
> > + * controlled via a mutex
> > + */
> > + u64 req_cnt;
>
> 64-bit is too wide and mutex causes you problems later in the series.
> I'd suggest atomic_t.
Changes done in v2.
>
> > };
> >
> > static inline bool i915_gem_context_is_closed(const struct
> > i915_gem_context *ctx) diff --git
> > a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 3f0c612..f799ff9 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -2178,6 +2178,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > struct drm_syncobj **fences)
> > {
> > struct i915_execbuffer eb;
> > + struct drm_i915_private *dev_priv = to_i915(dev);
> > struct dma_fence *in_fence = NULL;
> > struct sync_file *out_fence = NULL;
> > int out_fence_fd = -1;
> > @@ -2390,6 +2391,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
> > */
> > eb.request->batch = eb.batch;
> >
> > + mutex_lock(&dev_priv->pred_mutex);
> > + eb.ctx->req_cnt++;
> > + mutex_unlock(&dev_priv->pred_mutex);
> > +
> > trace_i915_request_queue(eb.request, eb.batch_flags);
> > err = eb_submit(&eb);
> > err_request:
> > diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > b/drivers/gpu/drm/i915/intel_lrc.c
> > index 1744792..039fbdb 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -728,6 +728,12 @@ static void execlists_dequeue(struct intel_engine_cs
> *engine)
> > trace_i915_request_in(rq, port_index(port, execlists));
> > last = rq;
> > submit = true;
> > +
> > + mutex_lock(&rq->i915->pred_mutex);
> > + if (rq->gem_context->req_cnt > 0) {
>
> Presumably you hit underflow here and then added the protection. This was a
> hint the accounting does not work as expected. Preemption for instance would
> wreak havoc with it. :)
>
> Have a look at my per engine queued/runnable counters for how to do it
> correctly, only that you would need to make it per context and merge into one
> aggregated queued+runnable instead of two separate counters, if you wanted
> to preserve your heuristics.
>
> https://patchwork.freedesktop.org/patch/227981/
> https://patchwork.freedesktop.org/patch/227976/
We will wait for these patches to get merged and will continue to investigate
how to incorporate this into our changes.
Regards, Ankit
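For reference, the accounting pattern under discussion — increment at execbuffer, decrement at dequeue — reduces to the userspace sketch below. As the review points out, guarding the decrement with `if (> 0)` hides the real bug (unmatched increment/decrement pairs, e.g. from preemption), so this sketch asserts instead. Names are illustrative, with C11 atomics in place of kernel atomic_t:

```c
#include <stdatomic.h>
#include <assert.h>

static atomic_int queued;	/* per-context pending-request count */

static void on_queue(void)	/* execbuffer path */
{
	atomic_fetch_add(&queued, 1);
}

static void on_dequeue(void)	/* submission path */
{
	/* A decrement that can underflow means an unmatched pair
	 * somewhere; fail loudly rather than masking it with an
	 * "if (count > 0)" guard, so the broken accounting is found. */
	int old = atomic_fetch_sub(&queued, 1);
	assert(old > 0);
}
```

In the kernel the counter would live in the context object rather than a global, but the pairing discipline is the same.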
>
> Regards,
>
> Tvrtko
>
> > + rq->gem_context->req_cnt--;
> > + }
> > + mutex_unlock(&rq->i915->pred_mutex);
> > }
> >
> > rb_erase_cached(&p->node, &execlists->queue);
> >
* Re: [PATCH 2/4] drm/i915: Update render power clock state configuration for given context
2018-09-21 12:51 ` Tvrtko Ursulin
@ 2018-11-06 4:23 ` Navik, Ankit P
0 siblings, 0 replies; 25+ messages in thread
From: Navik, Ankit P @ 2018-11-06 4:23 UTC (permalink / raw)
To: Tvrtko Ursulin, J Karanje, Kedar, intel-gfx
Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan
Hi Tvrtko,
> -----Original Message-----
> From: Tvrtko Ursulin [mailto:tvrtko.ursulin@linux.intel.com]
> Sent: Friday, September 21, 2018 6:22 PM
> To: J Karanje, Kedar <kedar.j.karanje@intel.com>; intel-
> gfx@lists.freedesktop.org
> Cc: Diwakar, Praveen <praveen.diwakar@intel.com>; Marathe, Yogesh
> <yogesh.marathe@intel.com>; Navik, Ankit P <ankit.p.navik@intel.com>;
> Muthukumar, Aravindan <aravindan.muthukumar@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 2/4] drm/i915: Update render power clock state
> configuration for given context
>
>
> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > This patch will update power clock state register at runtime base on
> > the flag update_render_config which can set by any governor which
> > computes load and want to update rpcs register.
> > subsequent patches will have a timer based governor which computes
> > pending load/request.
> >
> > Change-Id: I4e7d2f484b957d5bd496e1decc59a69e3bc6d186
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_gem_context.c | 5 ++++
> > drivers/gpu/drm/i915/i915_gem_context.h | 14 +++++++++++
> > drivers/gpu/drm/i915/intel_lrc.c | 41
> +++++++++++++++++++++++++++++++++
> > 3 files changed, 60 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> > b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 30932d9..2838c1d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -388,6 +388,11 @@ i915_gem_create_context(struct drm_i915_private
> > *dev_priv,
> >
> > trace_i915_context_create(ctx);
> > ctx->req_cnt = 0;
> > + ctx->update_render_config = 0;
> > + ctx->slice_cnt = hweight8(INTEL_INFO(dev_priv)->sseu.slice_mask);
> > + ctx->subslice_cnt = hweight8(
> > + INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
> > + ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
> >
> > return ctx;
> > }
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h
> > b/drivers/gpu/drm/i915/i915_gem_context.h
> > index 243ea22..52e341c 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -200,6 +200,20 @@ struct i915_gem_context {
> > * controlled via a mutex
> > */
> > u64 req_cnt;
> > +
> > + /** slice_cnt: used to set the # of slices to be enabled. */
> > + u8 slice_cnt;
> > +
> > + /** subslice_cnt: used to set the # of subslices to be enabled. */
> > + u8 subslice_cnt;
> > +
> > + /** eu_cnt: used to set the # of eu to be enabled. */
> > + u8 eu_cnt;
> > +
> > + /** update_render_config: to track the updates to the render
> > + * configuration (S/SS/EU Configuration on the GPU)
> > + */
> > + bool update_render_config;
> > };
> >
> > static inline bool i915_gem_context_is_closed(const struct
> > i915_gem_context *ctx) diff --git a/drivers/gpu/drm/i915/intel_lrc.c
> > b/drivers/gpu/drm/i915/intel_lrc.c
> > index 039fbdb..d2d0e7d 100644
> > --- a/drivers/gpu/drm/i915/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/intel_lrc.c
> > @@ -364,6 +364,36 @@ execlists_unwind_incomplete_requests(struct
> intel_engine_execlists *execlists)
> > spin_unlock_irqrestore(&engine->timeline.lock, flags);
> > }
> >
> > +static u32
> > +get_context_rpcs_config(struct i915_gem_context *ctx) {
> > + struct drm_i915_private *dev_priv = ctx->i915;
> > + u32 rpcs = 0;
> > +
> > + if (INTEL_GEN(dev_priv) < 8)
> > + return 0;
> > +
> > + if (INTEL_INFO(dev_priv)->sseu.has_slice_pg) {
> > + rpcs |= GEN8_RPCS_S_CNT_ENABLE;
> > + rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
> > + rpcs |= GEN8_RPCS_ENABLE;
> > + }
> > +
> > + if (INTEL_INFO(dev_priv)->sseu.has_subslice_pg) {
> > + rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
> > + rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
> > + rpcs |= GEN8_RPCS_ENABLE;
> > + }
> > +
> > + if (INTEL_INFO(dev_priv)->sseu.has_eu_pg) {
> > + rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
> > + rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
> > + rpcs |= GEN8_RPCS_ENABLE;
> > + }
> > +
> > + return rpcs;
> > +}
>
> This function is very similar to make_rpcs so I'd suggest to extract some
> commonality and share the code.
Incorporated Changes in v2.
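The pattern that get_context_rpcs_config() and make_rpcs() share is simply OR-ing count fields into one register word. A hedged userspace sketch follows; note the shift and enable values below are made-up placeholders, NOT the real GEN8_RPCS_* register layout:

```c
#include <stdint.h>
#include <assert.h>

/* Illustrative only: placeholder field positions, not the actual
 * GEN8_RPCS_* bit layout.  The point is the packing pattern the
 * quoted function follows: per-resource counts OR-ed into one
 * 32-bit power-clock-state value, plus a global enable bit. */
#define RPCS_ENABLE		(1u << 31)
#define RPCS_S_SHIFT		24	/* placeholder */
#define RPCS_SS_SHIFT		16	/* placeholder */
#define RPCS_EU_MIN_SHIFT	8	/* placeholder */
#define RPCS_EU_MAX_SHIFT	0	/* placeholder */

static uint32_t pack_rpcs(uint8_t slices, uint8_t subslices, uint8_t eus)
{
	uint32_t rpcs = 0;

	rpcs |= (uint32_t)slices << RPCS_S_SHIFT;
	rpcs |= (uint32_t)subslices << RPCS_SS_SHIFT;
	rpcs |= (uint32_t)eus << RPCS_EU_MIN_SHIFT;
	rpcs |= (uint32_t)eus << RPCS_EU_MAX_SHIFT;	/* min == max */
	rpcs |= RPCS_ENABLE;
	return rpcs;
}
```

Extracting the shared helper the review asks for would mean both call sites build this value through one function parameterized on the sseu counts.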
>
> In any case you are likely to get overtaken by
> https://patchwork.freedesktop.org/series/48194/ after which you will be able
> to use struct intel_sseu to package the data you need in one ctx member, etc.
>
> > +
> > static inline void
> > execlists_context_status_change(struct i915_request *rq, unsigned long
> status)
> > {
> > @@ -418,11 +448,22 @@ execlists_update_context_pdps(struct
> i915_hw_ppgtt *ppgtt, u32 *reg_state)
> > static u64 execlists_update_context(struct i915_request *rq)
> > {
> > struct intel_context *ce = rq->hw_context;
> > + struct i915_gem_context *ctx = rq->gem_context;
> > + struct intel_engine_cs *engine = rq->engine;
> > struct i915_hw_ppgtt *ppgtt =
> > rq->gem_context->ppgtt ?: rq->i915->mm.aliasing_ppgtt;
> > u32 *reg_state = ce->lrc_reg_state;
> > + u32 rpcs_config = 0;
> >
> > reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring,
> > rq->tail);
> > + if (ctx->pid && ctx->name && (rq->engine->id == RCS) &&
> > + ctx->update_render_config) {
> > + rpcs_config = get_context_rpcs_config(ctx);
> > + reg_state[CTX_LRI_HEADER_2] = MI_LOAD_REGISTER_IMM(1);
> > + CTX_REG(reg_state, CTX_R_PWR_CLK_STATE, GEN8_R_PWR_CLK_STATE,
> > + rpcs_config);
>
> For a first submission this works, but when appending to a running context the
> RPCS value you write here will probably get overwritten by context save. At least
> I would be surprised if lite-restore would write to the register.
>
> So if I am correct the problem you have is that your configuration changes will
> not always become live when you expect it to. And then if you hit lite-restore
> you don't try to re-configure it later unless there is a config change.
>
> But I don't think there is a cheap/reasonable way to fix this fully. So maybe you
> just give up on this chunk and rely on context image to be updated on context
> pin like in https://patchwork.freedesktop.org/patch/250007/.
We will wait for this patch to get merged, because we see differences between
the two patches in register access. We will analyse MI_LOAD_REGISTER_IMM
further.
Regards, Ankit
>
> Regards,
>
> Tvrtko
>
> > + ctx->update_render_config = 0;
> > + }
> >
> > /* True 32b PPGTT with dynamic page allocation: update PDP
> > * registers and point the unallocated PDPs to scratch page.
> >
* Re: [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
2018-09-21 13:05 ` Tvrtko Ursulin
@ 2018-11-06 4:25 ` Navik, Ankit P
0 siblings, 0 replies; 25+ messages in thread
From: Navik, Ankit P @ 2018-11-06 4:25 UTC (permalink / raw)
To: Tvrtko Ursulin, J Karanje, Kedar, intel-gfx
Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan
Hi Tvrtko,
Your review changes are incorporated in v2.
Regards, Ankit
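The heart of the patch reviewed below is a static lookup: tier version and load type index into a table of {slice, subslice, eu} tuples. A userspace sketch, with the tuple values copied from the quoted opt_config table and every name otherwise illustrative:

```c
#include <assert.h>

enum load_type { LOAD_LOW, LOAD_MEDIUM, LOAD_HIGH, LOAD_MAX };
enum tier { CHV, KBL_GT2, KBL_GT3, TIER_MAX };

struct sseu_cfg { unsigned char slice, subslice, eu; };

/* Tuple values copied from the opt_config table in the quoted
 * patch; u8-sized fields as the review suggests. */
static const struct sseu_cfg opt_config[TIER_MAX][LOAD_MAX] = {
	[CHV]     = { {1, 1, 4}, {1, 1, 6}, {1, 2, 6} },
	[KBL_GT2] = { {2, 3, 4}, {2, 3, 6}, {2, 3, 8} },
	[KBL_GT3] = { {2, 3, 4}, {2, 3, 6}, {2, 3, 8} },
};

static struct sseu_cfg pick_config(enum tier t, enum load_type l)
{
	return opt_config[t][l];
}
```

Storing a pointer to the device's row at init time, as the review suggests, would remove the tier argument from every call site.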
> -----Original Message-----
> From: Tvrtko Ursulin [mailto:tvrtko.ursulin@linux.intel.com]
> Sent: Friday, September 21, 2018 6:36 PM
> To: J Karanje, Kedar <kedar.j.karanje@intel.com>; intel-
> gfx@lists.freedesktop.org
> Cc: Diwakar, Praveen <praveen.diwakar@intel.com>; Marathe, Yogesh
> <yogesh.marathe@intel.com>; Navik, Ankit P <ankit.p.navik@intel.com>;
> Muthukumar, Aravindan <aravindan.muthukumar@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice
> configuration based on load type
>
>
> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> > From: Praveen Diwakar <praveen.diwakar@intel.com>
> >
> > This patch will select optimum eu/slice/sub-slice configuration based
> > on type of load (low, medium, high) as input.
> > Based on our readings and experiments we have predefined set of
> > optimum configuration for each platform(CHT, KBL).
> > i915_set_optimum_config will select optimum configuration from
> > pre-defined optimum configuration table(opt_config).
> >
> > Change-Id: I3a6a2a6bdddd01b3d3c97995f5403aef3c6fa989
> > Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> > Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> > Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> > Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> > Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> > ---
> > drivers/gpu/drm/i915/i915_gem_context.c | 46
> +++++++++++++++++++++++++++++++++
> > drivers/gpu/drm/i915/i915_gem_context.h | 32 +++++++++++++++++++++++
> > 2 files changed, 78 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c
> > b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 2838c1d..1b76410 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -94,6 +94,32 @@
> >
> > #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
> >
> > +static struct optimum_config
> opt_config[TIER_VERSION_MAX][LOAD_TYPE_MAX] = {
> > + {
> > + /* Cherry trail low */
> > + { 1, 1, 4},
> > + /* Cherry trail medium */
> > + { 1, 1, 6},
> > + /* Cherry trail high */
> > + { 1, 2, 6}
> > + },
> > + {
> > + /* kbl gt2 low */
> > + { 2, 3, 4},
> > + /* kbl gt2 medium */
> > + { 2, 3, 6},
> > + /* kbl gt2 high */
> > + { 2, 3, 8}
> > + },
> > + {
> > + /* kbl gt3 low */
> > + { 2, 3, 4},
> > + /* kbl gt3 medium */
> > + { 2, 3, 6},
> > + /* kbl gt3 high */
> > + { 2, 3, 8}
> > + }
> > +};
> > static void lut_close(struct i915_gem_context *ctx)
> > {
> > struct i915_lut_handle *lut, *ln;
> > @@ -393,10 +419,30 @@ i915_gem_create_context(struct drm_i915_private
> *dev_priv,
> > ctx->subslice_cnt = hweight8(
> > INTEL_INFO(dev_priv)->sseu.subslice_mask[0]);
> > ctx->eu_cnt = INTEL_INFO(dev_priv)->sseu.eu_per_subslice;
> > + ctx->load_type = 0;
> > + ctx->prev_load_type = 0;
> >
> > return ctx;
> > }
> >
> > +
> > +void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
>
> Type for type should be enum.
>
> Name the function as i915_gem_context_set_load_type(ctx, type).
>
> > + enum gem_tier_versions version)
> > +{
> > + struct intel_context *ce = &ctx->__engine[RCS];
> > + u32 *reg_state = ce->lrc_reg_state;
> > + u32 rpcs_config = 0;
>
> Unused locals?
>
> > + /* Call opt_config to get correct configuration for eu,slice,subslice */
> > + ctx->slice_cnt = (u8)opt_config[version][type].slice;
> > + ctx->subslice_cnt = (u8)opt_config[version][type].subslice;
> > + ctx->eu_cnt = (u8)opt_config[version][type].eu;
>
> You could store the correct table for the device in dev_priv and use the static
> table to assign/populate on device init time.
>
> > +
> > + /* Enabling this to update the rpcs */
> > + if (ctx->prev_load_type != type)
> > + ctx->update_render_config = 1;
>
> ctx->update_render_config was unused in last patch. So patch ordering is
> a bit suboptimal but I don't know.
>
> > +
> > + ctx->prev_load_type = type;
> > +}
> > /**
> > * i915_gem_context_create_gvt - create a GVT GEM context
> > * @dev: drm device *
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.h
> b/drivers/gpu/drm/i915/i915_gem_context.h
> > index 52e341c..50183e6 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.h
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.h
> > @@ -53,6 +53,26 @@ struct intel_context_ops {
> > void (*destroy)(struct intel_context *ce);
> > };
> >
> > +enum gem_load_type {
> > + LOAD_TYPE_LOW,
> > + LOAD_TYPE_MEDIUM,
> > + LOAD_TYPE_HIGH,
> > + LOAD_TYPE_MAX
> > +};
> > +
> > +enum gem_tier_versions {
> > + CHERRYVIEW = 0,
> > + KABYLAKE_GT2,
> > + KABYLAKE_GT3,
> > + TIER_VERSION_MAX
> > +};
> > +
> > +struct optimum_config {
> > + int slice;
> > + int subslice;
> > + int eu;
>
> u8 would do global table definitely.
>
> > +};
> > +
> > /**
> > * struct i915_gem_context - client state
> > *
> > @@ -210,6 +230,16 @@ struct i915_gem_context {
> > /** eu_cnt: used to set the # of eu to be enabled. */
> > u8 eu_cnt;
> >
> > + /** load_type: The designated load_type (high/medium/low) for a given
> > + * number of pending commands in the command queue.
> > + */
> > + u8 load_type;
>
> Unused?
>
> > +
> > + /** prev_load_type: The earlier load type that the GPU was configured
> > + * for (high/medium/low).
> > + */
> > + u8 prev_load_type;
>
> Would pending_load_type sound more obvious?
>
> Types should be enums for both.
>
> Regards,
>
> Tvrtko
>
> > +
> > /** update_render_config: to track the updates to the render
> > * configuration (S/SS/EU Configuration on the GPU)
> > */
> > @@ -342,6 +372,8 @@ int i915_gem_context_setparam_ioctl(struct
> drm_device *dev, void *data,
> > struct drm_file *file_priv);
> > int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
> > struct drm_file *file);
> > +void i915_set_optimum_config(int type, struct i915_gem_context *ctx,
> > + enum gem_tier_versions version);
> >
> > struct i915_gem_context *
> > i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio);
> >
* Re: [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload
2018-09-25 8:25 ` Tvrtko Ursulin
@ 2018-11-06 4:46 ` Navik, Ankit P
0 siblings, 0 replies; 25+ messages in thread
From: Navik, Ankit P @ 2018-11-06 4:46 UTC (permalink / raw)
To: Tvrtko Ursulin, intel-gfx
Cc: Diwakar, Praveen, Marathe, Yogesh, Muthukumar, Aravindan
Hi Tvrtko,
> -----Original Message-----
> From: Tvrtko Ursulin [mailto:tvrtko.ursulin@linux.intel.com]
> Sent: Tuesday, September 25, 2018 1:56 PM
> To: Navik, Ankit P <ankit.p.navik@intel.com>; intel-gfx@lists.freedesktop.org
> Cc: J Karanje, Kedar <kedar.j.karanje@intel.com>; Diwakar, Praveen
> <praveen.diwakar@intel.com>; Marathe, Yogesh
> <yogesh.marathe@intel.com>; Muthukumar, Aravindan
> <aravindan.muthukumar@intel.com>
> Subject: Re: [Intel-gfx] [PATCH 4/4] drm/i915: Predictive governor to control
> eu/slice/subslice based on workload
>
>
> On 25/09/2018 07:12, Navik, Ankit P wrote:
> > Hi Tvrtko,
> >
> > Thank you for your valuable comments. We have gone through it.
> > I'll be submitting revised patch-sets after incorporating all your review
> comments.
>
> You're welcome!
>
> Btw one more thing - you can suspend your timer when GPU goes idle and
> restart it when active. See for instance i915_pmu_gt_parked/unparked where to
> put the hooks.
We are working on this and will send it as a new patch on top of the series.
>
> Regards,
>
> Tvrtko
>
> >> On 21/09/2018 10:13, kedar.j.karanje@intel.com wrote:
> >>> From: Praveen Diwakar <praveen.diwakar@intel.com>
> >>>
> >>> High resoluton timer is used for this purpose.
> >>>
> >>> Debugfs is provided to enable/disable/update timer configuration
> >>>
> >>> Change-Id: I35d692c5afe962fcad4573185bc6f744487711d0
> >>> Signed-off-by: Praveen Diwakar <praveen.diwakar@intel.com>
> >>> Signed-off-by: Yogesh Marathe <yogesh.marathe@intel.com>
> >>> Signed-off-by: Aravindan Muthukumar <aravindan.muthukumar@intel.com>
> >>> Signed-off-by: Kedar J Karanje <kedar.j.karanje@intel.com>
> >>> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com
> >>> ---
> >>> drivers/gpu/drm/i915/i915_debugfs.c | 94
> >> ++++++++++++++++++++++++++++++++++++-
> >>> drivers/gpu/drm/i915/i915_drv.h | 1 +
> >>> 2 files changed, 94 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> index f9ce35d..81ba509 100644
> >>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> >>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >>> @@ -4740,6 +4740,97 @@ static const struct drm_info_list
> >> i915_debugfs_list[] = {
> >>> {"i915_drrs_status", i915_drrs_status, 0},
> >>> {"i915_rps_boost_info", i915_rps_boost_info, 0},
> >>> };
> >>> +
> >>> +#define POLL_PERIOD_MS (1000 * 1000)
> >>> +#define PENDING_REQ_0 0 /* No active request pending*/
> >>> +#define PENDING_REQ_3 3 /* Threshold value of 3 active request
> >> pending*/
> >>> + /* Anything above this is considered as HIGH load
> >>> + * context
> >>> + */
> >>> + /* And less is considered as LOW load*/
> >>> + /* And equal is considered as mediaum load */
> >>
> >> Wonky comments and some typos up to here.
Changes done in v2.
> >>
> >>> +
> >>> +static int predictive_load_enable;
> >>> +static int predictive_load_timer_init;
> >>> +
> >>> +static enum hrtimer_restart predictive_load_cb(struct hrtimer
> >>> +*hrtimer) {
> >>> + struct drm_i915_private *dev_priv =
> >>> + container_of(hrtimer, typeof(*dev_priv),
> >>> + pred_timer);
> >>> + enum intel_engine_id id;
> >>> + struct intel_engine_cs *engine;
> >>
> >> Some unused's.
Changes done in v2.
> >>
> >>> + struct i915_gem_context *ctx;
> >>> + u64 req_pending;
> >>
> >> unsigned long, and also please try to order declaration so the right
> >> edge of text is moving in one direction only.
Changes done in v2.
> >>
> >>> +
> >>> + list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
> >>> +
> >>> + if (!ctx->name)
> >>> + continue;
> >>
> >> What is this?
Just a check to skip invalid contexts (those without a name).
> >>
> >>> +
> >>> + mutex_lock(&dev_priv->pred_mutex);
> >>
> >> Here the mutex bites you since you cannot sleep in the timer callback.
> >> atomic_t would solve it. Or would a native unsigned int/long since
> >> lock to read a native word on x86 is not needed.
Changes done in v2.
> >>
> >>> + req_pending = ctx->req_cnt;
> >>> + mutex_unlock(&dev_priv->pred_mutex);
> >>> +
> >>> + if (req_pending == PENDING_REQ_0)
> >>> + continue;
> >>> +
> >>> + if (req_pending > PENDING_REQ_3)
> >>> + ctx->load_type = LOAD_TYPE_HIGH;
> >>> + else if (req_pending == PENDING_REQ_3)
> >>> + ctx->load_type = LOAD_TYPE_MEDIUM;
> >>> + else if (req_pending < PENDING_REQ_3)
> >>
> >> Must be smaller if not greater or equal, but maybe the compiler does
> >> that for you.
Changes done in v2.
> >>
> >>> + ctx->load_type = LOAD_TYPE_LOW;
> >>> +
> >>> + i915_set_optimum_config(ctx->load_type, ctx,
> >> KABYLAKE_GT3);
> >>
> >> Only KBL? Idea to put the table in dev_priv FTW! :)
Changes done in v2.
> >>
> >> ctx->load_type used only as a temporary uncovered here? :)
> >>
> >>> + }
> >>> +
> >>> + hrtimer_forward_now(hrtimer,
> >>> +
> >> ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS));
> >>
> >> Or HRTIMER_NORESTART if disabled? Hard to call it, details..
We read the timer value first and do not start the timer at all if the value is 0.
> >>
> >>> +
> >>> + return HRTIMER_RESTART;
> >>> +}
> >>> +
> >>> +static int
> >>> +i915_predictive_load_get(void *data, u64 *val) {
> >>> + struct drm_i915_private *dev_priv = data;
> >>> +
> >>> + *val = predictive_load_enable;
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static int
> >>> +i915_predictive_load_set(void *data, u64 val) {
> >>> + struct drm_i915_private *dev_priv = data;
> >>> + struct intel_device_info *info;
> >>> +
> >>> + info = mkwrite_device_info(dev_priv);
> >>
> >> Unused, why?
Changes done in v2.
> >>
> >>> +
> >>> + predictive_load_enable = val;
> >>> +
> >>> + if (predictive_load_enable) {
> >>> + if (!predictive_load_timer_init) {
> >>> + hrtimer_init(&dev_priv->pred_timer,
> >> CLOCK_MONOTONIC,
> >>> + HRTIMER_MODE_REL);
> >>> + dev_priv->pred_timer.function = predictive_load_cb;
> >>> + predictive_load_timer_init = 1;
> >>
> >> Move timer init to dev_priv setup.
Changes done in v2.
> >>
> >>> + }
> >>> + hrtimer_start(&dev_priv->pred_timer,
> >>> +
> >> ns_to_ktime(predictive_load_enable*POLL_PERIOD_MS),
> >>> + HRTIMER_MODE_REL_PINNED);
> >>
> >> Two threads can race to here.
This is under a mutex; please let me know if there is any corner case which could
cause a race condition.
> >>
> >> Also you can give some slack to the timer since precision is not
> >> critical to you and can save you some CPU power.
Changes done in v2.
> >>
> >>> + } else {
> >>> + hrtimer_cancel(&dev_priv->pred_timer);
> >>> + }
> >>> +
> >>> + return 0;
> >>> +}
> >>> +
> >>> +DEFINE_SIMPLE_ATTRIBUTE(i915_predictive_load_ctl,
> >>> + i915_predictive_load_get, i915_predictive_load_set,
> >>> + "%llu\n");
> >>> +
> >>> #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
> >>>
> >>> static const struct i915_debugfs_files { @@ -4769,7 +4860,8 @@
> >>> static const struct i915_debugfs_files {
> >>> {"i915_hpd_storm_ctl", &i915_hpd_storm_ctl_fops},
> >>> {"i915_ipc_status", &i915_ipc_status_fops},
> >>> {"i915_drrs_ctl", &i915_drrs_ctl_fops},
> >>> - {"i915_edp_psr_debug", &i915_edp_psr_debug_fops}
> >>> + {"i915_edp_psr_debug", &i915_edp_psr_debug_fops},
> >>> + {"i915_predictive_load_ctl", &i915_predictive_load_ctl}
> >>
> >> And if the feature is to become real one day, given it's nature, the
> >> knob would probably have to go to sysfs, not debugfs.
Added a FIXME comment to migrate to sysfs in a subsequent version, along with the other fixes.
> >>
> >>> };
> >>>
> >>> int i915_debugfs_register(struct drm_i915_private *dev_priv) diff
> >>> --git a/drivers/gpu/drm/i915/i915_drv.h
> >>> b/drivers/gpu/drm/i915/i915_drv.h index 137ec33..0505c47 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.h
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.h
> >>> @@ -1552,6 +1552,7 @@ struct intel_cdclk_state {
> >>> };
> >>>
> >>> struct drm_i915_private {
> >>> + struct hrtimer pred_timer;
> >>
> >> Surely not the first member! :)
Changes done in v2.
Regards, Ankit
> >>
> >> Regards,
> >>
> >> Tvrtko
> >>
> >>> struct drm_device drm;
> >>>
> >>> struct kmem_cache *objects;
> >>>
> >
> > Regards,
> > Ankit
> >
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU.
2018-12-11 11:12 ` ✗ Fi.CI.BAT: failure for " Patchwork
@ 2018-12-11 13:02 ` Tvrtko Ursulin
0 siblings, 0 replies; 25+ messages in thread
From: Tvrtko Ursulin @ 2018-12-11 13:02 UTC (permalink / raw)
To: intel-gfx, Patchwork, Navik, Ankit P
On 11/12/2018 11:12, Patchwork wrote:
> == Series Details ==
>
> Series: Dynamic EU configuration of Slice/Subslice/EU.
> URL : https://patchwork.freedesktop.org/series/53876/
> State : failure
>
> == Summary ==
>
> Applying: drm/i915: Get active pending request for given context
> Using index info to reconstruct a base tree...
> M drivers/gpu/drm/i915/i915_gem_context.c
> M drivers/gpu/drm/i915/i915_gem_context.h
> M drivers/gpu/drm/i915/i915_request.c
> M drivers/gpu/drm/i915/intel_lrc.c
> Falling back to patching base and 3-way merge...
> Auto-merging drivers/gpu/drm/i915/intel_lrc.c
> Auto-merging drivers/gpu/drm/i915/i915_request.c
> Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
> Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
> Applying: drm/i915: Update render power clock state configuration for given context
> Using index info to reconstruct a base tree...
> M drivers/gpu/drm/i915/i915_gem_context.c
> M drivers/gpu/drm/i915/i915_gem_context.h
> M drivers/gpu/drm/i915/intel_lrc.c
> Falling back to patching base and 3-way merge...
> Auto-merging drivers/gpu/drm/i915/intel_lrc.c
> CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/intel_lrc.c
> Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
> Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
> error: Failed to merge in the changes.
> Patch failed at 0002 drm/i915: Update render power clock state configuration for given context
> Use 'git am --show-current-patch' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
Always rebase before sending if you are not sure since drm-tip can be
pretty dynamic.
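For completeness, the rebase workflow can be demonstrated in a throwaway repository (with a real series one would fetch drm-tip and rebase onto it; all names below are illustrative):

```shell
# Demonstrates rebasing a local patch series onto a base branch that moved.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config user.email dev@example.com
git config user.name dev
base_branch=$(git symbolic-ref --short HEAD)	# master or main
echo base > f.txt && git add f.txt && git commit -q -m "base"
git checkout -q -b my-series			# the patch series branch
echo mine > series.txt && git add series.txt && git commit -q -m "patch 1"
git checkout -q "$base_branch"			# upstream advances meanwhile
echo more >> f.txt && git commit -q -am "upstream moved"
git checkout -q my-series
git rebase -q "$base_branch"			# replay the series on the new base
git log --format=%s				# patch 1 / upstream moved / base
```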
Regards,
Tvrtko
* ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU.
2018-12-11 10:14 [PATCH v3 0/4] " Ankit Navik
@ 2018-12-11 11:12 ` Patchwork
2018-12-11 13:02 ` Tvrtko Ursulin
0 siblings, 1 reply; 25+ messages in thread
From: Patchwork @ 2018-12-11 11:12 UTC (permalink / raw)
To: Navik, Ankit P; +Cc: intel-gfx
== Series Details ==
Series: Dynamic EU configuration of Slice/Subslice/EU.
URL : https://patchwork.freedesktop.org/series/53876/
State : failure
== Summary ==
Applying: drm/i915: Get active pending request for given context
Using index info to reconstruct a base tree...
M drivers/gpu/drm/i915/i915_gem_context.c
M drivers/gpu/drm/i915/i915_gem_context.h
M drivers/gpu/drm/i915/i915_request.c
M drivers/gpu/drm/i915/intel_lrc.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_lrc.c
Auto-merging drivers/gpu/drm/i915/i915_request.c
Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
Applying: drm/i915: Update render power clock state configuration for given context
Using index info to reconstruct a base tree...
M drivers/gpu/drm/i915/i915_gem_context.c
M drivers/gpu/drm/i915/i915_gem_context.h
M drivers/gpu/drm/i915/intel_lrc.c
Falling back to patching base and 3-way merge...
Auto-merging drivers/gpu/drm/i915/intel_lrc.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/intel_lrc.c
Auto-merging drivers/gpu/drm/i915/i915_gem_context.h
Auto-merging drivers/gpu/drm/i915/i915_gem_context.c
error: Failed to merge in the changes.
Patch failed at 0002 drm/i915: Update render power clock state configuration for given context
Use 'git am --show-current-patch' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
end of thread, other threads:[~2018-12-11 13:02 UTC | newest]
Thread overview: 25+ messages
2018-09-21 9:13 [PATCH 0/4][RFC] Dynamic EU configuration of Slice/Subslice/EU kedar.j.karanje
2018-09-21 9:13 ` [PATCH 1/4] drm/i915: Get active pending request for given context kedar.j.karanje
2018-09-21 12:39 ` Tvrtko Ursulin
2018-11-06 4:18 ` Navik, Ankit P
2018-09-25 8:04 ` Jani Nikula
2018-09-25 7:41 ` Kedar J. Karanje
2018-09-21 9:13 ` [PATCH 2/4] drm/i915: Update render power clock state configuration " kedar.j.karanje
2018-09-21 12:51 ` Tvrtko Ursulin
2018-11-06 4:23 ` Navik, Ankit P
2018-09-22 19:13 ` kbuild test robot
2018-09-21 9:13 ` [PATCH 3/4] drm/i915: set optimum eu/slice/sub-slice configuration based on load type kedar.j.karanje
2018-09-21 13:05 ` Tvrtko Ursulin
2018-11-06 4:25 ` Navik, Ankit P
2018-09-22 18:09 ` kbuild test robot
2018-09-21 9:13 ` [PATCH 4/4] drm/i915: Predictive governor to control eu/slice/subslice based on workload kedar.j.karanje
2018-09-21 13:24 ` Tvrtko Ursulin
2018-09-25 6:12 ` Navik, Ankit P
2018-09-25 8:25 ` Tvrtko Ursulin
2018-11-06 4:46 ` Navik, Ankit P
2018-09-22 18:31 ` kbuild test robot
2018-09-21 11:16 ` ✗ Fi.CI.BAT: failure for Dynamic EU configuration of Slice/Subslice/EU Patchwork
2018-09-21 12:31 ` [PATCH 0/4][RFC] " Tvrtko Ursulin
2018-09-27 14:02 ` Joonas Lahtinen
2018-12-11 10:14 [PATCH v3 0/4] " Ankit Navik
2018-12-11 11:12 ` ✗ Fi.CI.BAT: failure for " Patchwork
2018-12-11 13:02 ` Tvrtko Ursulin