* [Intel-gfx] [PATCH v7 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel
@ 2020-03-16 13:29 Ankit Navik
  2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context Ankit Navik
  ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Ankit Navik @ 2020-03-16 13:29 UTC (permalink / raw)
To: intel-gfx; +Cc: ankit.p.navik

This patch set improves GPU power consumption on Linux kernel based OSes
such as Chromium OS, Ubuntu, etc. The power savings are as follows.

Power savings on the GLK-GT1 Bobba platform running Chrome OS:

-----------------------------------------------|
App / KPI               | % Power Benefit (mW) |
------------------------|----------------------|
Hangout Call- 20 minute |        1.8%          |
Youtube 4K VPB          |       14.13%         |
WebGL Aquarium          |       13.76%         |
Unity3D                 |        6.78%         |
                        |                      |
------------------------|----------------------|
Chrome PLT              | Battery life improves|
                        | by ~45 minutes       |
-----------------------------------------------|

Power savings on KBL-GT3 running Android and Ubuntu (Linux):

-----------------------------------------------|
App / KPI               | % Power Benefit (mW) |
                        |----------------------|
                        | Android  |  Ubuntu   |
------------------------|----------|-----------|
3D Mark (Ice storm)     |  2.30%   |   N.A.    |
TRex On screen          |  2.49%   |   2.97%   |
Manhattan On screen     |  3.11%   |   4.90%   |
Carchase On Screen      |  N.A.    |   5.06%   |
AnTuTu 6.1.4            |  3.42%   |   N.A.    |
SynMark2                |  N.A.    |   1.7%    |
-----------------------------------------------|

We have also observed that GPU core residency improves by 1.035%.

Technical insights of the patch:

The current GPU configuration code for i915 does not allow us to change
the EU/Slice/Sub-slice configuration dynamically. It is done only once,
while the context is created. While a particular graphics application is
running, if we examine the command requests from user space, we observe
that command density is not consistent.
This means there is scope to change the graphics configuration dynamically
even while the context is actively running. This patch series proposes a
solution that determines the pending load across all active contexts at a
given time and, based on that, dynamically performs the graphics
configuration for each context. The feature can be enabled using sysfs.

We examine the pending commands for a context in the queue; essentially,
we intercept them before they are executed by the GPU, and we update the
context with the required number of EUs. For the former (examining pending
commands), empirical data to achieve the best performance at the least
power was considered. For the latter (the number of EUs), we roughly
categorized the EU counts logically per platform. We then compare the
number of pending commands against a particular threshold and set the
number of EUs accordingly with a context update. That threshold is also
based on experiments and findings. If the GPU is able to keep up with the
CPU, there are typically no pending commands and the EU configuration
remains unchanged. If there are more pending commands, we reprogram the
context with a higher number of EUs.
Ankit Navik (3):
  drm/i915: Get active pending request for given context
  drm/i915: set optimum eu/slice/sub-slice configuration based on load
    type
  drm/i915: Predictive governor to control slice/subslice/eu

 drivers/gpu/drm/i915/gem/i915_gem_context.c       |  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 37 +++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +
 drivers/gpu/drm/i915/gt/intel_context_sseu.c      |  2 +
 drivers/gpu/drm/i915/gt/intel_context_types.h     |  2 +
 drivers/gpu/drm/i915/gt/intel_lrc.c               | 79 ++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drv.h                   |  5 ++
 drivers/gpu/drm/i915/i915_sysfs.c                 | 32 +++++++++
 drivers/gpu/drm/i915/intel_device_info.c          | 55 +++++++++++++++-
 9 files changed, 214 insertions(+), 4 deletions(-)

-- 
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context
  2020-03-16 13:29 [Intel-gfx] [PATCH v7 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
@ 2020-03-16 13:29 ` Ankit Navik
  2020-03-16 13:43   ` Chris Wilson
  2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Ankit Navik @ 2020-03-16 13:29 UTC (permalink / raw)
To: intel-gfx; +Cc: ankit.p.navik

This patch gives us the active pending request count which is yet
to be submitted to the GPU.

V2:
 * Change 64-bit to atomic for request count. (Tvrtko Ursulin)

V3:
 * Remove mutex for request count.
 * Rebase.
 * Fixes hitting underflow for predictive request. (Tvrtko Ursulin)

V4:
 * Rebase.

V5:
 * Rebase.

V6:
 * Rebase.

V7:
 * Rebase.
 * Add GEM_BUG_ON for req_cnt.

Cc: Vipin Anand <vipin.anand@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 1 +
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 5 +++++
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c    | 2 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c               | 9 +++++++++
 4 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 026999b34abd..d0ff999429ff 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -879,6 +879,7 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 	}
 
 	trace_i915_context_create(ctx);
+	atomic_set(&ctx->req_cnt, 0);
 
 	return ctx;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 28760bd03265..a9ba13f8865e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -171,6 +171,11 @@ struct i915_gem_context {
 	 */
 	struct radix_tree_root handles_vma;
 
+	/** req_cnt: tracks the pending commands, based on which we decide to
+	 * go for low/medium/high load configuration of the GPU.
+	 */
+	atomic_t req_cnt;
+
 	/**
 	 * @name: arbitrary name, used for user debug
 	 *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d3f4f28e9468..f90c968f95cd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2565,6 +2565,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (batch->private)
 		intel_engine_pool_mark_active(batch->private, eb.request);
 
+	atomic_inc(&eb.gem_context->req_cnt);
+
 	trace_i915_request_queue(eb.request, eb.batch_flags);
 	err = eb_submit(&eb, batch);
 err_request:
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 112531b29f59..ccfebebb0071 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2143,6 +2143,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 		}
 
 		if (__i915_request_submit(rq)) {
+			struct i915_gem_context *ctx;
+
 			if (!merge) {
 				*port = execlists_schedule_in(last, port - execlists->pending);
 				port++;
@@ -2158,6 +2160,13 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 
 			submit = true;
 			last = rq;
+
+			ctx = rcu_dereference_protected(
+					rq->context->gem_context, true);
+
+			GEM_BUG_ON(atomic_read(&ctx->req_cnt));
+			if (atomic_read(&ctx->req_cnt) > 0)
+				atomic_dec(&ctx->req_cnt);
 		}
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* Re: [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context
  2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context Ankit Navik
@ 2020-03-16 13:43   ` Chris Wilson
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Wilson @ 2020-03-16 13:43 UTC (permalink / raw)
To: Ankit Navik, intel-gfx; +Cc: ankit.p.navik

Quoting Ankit Navik (2020-03-16 13:29:49)
> This patch gives us the active pending request count which is yet
> to be submitted to the GPU.
> 
> V2:
>  * Change 64-bit to atomic for request count. (Tvrtko Ursulin)
> 
> V3:
>  * Remove mutex for request count.
>  * Rebase.
>  * Fixes hitting underflow for predictive request. (Tvrtko Ursulin)
> 
> V4:
>  * Rebase.
> 
> V5:
>  * Rebase.
> 
> V6:
>  * Rebase.
> 
> V7:
>  * Rebase.
>  * Add GEM_BUG_ON for req_cnt.
> 
> Cc: Vipin Anand <vipin.anand@intel.com>
> Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c       | 1 +
>  drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 5 +++++
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c    | 2 ++
>  drivers/gpu/drm/i915/gt/intel_lrc.c               | 9 +++++++++
>  4 files changed, 17 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index 026999b34abd..d0ff999429ff 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -879,6 +879,7 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
>  	}
>  
>  	trace_i915_context_create(ctx);
> +	atomic_set(&ctx->req_cnt, 0);
>  
>  	return ctx;
>  }
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> index 28760bd03265..a9ba13f8865e 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
> @@ -171,6 +171,11 @@ struct i915_gem_context {
>  	 */
>  	struct radix_tree_root handles_vma;
>  
> +	/** req_cnt: tracks the pending commands, based on which we decide to
> +	 * go for low/medium/high load configuration of the GPU.
> +	 */
> +	atomic_t req_cnt;
> +
>  	/**
>  	 * @name: arbitrary name, used for user debug
>  	 *
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d3f4f28e9468..f90c968f95cd 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2565,6 +2565,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
>  	if (batch->private)
>  		intel_engine_pool_mark_active(batch->private, eb.request);
>  
> +	atomic_inc(&eb.gem_context->req_cnt);
> +
>  	trace_i915_request_queue(eb.request, eb.batch_flags);
>  	err = eb_submit(&eb, batch);
>  err_request:
> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> index 112531b29f59..ccfebebb0071 100644
> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> @@ -2143,6 +2143,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  		}
>  
>  		if (__i915_request_submit(rq)) {
> +			struct i915_gem_context *ctx;
> +
>  			if (!merge) {
>  				*port = execlists_schedule_in(last, port - execlists->pending);
>  				port++;
> @@ -2158,6 +2160,13 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
>  
>  			submit = true;
>  			last = rq;
> +
> +			ctx = rcu_dereference_protected(
> +					rq->context->gem_context, true);
> +
> +			GEM_BUG_ON(atomic_read(&ctx->req_cnt));
> +			if (atomic_read(&ctx->req_cnt) > 0)
> +				atomic_dec(&ctx->req_cnt);

This is wrong on so many levels. The GEM context is an opaque pointer
here, and often not available. The rcu_dereference_protected is woeful.
There is not even a 1:1 relationship between execbuf and requests -- you
should have recognised that the moment you "handled" the bug.

Please do look at the other metrics we have time and time again pointed
you towards.
-Chris

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [Intel-gfx] [PATCH v7 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  2020-03-16 13:29 [Intel-gfx] [PATCH v7 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
  2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context Ankit Navik
@ 2020-03-16 13:29 ` Ankit Navik
  2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 3/3] drm/i915: Predictive governor to control slice/subslice/eu Ankit Navik
  2020-03-16 21:50 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel (rev3) Patchwork
  3 siblings, 0 replies; 8+ messages in thread
From: Ankit Navik @ 2020-03-16 13:29 UTC (permalink / raw)
To: intel-gfx; +Cc: ankit.p.navik

This patch selects the optimum eu/slice/sub-slice configuration based on
the type of load (low, medium, high) given as input.

Based on our readings and experiments, we have a predefined set of
optimum configurations for each platform (CHT, KBL).
i915_gem_context_set_load_type selects the optimum configuration from the
predefined configuration table (opt_config). It also introduces the flag
update_render_config, which can be set by any governor.

v2:
 * Move static optimum_config to device init time.
 * Rename function to appropriate name, fix data types and patch ordering.
 * Rename prev_load_type to pending_load_type. (Tvrtko Ursulin)

v3:
 * Add safe guard check in i915_gem_context_set_load_type.
 * Rename struct from optimum_config to i915_sseu_optimum_config to
   avoid namespace clashes.
 * Reduce memcpy for space efficiency.
 * Rebase.
 * Improved commit message. (Tvrtko Ursulin)

v4:
 * Move optimum config table to file scope. (Tvrtko Ursulin)

v5:
 * Add optimal table of slice/sub-slice/EU for Gen 9 GT1.
 * Rebase.

v6:
 * Rebase.
 * Fix warnings.

v7:
 * Fix return conditions.
 * Remove i915_gem_context_set_load_type and move logic to
   __execlists_update_reg_state. (Tvrtko Ursulin)

Cc: Vipin Anand <vipin.anand@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       |  3 +
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 32 +++++++++++
 drivers/gpu/drm/i915/gt/intel_context_sseu.c      |  2 +
 drivers/gpu/drm/i915/gt/intel_context_types.h     |  2 +
 drivers/gpu/drm/i915/gt/intel_lrc.c               | 70 ++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drv.h                   |  5 ++
 drivers/gpu/drm/i915/intel_device_info.c          | 55 +++++++++++++++++-
 7 files changed, 165 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index d0ff999429ff..3aad45b0ba5a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -880,6 +880,9 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 
 	trace_i915_context_create(ctx);
 	atomic_set(&ctx->req_cnt, 0);
+	ctx->slice_cnt = hweight8(RUNTIME_INFO(i915)->sseu.slice_mask);
+	ctx->subslice_cnt = hweight8(RUNTIME_INFO(i915)->sseu.subslice_mask[0]);
+	ctx->eu_cnt = RUNTIME_INFO(i915)->sseu.eu_per_subslice;
 
 	return ctx;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index a9ba13f8865e..1af1acd73794 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -46,6 +46,19 @@ struct i915_gem_engines_iter {
 	const struct i915_gem_engines *engines;
 };
 
+enum gem_load_type {
+	LOAD_TYPE_LOW,
+	LOAD_TYPE_MEDIUM,
+	LOAD_TYPE_HIGH,
+	LOAD_TYPE_LAST
+};
+
+struct i915_sseu_optimum_config {
+	u8 slice;
+	u8 subslice;
+	u8 eu;
+};
+
 /**
  * struct i915_gem_context - client state
  *
@@ -155,6 +168,25 @@ struct i915_gem_context {
 	 */
 	atomic_t active_count;
 
+	/** slice_cnt: used to set the # of slices to be enabled. */
+	u8 slice_cnt;
+
+	/** subslice_cnt: used to set the # of subslices to be enabled. */
+	u8 subslice_cnt;
+
+	/** eu_cnt: used to set the # of eu to be enabled. */
+	u8 eu_cnt;
+
+	/** load_type: The designated load_type (high/medium/low) for a given
+	 * number of pending commands in the command queue.
+	 */
+	enum gem_load_type load_type;
+
+	/** pending_load_type: The earlier load type that the GPU was
+	 * configured for (high/medium/low).
+	 */
+	enum gem_load_type pending_load_type;
+
 	/**
	 * @hang_timestamp: The last time(s) this context caused a GPU hang
	 */
diff --git a/drivers/gpu/drm/i915/gt/intel_context_sseu.c b/drivers/gpu/drm/i915/gt/intel_context_sseu.c
index 57a30956c922..4f51bfb9690c 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_sseu.c
+++ b/drivers/gpu/drm/i915/gt/intel_context_sseu.c
@@ -84,6 +84,8 @@ intel_context_reconfigure_sseu(struct intel_context *ce,
 	if (ret)
 		return ret;
 
+	ce->user_sseu = true;
+
 	/* Nothing to do if unmodified. */
 	if (!memcmp(&ce->sseu, &sseu, sizeof(sseu)))
 		goto unlock;
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 0f3b68b95c56..fd5811110026 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -93,6 +93,8 @@ struct intel_context {
 
 	const struct intel_context_ops *ops;
 
+	bool user_sseu;
+
 	/** sseu: Control eu/slice partitioning */
 	struct intel_sseu sseu;
 };
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index ccfebebb0071..7c5f05886278 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -177,6 +177,14 @@
 /* Typical size of the average request (2 pipecontrols and a MI_BB) */
 #define EXECLISTS_REQUEST_SIZE 64 /* bytes */
 
+/*
+ * Anything above threshold is considered as HIGH load, less is considered
+ * as LOW load and equal is considered as MEDIUM load.
+ *
+ * The threshold value of three active requests pending.
+ */
+#define PENDING_THRESHOLD_MEDIUM 3
+
 struct virtual_engine {
 	struct intel_engine_cs base;
 	struct intel_context context;
@@ -3002,6 +3010,36 @@ static void execlists_context_unpin(struct intel_context *ce)
 	i915_gem_object_unpin_map(ce->state->obj);
 }
 
+static u32
+get_context_rpcs_config(struct i915_gem_context *ctx)
+{
+	u32 rpcs = 0;
+	struct drm_i915_private *dev_priv = ctx->i915;
+
+	if (INTEL_GEN(dev_priv) < 8)
+		return 0;
+
+	if (RUNTIME_INFO(dev_priv)->sseu.has_slice_pg) {
+		rpcs |= GEN8_RPCS_S_CNT_ENABLE;
+		rpcs |= ctx->slice_cnt << GEN8_RPCS_S_CNT_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	if (RUNTIME_INFO(dev_priv)->sseu.has_subslice_pg) {
+		rpcs |= GEN8_RPCS_SS_CNT_ENABLE;
+		rpcs |= ctx->subslice_cnt << GEN8_RPCS_SS_CNT_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	if (RUNTIME_INFO(dev_priv)->sseu.has_eu_pg) {
+		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MIN_SHIFT;
+		rpcs |= ctx->eu_cnt << GEN8_RPCS_EU_MAX_SHIFT;
+		rpcs |= GEN8_RPCS_ENABLE;
+	}
+
+	return rpcs;
+}
+
 static void
 __execlists_update_reg_state(const struct intel_context *ce,
 			     const struct intel_engine_cs *engine,
@@ -3009,6 +3047,10 @@ __execlists_update_reg_state(const struct intel_context *ce,
 {
 	struct intel_ring *ring = ce->ring;
 	u32 *regs = ce->lrc_reg_state;
+	const struct i915_sseu_optimum_config *cfg;
+	struct i915_gem_context *ctx;
+	enum gem_load_type load_type;
+	u32 req_pending;
 
 	GEM_BUG_ON(!intel_ring_offset_valid(ring, head));
 	GEM_BUG_ON(!intel_ring_offset_valid(ring, ring->tail));
@@ -3018,10 +3060,31 @@ __execlists_update_reg_state(const struct intel_context *ce,
 	regs[CTX_RING_TAIL] = ring->tail;
 	regs[CTX_RING_CTL] = RING_CTL_SIZE(ring->size) | RING_VALID;
 
+	GEM_BUG_ON(ce->engine->class != RENDER_CLASS);
+	ctx = rcu_dereference_protected(ce->gem_context, true);
+
+	req_pending = atomic_read(&ctx->req_cnt);
+
+	if (req_pending > PENDING_THRESHOLD_MEDIUM)
+		load_type = LOAD_TYPE_HIGH;
+	else if (req_pending == PENDING_THRESHOLD_MEDIUM)
+		load_type = LOAD_TYPE_MEDIUM;
+	else
+		load_type = LOAD_TYPE_LOW;
+
+	cfg = &ctx->i915->opt_config[load_type];
+
 	/* RPCS */
 	if (engine->class == RENDER_CLASS) {
-		regs[CTX_R_PWR_CLK_STATE] =
-			intel_sseu_make_rpcs(engine->i915, &ce->sseu);
+
+		if (!ctx || !ctx->i915->predictive_load_enable
+				|| ce->user_sseu) {
+			regs[CTX_R_PWR_CLK_STATE] =
+				intel_sseu_make_rpcs(engine->i915, &ce->sseu);
+		} else {
+			regs[CTX_R_PWR_CLK_STATE] =
+				get_context_rpcs_config(ce->gem_context);
+		}
 
 		i915_oa_init_reg_state(ce, engine);
 	}
@@ -3046,6 +3109,9 @@ __execlists_context_pin(struct intel_context *ce,
 	ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
 	__execlists_update_reg_state(ce, engine, ce->ring->tail);
 
+	if (ce->gem_context->load_type != ce->gem_context->pending_load_type)
+		ce->gem_context->load_type = ce->gem_context->pending_load_type;
+
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1f5b9a584f71..304d95aa4974 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -926,6 +926,11 @@ struct drm_i915_private {
 	/* protects panel power sequencer state */
 	struct mutex pps_mutex;
 
+	/* optimal slice/subslice/EU configration state */
+	struct i915_sseu_optimum_config *opt_config;
+
+	bool predictive_load_enable;
+
 	unsigned int fsb_freq, mem_freq, is_ddr3;
 	unsigned int skl_preferred_vco_freq;
 	unsigned int max_cdclk_freq;
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index d7fe12734db8..53d966a9097e 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -899,6 +899,34 @@ void intel_device_info_subplatform_init(struct drm_i915_private *i915)
 	RUNTIME_INFO(i915)->platform_mask[pi] |= mask;
 }
 
+/* static table of slice/subslice/EU for Cherryview */
+static const struct i915_sseu_optimum_config chv_config[LOAD_TYPE_LAST] = {
+	{1, 1, 4},	/* Low */
+	{1, 1, 6},	/* Medium */
+	{1, 2, 6}	/* High */
+};
+
+/* static table of slice/subslice/EU for GLK GT1 */
+static const struct i915_sseu_optimum_config glk_gt1_config[LOAD_TYPE_LAST] = {
+	{1, 2, 2},	/* Low */
+	{1, 2, 3},	/* Medium */
+	{1, 2, 6}	/* High */
+};
+
+/* static table of slice/subslice/EU for KBL GT2 */
+static const struct i915_sseu_optimum_config kbl_gt2_config[LOAD_TYPE_LAST] = {
+	{1, 3, 2},	/* Low */
+	{1, 3, 4},	/* Medium */
+	{1, 3, 8}	/* High */
+};
+
+/* static table of slice/subslice/EU for KBL GT3 */
+static const struct i915_sseu_optimum_config kbl_gt3_config[LOAD_TYPE_LAST] = {
+	{2, 3, 4},	/* Low */
+	{2, 3, 6},	/* Medium */
+	{2, 3, 8}	/* High */
+};
+
 /**
  * intel_device_info_runtime_init - initialize runtime info
  * @dev_priv: the i915 device
@@ -1027,12 +1055,35 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 	/* Initialize slice/subslice/EU info */
 	if (IS_HASWELL(dev_priv))
 		hsw_sseu_info_init(dev_priv);
-	else if (IS_CHERRYVIEW(dev_priv))
+	else if (IS_CHERRYVIEW(dev_priv)) {
 		cherryview_sseu_info_init(dev_priv);
+		BUILD_BUG_ON(ARRAY_SIZE(chv_config) != LOAD_TYPE_LAST);
+		dev_priv->opt_config = chv_config;
+	}
 	else if (IS_BROADWELL(dev_priv))
 		bdw_sseu_info_init(dev_priv);
-	else if (IS_GEN(dev_priv, 9))
+	else if (IS_GEN(dev_priv, 9)) {
 		gen9_sseu_info_init(dev_priv);
+
+		switch (info->gt) {
+		default: /* fall through */
+		case 1:
+			BUILD_BUG_ON(ARRAY_SIZE(glk_gt1_config) !=
+					LOAD_TYPE_LAST);
+			dev_priv->opt_config = glk_gt1_config;
+			break;
+		case 2:
+			BUILD_BUG_ON(ARRAY_SIZE(kbl_gt2_config) !=
+					LOAD_TYPE_LAST);
+			dev_priv->opt_config = kbl_gt2_config;
+			break;
+		case 3:
+			BUILD_BUG_ON(ARRAY_SIZE(kbl_gt3_config) !=
+					LOAD_TYPE_LAST);
+			dev_priv->opt_config = kbl_gt3_config;
+			break;
+		}
+	}
 	else if (IS_GEN(dev_priv, 10))
 		gen10_sseu_info_init(dev_priv);
 	else if (IS_GEN(dev_priv, 11))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* [Intel-gfx] [PATCH v7 3/3] drm/i915: Predictive governor to control slice/subslice/eu
  2020-03-16 13:29 [Intel-gfx] [PATCH v7 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
  2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context Ankit Navik
  2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
@ 2020-03-16 13:29 ` Ankit Navik
  2020-03-16 21:50 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel (rev3) Patchwork
  3 siblings, 0 replies; 8+ messages in thread
From: Ankit Navik @ 2020-03-16 13:29 UTC (permalink / raw)
To: intel-gfx; +Cc: ankit.p.navik

Load classification is used by the predictive governor to control
eu/slice/subslice based on the workload. A sysfs entry is provided to
enable/disable the feature.

V2:
 * Fix code style.
 * Move predictive_load_timer into a drm_i915_private structure.
 * Make generic function to set optimum config. (Tvrtko Ursulin)

V3:
 * Rebase.
 * Fix race condition for predictive load set.
 * Add slack to start hrtimer for more power efficiency. (Tvrtko Ursulin)

V4:
 * Fix data type and initialization of mutex to protect predictive load
   state.
 * Move predictive timer init to i915_gem_init_early. (Tvrtko Ursulin)
 * Move debugfs to kernel parameter.

V5:
 * Rebase.
 * Remove mutex for pred_timer.

V6:
 * Rebase.
 * Fix warnings.

V7:
 * Drop timer and move logic to __execlists_update_reg_state.
   (Tvrtko Ursulin)
 * Remove kernel boot param and make it a sysfs entry. (Jani Nikula)

Cc: Vipin Anand <vipin.anand@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/i915_sysfs.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 45d32ef42787..5d76e4992c8d 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -433,12 +433,43 @@ static ssize_t gt_min_freq_mhz_store(struct device *kdev,
 	return ret ?: count;
 }
 
+static ssize_t deu_enable_show(struct device *kdev, struct device_attribute *attr, char *buf)
+{
+	struct drm_i915_private *i915 = kdev_minor_to_i915(kdev);
+
+	return snprintf(buf, PAGE_SIZE, "%u\n", i915->predictive_load_enable);
+}
+
+static ssize_t deu_enable_store(struct device *kdev,
+				struct device_attribute *attr,
+				const char *buf,
+				size_t count)
+{
+	struct drm_i915_private *i915 = kdev_minor_to_i915(kdev);
+	ssize_t ret;
+	u32 val;
+
+	ret = kstrtou32(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	/* Check invalid values */
+	if (val != 0 && val != 1)
+		ret = -EINVAL;
+
+	i915->predictive_load_enable = val;
+
+	return count;
+}
+
 static DEVICE_ATTR_RO(gt_act_freq_mhz);
 static DEVICE_ATTR_RO(gt_cur_freq_mhz);
 static DEVICE_ATTR_RW(gt_boost_freq_mhz);
 static DEVICE_ATTR_RW(gt_max_freq_mhz);
 static DEVICE_ATTR_RW(gt_min_freq_mhz);
+static DEVICE_ATTR_RW(deu_enable);
+
 static DEVICE_ATTR_RO(vlv_rpe_freq_mhz);
 
 static ssize_t gt_rp_mhz_show(struct device *kdev, struct device_attribute *attr, char *buf);
@@ -474,6 +505,7 @@ static const struct attribute * const gen6_attrs[] = {
 	&dev_attr_gt_RP0_freq_mhz.attr,
 	&dev_attr_gt_RP1_freq_mhz.attr,
 	&dev_attr_gt_RPn_freq_mhz.attr,
+	&dev_attr_deu_enable.attr,
 	NULL,
 };
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel (rev3)
  2020-03-16 13:29 [Intel-gfx] [PATCH v7 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
  ` (2 preceding siblings ...)
  2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 3/3] drm/i915: Predictive governor to control slice/subslice/eu Ankit Navik
@ 2020-03-16 21:50 ` Patchwork
  3 siblings, 0 replies; 8+ messages in thread
From: Patchwork @ 2020-03-16 21:50 UTC (permalink / raw)
To: Ankit Navik; +Cc: intel-gfx

== Series Details ==

Series: drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel (rev3)
URL   : https://patchwork.freedesktop.org/series/57989/
State : failure

== Summary ==

  CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
  CC [M]  drivers/gpu/drm/i915/intel_device_info.o
drivers/gpu/drm/i915/intel_device_info.c: In function ‘intel_device_info_runtime_init’:
drivers/gpu/drm/i915/intel_device_info.c:1061:24: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
   dev_priv->opt_config = chv_config;
                        ^
drivers/gpu/drm/i915/intel_device_info.c:1073:25: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
    dev_priv->opt_config = glk_gt1_config;
                         ^
drivers/gpu/drm/i915/intel_device_info.c:1078:25: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
    dev_priv->opt_config = kbl_gt2_config;
                         ^
drivers/gpu/drm/i915/intel_device_info.c:1083:25: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
    dev_priv->opt_config = kbl_gt3_config;
                         ^
cc1: all warnings being treated as errors
scripts/Makefile.build:267: recipe for target 'drivers/gpu/drm/i915/intel_device_info.o' failed
make[4]: *** [drivers/gpu/drm/i915/intel_device_info.o] Error 1
scripts/Makefile.build:505: recipe for target 'drivers/gpu/drm/i915' failed
make[3]: *** [drivers/gpu/drm/i915] Error 2
scripts/Makefile.build:505: recipe for target 'drivers/gpu/drm' failed
make[2]: *** [drivers/gpu/drm] Error 2
scripts/Makefile.build:505: recipe for target 'drivers/gpu' failed
make[1]: *** [drivers/gpu] Error 2
Makefile:1683: recipe for target 'drivers' failed
make: *** [drivers] Error 2

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [Intel-gfx] [PATCH v7 0/3] Dynamic EU configuration of Slice/Sub-slice/EU
@ 2020-03-16 13:36 Ankit Navik
  2020-03-16 13:36 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context Ankit Navik
  0 siblings, 1 reply; 8+ messages in thread
From: Ankit Navik @ 2020-03-16 13:36 UTC (permalink / raw)
To: intel-gfx; +Cc: ankit.p.navik

drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel

This patch set improves GPU power consumption on Linux kernel based OSes
such as Chromium OS, Ubuntu, etc. The power savings are as follows.

Power savings on the GLK-GT1 Bobba platform running Chrome OS:

-----------------------------------------------|
App / KPI               | % Power Benefit (mW) |
------------------------|----------------------|
Hangout Call- 20 minute |        1.8%          |
Youtube 4K VPB          |       14.13%         |
WebGL Aquarium          |       13.76%         |
Unity3D                 |        6.78%         |
                        |                      |
------------------------|----------------------|
Chrome PLT              | Battery life improves|
                        | by ~45 minutes       |
-----------------------------------------------|

Power savings on KBL-GT3 running Android and Ubuntu (Linux):

-----------------------------------------------|
App / KPI               | % Power Benefit (mW) |
                        |----------------------|
                        | Android  |  Ubuntu   |
------------------------|----------|-----------|
3D Mark (Ice storm)     |  2.30%   |   N.A.    |
TRex On screen          |  2.49%   |   2.97%   |
Manhattan On screen     |  3.11%   |   4.90%   |
Carchase On Screen      |  N.A.    |   5.06%   |
AnTuTu 6.1.4            |  3.42%   |   N.A.    |
SynMark2                |  N.A.    |   1.7%    |
-----------------------------------------------|

We have also observed that GPU core residency improves by 1.035%.

Technical insights of the patch:

The current GPU configuration code for i915 does not allow us to change
the EU/Slice/Sub-slice configuration dynamically. It is done only once,
while the context is created. While a particular graphics application is
running, if we examine the command requests from user space, we observe
that command density is not consistent.
This means there is scope to change the graphics configuration dynamically
even while the context is actively running. This patch series proposes a
solution that determines the pending load across all active contexts at a
given time and, based on that, dynamically performs the graphics
configuration for each context. The feature can be enabled using sysfs.

We examine the pending commands for a context in the queue; essentially,
we intercept them before they are executed by the GPU, and we update the
context with the required number of EUs. For the former (examining pending
commands), empirical data to achieve the best performance at the least
power was considered. For the latter (the number of EUs), we roughly
categorized the EU counts logically per platform. We then compare the
number of pending commands against a particular threshold and set the
number of EUs accordingly with a context update. That threshold is also
based on experiments and findings. If the GPU is able to keep up with the
CPU, there are typically no pending commands and the EU configuration
remains unchanged. If there are more pending commands, we reprogram the
context with a higher number of EUs.
Ankit Navik (3):
  drm/i915: Get active pending request for given context
  drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  drm/i915: Predictive governor to control slice/subslice/eu

 drivers/gpu/drm/i915/gem/i915_gem_context.c       |  4 ++
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 37 +++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c    |  2 +
 drivers/gpu/drm/i915/gt/intel_context_sseu.c      |  2 +
 drivers/gpu/drm/i915/gt/intel_context_types.h     |  2 +
 drivers/gpu/drm/i915/gt/intel_lrc.c               | 79 ++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_drv.h                   |  5 ++
 drivers/gpu/drm/i915/i915_sysfs.c                 | 32 +++++++++
 drivers/gpu/drm/i915/intel_device_info.c          | 55 +++++++++++++++-
 9 files changed, 214 insertions(+), 4 deletions(-)

--
2.7.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context
  2020-03-16 13:36 [Intel-gfx] [PATCH v7 0/3] Dynamic EU configuration of Slice/Sub-slice/EU Ankit Navik
@ 2020-03-16 13:36 ` Ankit Navik
  0 siblings, 0 replies; 8+ messages in thread
From: Ankit Navik @ 2020-03-16 13:36 UTC (permalink / raw)
To: intel-gfx; +Cc: ankit.p.navik

This patch gives us the active pending request count which is yet to be
submitted to the GPU.

V2:
 * Change 64-bit to atomic for request count. (Tvrtko Ursulin)
V3:
 * Remove mutex for request count.
 * Rebase.
 * Fixes hitting underflow for predictive request. (Tvrtko Ursulin)
V4:
 * Rebase.
V5:
 * Rebase.
V6:
 * Rebase.
V7:
 * Rebase.
 * Add GEM_BUG_ON for req_cnt.

Cc: Vipin Anand <vipin.anand@intel.com>
Signed-off-by: Ankit Navik <ankit.p.navik@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 1 +
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 5 +++++
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c    | 2 ++
 drivers/gpu/drm/i915/gt/intel_lrc.c               | 9 +++++++++
 4 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 026999b34abd..d0ff999429ff 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -879,6 +879,7 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 	}

 	trace_i915_context_create(ctx);
+	atomic_set(&ctx->req_cnt, 0);

 	return ctx;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 28760bd03265..a9ba13f8865e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -171,6 +171,11 @@ struct i915_gem_context {
 	 */
 	struct radix_tree_root handles_vma;

+	/** req_cnt: tracks the pending commands, based on which we decide to
+	 * go for low/medium/high load configuration of the GPU.
+	 */
+	atomic_t req_cnt;
+
 	/**
 	 * @name: arbitrary name, used for user debug
 	 *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d3f4f28e9468..f90c968f95cd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2565,6 +2565,8 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (batch->private)
 		intel_engine_pool_mark_active(batch->private, eb.request);

+	atomic_inc(&eb.gem_context->req_cnt);
+
 	trace_i915_request_queue(eb.request, eb.batch_flags);
 	err = eb_submit(&eb, batch);
 err_request:
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 112531b29f59..ccfebebb0071 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2143,6 +2143,8 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
 			}

 			if (__i915_request_submit(rq)) {
+				struct i915_gem_context *ctx;
+
 				if (!merge) {
 					*port = execlists_schedule_in(last, port - execlists->pending);
 					port++;
@@ -2158,6 +2160,13 @@ static void execlists_dequeue(struct intel_engine_cs *engine)

 				submit = true;
 				last = rq;
+
+				ctx = rcu_dereference_protected(
+					rq->context->gem_context, true);
+
+				GEM_BUG_ON(atomic_read(&ctx->req_cnt));
+				if (atomic_read(&ctx->req_cnt) > 0)
+					atomic_dec(&ctx->req_cnt);
 			}
 		}

--
2.7.4
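The counter lifecycle this patch introduces (initialize at context creation, increment at execbuffer, guarded decrement at execlists dequeue) can be modeled in userspace with C11 atomics. The struct and function names below are illustrative, not the driver's:

```c
#include <stdatomic.h>

/* Minimal userspace model of the per-context pending-request counter. */
struct ctx_model {
	atomic_int req_cnt;	/* models i915_gem_context::req_cnt */
};

/* Models i915_gem_do_execbuffer(): a request is queued from user space. */
static void on_execbuffer(struct ctx_model *ctx)
{
	atomic_fetch_add(&ctx->req_cnt, 1);
}

/* Models execlists_dequeue(): a request is handed to the GPU. Note the
 * check-then-decrement pair is not atomic as a unit; the patch uses the
 * same pattern, which is why an underflow guard was added in V3.
 */
static void on_dequeue(struct ctx_model *ctx)
{
	if (atomic_load(&ctx->req_cnt) > 0)
		atomic_fetch_sub(&ctx->req_cnt, 1);
}
```

Each tick of the governor can then read `req_cnt` as a cheap proxy for how far the GPU is trailing user-space submissions.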
* [Intel-gfx] [PATCH v7 0/3] Dynamic EU configuration of Slice/Sub-slice/EU
@ 2020-03-13 11:12 srinivasan.s
  2020-03-13 11:12 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context srinivasan.s
  0 siblings, 1 reply; 8+ messages in thread
From: srinivasan.s @ 2020-03-13 11:12 UTC (permalink / raw)
To: intel-gfx, chris, tvrtko.ursulin

From: Srinivasan S <srinivasan.s@intel.com>

drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel

This patch set improves GPU power consumption on Linux kernel based OSes
such as Chromium OS, Ubuntu, etc. The power savings follow.

Power savings on the GLK-GT1 Bobba platform running Chrome OS:

------------------------|----------------------|
App / KPI               | Power Benefit        |
------------------------|----------------------|
Hangout Call, 20 minute | 1.8%                 |
Youtube 4K VPB          | 14.13%               |
WebGL Aquarium          | 13.76%               |
Unity3D                 | 6.78%                |
------------------------|----------------------|
Chrome PLT              | Battery life improves|
                        | by ~45 minutes       |
------------------------|----------------------|

Power savings on KBL-GT3 running Android and Ubuntu (Linux):

------------------------|----------------------|
App / KPI               | Power Benefit        |
                        |----------|-----------|
                        | Android  | Ubuntu    |
------------------------|----------|-----------|
3D Mark (Ice Storm)     | 2.30%    | N.A.      |
TRex On screen          | 2.49%    | 2.97%     |
Manhattan On screen     | 3.11%    | 4.90%     |
Carchase On Screen      | N.A.     | 5.06%     |
AnTuTu 6.1.4            | 3.42%    | N.A.      |
SynMark2                | N.A.     | 1.7%      |
------------------------|----------|-----------|

We have also observed GPU core residencies improve by 1.035%.

Technical insights of the patch:

The current GPU configuration code for i915 does not allow us to change
the EU/Slice/Sub-slice configuration dynamically; it is done only once,
when the context is created. While a particular graphics application is
running, if we examine the command requests from user space, we observe
that the command density is not consistent.
This means there is scope to change the graphics configuration
dynamically even while the context is actively running. This patch
series proposes a solution: find the pending load for all active
contexts at a given time and, based on that, dynamically perform the
graphics configuration for each context. We use a high-resolution (hr)
timer in the i915 kernel driver to get a callback every few milliseconds
(the timer value can be configured through debugfs; the default is '0',
meaning the timer is disabled, i.e. the original system without any
intervention). In the timer callback, we examine the pending commands
for a context in the queue (essentially, we intercept them before the
GPU executes them) and update the context with the required number of
EUs.

Two questions arise: how did we arrive at the right timer value, and
what is the right number of EUs? For the former, empirical data for the
best performance at the least power was considered. For the latter, we
roughly categorized the number of EUs logically based on the platform.
We then compare the number of pending commands against a threshold and
set the number of EUs accordingly with a context update; the threshold
is also based on experiments and findings. If the GPU is able to keep up
with the CPU, there are typically no pending commands, and the EU
configuration remains unchanged. When there are more pending commands,
we reprogram the context with a higher number of EUs. Note that we are
changing EUs even while the context is running, by examining the pending
commands every 'x' milliseconds.
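One governor tick under those rules can be sketched as follows; a period of 0 disables the governor, matching the debugfs default above. The thresholds, EU counts, and function name are illustrative assumptions, not the driver's per-platform tables:

```c
#include <assert.h>

/* Illustrative EU counts per load level (real values are per-platform). */
static const unsigned int eu_table[3] = { 2, 4, 8 };	/* low, medium, high */

/* One hr-timer callback: pick the EU count to program for a context.
 * A period of 0 means the governor is disabled; return 0 for "no change".
 */
static unsigned int governor_tick(unsigned int period_ms, unsigned int req_cnt)
{
	if (period_ms == 0)
		return 0;
	if (req_cnt >= 4)		/* assumed high-load threshold */
		return eu_table[2];
	if (req_cnt >= 2)		/* assumed medium-load threshold */
		return eu_table[1];
	return eu_table[0];
}
```

An empty queue keeps the context at the low configuration, so a GPU that keeps up with the CPU is never reprogrammed upward.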
Srinivasan S (3):
  drm/i915: Get active pending request for given context
  drm/i915: set optimum eu/slice/sub-slice configuration based on load type
  drm/i915: Predictive governor to control slice/subslice/eu

 drivers/gpu/drm/i915/Makefile                     |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c       |  20 +++++
 drivers/gpu/drm/i915/gem/i915_gem_context.h       |   2 +
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h |  38 ++++++++
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c    |   1 +
 drivers/gpu/drm/i915/gt/intel_deu.c               | 104 ++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_deu.h               |  31 +++++++
 drivers/gpu/drm/i915/gt/intel_lrc.c               |  44 ++++++++-
 drivers/gpu/drm/i915/i915_drv.h                   |   6 ++
 drivers/gpu/drm/i915/i915_gem.c                   |   4 +
 drivers/gpu/drm/i915/i915_params.c                |   4 +
 drivers/gpu/drm/i915/i915_params.h                |   1 +
 drivers/gpu/drm/i915/intel_device_info.c          |  74 ++++++++++++++-
 13 files changed, 325 insertions(+), 5 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gt/intel_deu.c
 create mode 100644 drivers/gpu/drm/i915/gt/intel_deu.h

--
2.7.4
* [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context
  2020-03-13 11:12 [Intel-gfx] [PATCH v7 0/3] Dynamic EU configuration of Slice/Sub-slice/EU srinivasan.s
@ 2020-03-13 11:12 ` srinivasan.s
  0 siblings, 0 replies; 8+ messages in thread
From: srinivasan.s @ 2020-03-13 11:12 UTC (permalink / raw)
To: intel-gfx, chris, tvrtko.ursulin

From: Srinivasan S <srinivasan.s@intel.com>

This patch gives us the active pending request count which is yet to be
submitted to the GPU.

V2:
 * Change 64-bit to atomic for request count. (Tvrtko Ursulin)
V3:
 * Remove mutex for request count.
 * Rebase.
 * Fixes hitting underflow for predictive request. (Tvrtko Ursulin)
V4:
 * Rebase.
V5:
 * Rebase.
V6:
 * Rebase.
V7:
 * Added static table of slice/subslice/EU for JSL GEN11-LP.
 * Rebase.

Signed-off-by: Srinivasan S <srinivasan.s@intel.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c       | 1 +
 drivers/gpu/drm/i915/gem/i915_gem_context_types.h | 6 ++++++
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c    | 1 +
 drivers/gpu/drm/i915/gt/intel_lrc.c               | 3 +++
 4 files changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 026999b34abd..d0ff999429ff 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -879,6 +879,7 @@ i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
 	}

 	trace_i915_context_create(ctx);
+	atomic_set(&ctx->req_cnt, 0);

 	return ctx;
 }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index 28760bd03265..e26e94a0ab07 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -172,6 +172,12 @@ struct i915_gem_context {
 	struct radix_tree_root handles_vma;

 	/**
+	 * req_cnt: tracks the pending commands, based on which we decide to
+	 * go for low/medium/high load configuration of the GPU.
+	 */
+	atomic_t req_cnt;
+
+	/**
 	 * @name: arbitrary name, used for user debug
 	 *
 	 * A name is constructed for the context from the creator's process
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d3f4f28e9468..2fe9ab20ec97 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2565,6 +2565,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	if (batch->private)
 		intel_engine_pool_mark_active(batch->private, eb.request);

+	atomic_inc(&eb.gem_context->req_cnt);
 	trace_i915_request_queue(eb.request, eb.batch_flags);
 	err = eb_submit(&eb, batch);
 err_request:
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 112531b29f59..c58fc4329944 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2158,6 +2158,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)

 				submit = true;
 				last = rq;
+
+				if (atomic_read(&rq->context->gem_context->req_cnt) > 0)
+					atomic_dec(&rq->context->gem_context->req_cnt);
 			}
 		}

--
2.7.4
end of thread, other threads:[~2020-03-16 21:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-16 13:29 [Intel-gfx] [PATCH v7 0/3] drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel Ankit Navik
2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context Ankit Navik
2020-03-16 13:43   ` Chris Wilson
2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 2/3] drm/i915: set optimum eu/slice/sub-slice configuration based on load type Ankit Navik
2020-03-16 13:29 ` [Intel-gfx] [PATCH v7 3/3] drm/i915: Predictive governor to control slice/subslice/eu Ankit Navik
2020-03-16 21:50 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915: Context aware user agnostic EU/Slice/Sub-slice control within kernel (rev3) Patchwork

-- strict thread matches above, loose matches on Subject: below --
2020-03-16 13:36 [Intel-gfx] [PATCH v7 0/3] Dynamic EU configuration of Slice/Sub-slice/EU Ankit Navik
2020-03-16 13:36 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context Ankit Navik
2020-03-13 11:12 [Intel-gfx] [PATCH v7 0/3] Dynamic EU configuration of Slice/Sub-slice/EU srinivasan.s
2020-03-13 11:12 ` [Intel-gfx] [PATCH v7 1/3] drm/i915: Get active pending request for given context srinivasan.s