From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, Francisco Jerez <currojerez@riseup.net>, intel-gfx@lists.freedesktop.org, linux-pm@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>, "Rafael J. Wysocki" <rjw@rjwysocki.net>, "Pandruvada, Srinivas" <srinivas.pandruvada@intel.com>
Subject: Re: [Intel-gfx] [PATCH 02/10] drm/i915: Adjust PM QoS response frequency based on GPU load.
Date: Wed, 11 Mar 2020 10:00:41 +0000
Message-ID: <ac5fdd3c-bf47-60d3-edef-82d451266dcb@linux.intel.com>
In-Reply-To: <158387916218.28297.4489489879582782488@build.alporthouse.com>

On 10/03/2020 22:26, Chris Wilson wrote:
> Quoting Francisco Jerez (2020-03-10 21:41:55)
>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> index b9b3f78f1324..a5d7a80b826d 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
>> @@ -1577,6 +1577,11 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>>  	/* we need to manually load the submit queue */
>>  	if (execlists->ctrl_reg)
>>  		writel(EL_CTRL_LOAD, execlists->ctrl_reg);
>> +
>> +	if (execlists_num_ports(execlists) > 1 &&
> pending[1] is always defined, the minimum submission is one slot, with
> pending[1] as the sentinel NULL.
>
>> +	    execlists->pending[1] &&
>> +	    !atomic_xchg(&execlists->overload, 1))
>> +		intel_gt_pm_active_begin(&engine->i915->gt);
>
> engine->gt
>
>>  }
>>
>>  static bool ctx_single_port_submission(const struct intel_context *ce)
>> @@ -2213,6 +2218,12 @@ cancel_port_requests(struct intel_engine_execlists * const execlists)
>>  	clear_ports(execlists->inflight, ARRAY_SIZE(execlists->inflight));
>>
>>  	WRITE_ONCE(execlists->active, execlists->inflight);
>> +
>> +	if (atomic_xchg(&execlists->overload, 0)) {
>> +		struct intel_engine_cs *engine =
>> +			container_of(execlists, typeof(*engine), execlists);
>> +		intel_gt_pm_active_end(&engine->i915->gt);
>> +	}
>>  }
>>
>>  static inline void
>> @@ -2386,6 +2397,9 @@ static void process_csb(struct intel_engine_cs *engine)
>>  			/* port0 completed, advanced to port1 */
>>  			trace_ports(execlists, "completed", execlists->active);
>>
>> +			if (atomic_xchg(&execlists->overload, 0))
>> +				intel_gt_pm_active_end(&engine->i915->gt);
>
> So this looses track if we preempt a dual-ELSP submission with a
> single-ELSP submission (and never go back to dual).
>
> If you move this to the end of the loop and check
>
> 	if (!execlists->active[1] && atomic_xchg(&execlists->overload, 0))
> 		intel_gt_pm_active_end(engine->gt);
>
> so that it covers both preemption/promotion and completion.
>
> However, that will fluctuate quite rapidly. (And runs the risk of
> exceeding the sentinel.)
>
> An alternative approach would be to couple along
> schedule_in/schedule_out
>
> 	atomic_set(overload, -1);
>
> 	__execlists_schedule_in:
> 		if (!atomic_fetch_inc(overload)
> 			intel_gt_pm_active_begin(engine->gt);
> 	__execlists_schedule_out:
> 		if (!atomic_dec_return(overload)
> 			intel_gt_pm_active_end(engine->gt);
>
> which would mean we are overloaded as soon as we try to submit an
> overlapping ELSP.

Putting it this low-level into submission code also would not work well
with GuC.

How about we try to keep some accounting one level higher, as the i915
scheduler is passing requests on to the backend for execution? Or number
of runnable contexts, if the distinction between contexts and requests
is better for this purpose.

Problematic bit in going one level higher though is that the exit point
is less precisely coupled to the actual state. Or maybe with aggressive
engine retire we have nowadays it wouldn't be a problem.

Regards,

Tvrtko

>
>
> The metric feels very multiple client (game + display server, or
> saturated transcode) centric. In the endless kernel world, we expect
> 100% engine utilisation from a single context, and never a dual-ELSP
> submission. They are also likely to want to avoid being throttled to
> converse TDP for the CPU.
>
> Should we also reduce the overload for the number of clients who are
> waiting for interrupts from the GPU, so that their wakeup latency is not
> impacted?
> -Chris
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
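For readers following the thread, the counter scheme quoted above (start at -1, begin the active section on the transition to a second in-flight context, end it when the count drops back to one) can be modelled in userspace. What follows is only a sketch under stated assumptions, not driver code: `struct load_tracker`, `model_schedule_in()`, `model_schedule_out()` and `model_overloaded()` are hypothetical names, C11 `atomic_fetch_add()`/`atomic_fetch_sub()` stand in for the kernel's `atomic_fetch_inc()`/`atomic_dec_return()`, and a plain integer stands in for the `intel_gt_pm_active_begin()`/`intel_gt_pm_active_end()` calls. The model deliberately says nothing about where the hooks would live (execlists backend or one level higher in the scheduler), which is the open question in the message above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in i915. */
struct load_tracker {
	atomic_int overload;	/* contexts in flight, minus one */
	int active_sections;	/* stand-in for active_begin/active_end calls */
};

static void load_tracker_init(struct load_tracker *t)
{
	/* Mirrors the quoted atomic_set(overload, -1). */
	atomic_init(&t->overload, -1);
	t->active_sections = 0;
}

static void model_schedule_in(struct load_tracker *t)
{
	/*
	 * atomic_fetch_add() returns the old value, like the kernel's
	 * atomic_fetch_inc(). An old value of 0 means one context was
	 * already in flight, so this one overlaps it.
	 */
	if (atomic_fetch_add(&t->overload, 1) == 0)
		t->active_sections++;	/* would be intel_gt_pm_active_begin() */
}

static void model_schedule_out(struct load_tracker *t)
{
	/* Old value minus one is the new value, like atomic_dec_return(). */
	int now = atomic_fetch_sub(&t->overload, 1) - 1;

	assert(now >= -1);	/* more schedule-outs than schedule-ins is a bug */
	if (now == 0)
		t->active_sections--;	/* would be intel_gt_pm_active_end() */
}

static bool model_overloaded(const struct load_tracker *t)
{
	return t->active_sections > 0;
}
```

In this model a single in-flight context never counts as overloaded, a second one enters the active section, and further contexts leave the state unchanged until the count falls back to one. `active_sections` is not atomic here, so the sketch is only meaningful single-threaded; it illustrates the counting, not the locking.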