From: Francisco Jerez <currojerez@riseup.net>
To: Juri Lelli <juri.lelli@gmail.com>
Cc: Javi Merino <Javi.Merino@arm.com>,
	linux-pm@vger.kernel.org, Peter Zijlstra <peterz@infradead.org>,
	intel-gfx@lists.freedesktop.org,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Patrick Bellasi <patrick.bellasi@arm.com>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Eero Tamminen <eero.t.tamminen@intel.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>
Subject: Re: [PATCH 0/9] GPU-bound energy efficiency improvements for the intel_pstate driver.
Date: Thu, 12 Apr 2018 14:38:04 -0700
Message-ID: <87r2nk2jib.fsf@riseup.net>
In-Reply-To: <20180411173558.GL13334@localhost.localdomain>



Juri Lelli <juri.lelli@gmail.com> writes:

> Hi,
>
> On 11/04/18 09:26, Francisco Jerez wrote:
>> Francisco Jerez <currojerez@riseup.net> writes:
>> 
>> > Hi Srinivas,
>> >
>> > Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> writes:
>> >
>> >> On Tue, 2018-04-10 at 15:28 -0700, Francisco Jerez wrote:
>> >>> Francisco Jerez <currojerez@riseup.net> writes:
>> >>> 
>> >> [...]
>> >>
>> >>
>> >>> In case anyone is wondering what's going on: Srinivas pointed me
>> >>> off-list at a larger idle power usage increase, ultimately caused
>> >>> by the low-latency heuristic discussed in the paragraph above.  I
>> >>> have a v2 of PATCH 6 that gives the controller a third response
>> >>> curve, roughly intermediate between the low-latency and low-power
>> >>> states of this revision, which avoids the energy usage increase
>> >>> expected for v1 while C0 residency is low (e.g. during idle).  The
>> >>> low-latency behavior of this revision will still be available
>> >>> based on a heuristic (in particular when a realtime-priority task
>> >>> is scheduled).  We're carrying out some additional testing; I'll
>> >>> post the code here eventually.
>> >>
>> >> Please try the schedutil governor as well.  There is a
>> >> frequency-invariance patch which I can send you (it will eventually
>> >> be pushed by Peter).  We want to avoid adding complexity to
>> >> intel_pstate for non-HWP power-sensitive platforms as far as
>> >> possible.
>> >>
>> >
>> > Unfortunately the schedutil governor (whether frequency-invariant or
>> > not) has the exact same energy efficiency issues as the present
>> > intel_pstate non-HWP governor.  Its response is severely underdamped,
>> > leading to energy-inefficient behavior for any oscillating
>> > non-CPU-bound workload.  To exacerbate that problem, the frequency is
>> > maxed out on frequent IO waiting just like the current intel_pstate
>> > cpu-load
>> 
>> "Just like" here is possibly somewhat unfair to the schedutil governor:
>> admittedly its progressive IOWAIT boosting behavior seems somewhat less
>> wasteful than the intel_pstate non-HWP governor's IOWAIT boosting
>> behavior, but it's still largely unhelpful under IO-bound conditions.
>
> Sorry if I jump in out of the blue, but what you are trying to solve
> looks very similar to what IPA [1] is targeting as well. I might be
> wrong (I'll try to spend more time reviewing your set), but my first
> impression is that we should try to solve similar problems with a more
> general approach that could benefit different sys/archs.
>

Thanks, this seems interesting; I've also been taking a look at your
whitepaper and source code.  The problems we've both been trying to
solve are indeed closely related, so there may be an opportunity to
share efforts both ways.

Correct me if I've misunderstood the details of your power allocation
code, but IPA seems to divide the available power budget among the
different actors in proportion to the power they request (up to the
point where some actor reaches its maximum power) and to their
configured weights.  From my reading of the get_requested_power
implementations for cpufreq and devfreq, the requested power attempts to
approximate the current power usage of each device (whether estimated
from the current frequency and a capacitance model, obtained from the
get_real_power callback, or derived by some other mechanism).  That
estimate can be far from the optimal power consumption whenever the
device's governor is programming a frequency that deviates wildly from
the optimal one, as is the case with the current intel_pstate governor
for any IO-bound workload, which incidentally will suffer the greatest
penalty from a suboptimal power allocation in cases where the IO device
is actually an integrated GPU.
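
For reference, here's roughly my mental model of that allocation step.
This is only a simplified sketch based on my reading of the
power_allocator code, with made-up structure and field names rather
than the actual implementation, and it leaves out the control loop that
derives the budget from the temperature signal:

#include <stdint.h>

/* Hypothetical actor state; the field names are mine, not IPA's. */
struct actor {
	uint32_t req_power; /* from get_requested_power(), in mW */
	uint32_t max_power; /* maximum power the actor can draw, in mW */
	uint32_t weight;    /* configured weight */
	uint32_t granted;   /* resulting allocation, in mW */
};

/*
 * Divide 'budget' mW among the actors in proportion to their weighted
 * requests, clamping each grant to the actor's maximum.  (The real
 * governor also redistributes any surplus left over after clamping,
 * which is omitted here.)
 */
static void divvy_up_power(struct actor *actors, int n, uint32_t budget)
{
	uint64_t total_weighted_req = 0;
	int i;

	for (i = 0; i < n; i++)
		total_weighted_req += (uint64_t)actors[i].req_power *
			actors[i].weight;

	for (i = 0; i < n; i++) {
		uint64_t share = (uint64_t)actors[i].req_power *
			actors[i].weight;
		uint64_t grant = total_weighted_req ?
			share * budget / total_weighted_req : 0;

		actors[i].granted = grant < actors[i].max_power ?
			(uint32_t)grant : actors[i].max_power;
	}
}

The point being that the grant is driven entirely by what each actor
requests, so the quality of the allocation is bounded by the quality of
those requests.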

Is there any mechanism in place to keep the system from stabilizing at
a power allocation that prevents it from achieving maximum throughput?
Consider, e.g., a TDP-limited system with two devices consuming a total
power of Pmax = P0(f0) + P1(f1), with f0 much greater than the optimal
frequency, f1 capped below the optimal frequency due to TDP or thermal
constraints, and the system bottlenecking on the second device.  In
such a scenario, wouldn't IPA distribute power in a way that roughly
approximates the pre-existing suboptimal distribution?
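
To put some made-up numbers on that: with a 15 W budget, if the first
device's governor overshoots and requests 10 W while the bottlenecked
second device requests 5 W, a proportional split grants them roughly
10 W and 5 W respectively, even though overall throughput would likely
be higher at something like 6 W and 9 W.  As far as I can see the
allocator has no signal telling it that the first device's request is
inflated.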

If that's the case, am I right that it's the responsibility of each
device's (or CPU's) frequency governor to request a reasonably
energy-efficient frequency in the first place for the balancer to
function correctly?  (That's precisely the goal of this series.)  Doing
so additionally allows the system to use less power to get the same
work done in cases where the system is not thermally or TDP-limited as
a whole, in which case the balancing logic wouldn't have any effect at
all.
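
To illustrate the kind of damped response I have in mind (a schematic
sketch with made-up names only, not the actual controller from PATCH
6):

#include <stdint.h>

/*
 * First-order low-pass filter on the frequency target.  A low 'gain'
 * (out of 256) keeps the target from chasing the short idle/busy
 * oscillations of an IO-bound workload, while a sustained change in
 * the raw target still pulls the filtered target toward it over time.
 */
static uint32_t lp_filter_target(uint32_t prev_khz, uint32_t raw_khz,
				 uint32_t gain /* 0..256 */)
{
	int64_t delta = (int64_t)raw_khz - (int64_t)prev_khz;

	/* new = prev + gain/256 * (raw - prev), in integer arithmetic */
	return (uint32_t)((int64_t)prev_khz + delta * (int64_t)gain / 256);
}

The gain would be chosen based on whether low latency is actually
required, which is roughly what the variably low-pass filtering
controller in this series is about.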

> I'm Cc-ing some Arm folks...
>
> Best,
>
> - Juri
>
> [1] https://developer.arm.com/open-source/intelligent-power-allocation

