From: Francisco Jerez <currojerez@riseup.net>
To: "Pandruvada\, Srinivas" <srinivas.pandruvada@intel.com>, "Brown\,
Len" <len.brown@intel.com>,
"linux-pm\@vger.kernel.org" <linux-pm@vger.kernel.org>,
"intel-gfx\@lists.freedesktop.org"
<intel-gfx@lists.freedesktop.org>
Cc: "peterz@infradead.org" <peterz@infradead.org>,
"rjw@rjwysocki.net" <rjw@rjwysocki.net>
Subject: Re: [Intel-gfx] [RFC] GPU-bound energy efficiency improvements for the intel_pstate driver (v2).
Date: Mon, 23 Mar 2020 17:23:51 -0700 [thread overview]
Message-ID: <87blom4n3c.fsf@riseup.net> (raw)
In-Reply-To: <5a7aa1cef880ee5ac3ffe2055745c26f8d124b68.camel@intel.com>
"Pandruvada, Srinivas" <srinivas.pandruvada@intel.com> writes:
> Hi Francisco,
>
> On Tue, 2020-03-10 at 14:41 -0700, Francisco Jerez wrote:
>> This is my second take on improving the energy efficiency of the
>> intel_pstate driver under IO-bound conditions. The problem and the
>> approach to solving it are, at a high level, roughly the same as in
>> my previous series [1]:
>>
>> In IO-bound scenarios the throughput of the system, by definition,
>> doesn't improve with increasing CPU frequency beyond the threshold
>> at which the IO device becomes the bottleneck. However, with the
>> current governors (whether HWP is in use or not), the CPU frequency
>> tends to oscillate with the load, often with an amplitude far into
>> the turbo range, leading to severely reduced energy efficiency.
>> This is particularly problematic when a limited TDP budget is
>> shared among a number of cores running some multithreaded workload,
>> or between a CPU core and an integrated GPU.
>>
>> Improving the energy efficiency of the CPU improves the throughput
>> of the system in such TDP-limited conditions. See [4] for some
>> preliminary benchmark results from a Razer Blade Stealth 13 Late
>> 2019/LY320 laptop with an Intel ICL processor and integrated
>> graphics, including throughput improvements of up to ~15% and
>> performance-per-watt improvements of up to ~43% (estimated via
>> RAPL). The throughput results in particular may vary substantially
>> from one platform to another depending on the TDP budget and the
>> balance of load between CPU and GPU.
>>
>
> You changed the EPP to 0, intentionally or unintentionally. We know
> that all energy optimizations are disabled with this change.
> This test was done on an ICL system.
>
Hmm, that's bad, and entirely unintentional. It's probably a side
effect of intel_pstate_reset_vlp() running before
intel_pstate_hwp_set(), which could cause it to use an uninitialized
(zero?) value of hwp_req_cached. I'll fix it in v3. Thanks a lot for
pointing this out.
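The ordering hazard can be illustrated with a small Python model of
the request-composition step (the helper name below is hypothetical,
not the actual driver code; the field layout follows IA32_HWP_REQUEST,
with the EPP in bits 31:24):

```python
EPP_SHIFT = 24  # IA32_HWP_REQUEST: bits 31:24 hold the EPP

def update_min_perf(hwp_req_cached, min_perf):
    """Hypothetical helper: rewrite the min-perf field (bits 7:0) of a
    cached HWP request, carrying every other field over unchanged."""
    return (hwp_req_cached & ~0xff) | (min_perf & 0xff)

boot_req = 0x80002704  # boot-time request with EPP = 0x80
ok = update_min_perf(boot_req, 0x04)  # EPP carried over correctly

uninitialized = 0x0  # hwp_req_cached before intel_pstate_hwp_set() runs
bad = update_min_perf(uninitialized, 0x04)  # EPP silently becomes 0

assert (ok >> EPP_SHIFT) & 0xff == 0x80
assert (bad >> EPP_SHIFT) & 0xff == 0x00
```

Composing a request from the cached value is only safe once the cache
has been populated from the real MSR, hence the ordering fix for v3.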
>
> Basically without your patches on top of linux-next: EPP = 0x80
> $sudo rdmsr -a 0x774
> 80002704
> 80002704
> 80002704
> 80002704
> 80002704
> 80002704
> 80002704
> 80002704
>
>
> After your patches
>
> $sudo rdmsr -a 0x774
> 2704
> 2704
> 2704
> 2704
> 2704
> 2704
> 2704
> 2704
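For reference, the rdmsr values above can be decoded using the
IA32_HWP_REQUEST field layout from the Intel SDM (bits 7:0 minimum
performance, 15:8 maximum performance, 23:16 desired performance,
31:24 EPP); a small illustrative decoder:

```python
def decode_hwp_request(val):
    """Split an IA32_HWP_REQUEST (MSR 0x774) value into its main fields."""
    return {
        "min_perf":     val & 0xff,          # bits 7:0
        "max_perf":     (val >> 8) & 0xff,   # bits 15:8
        "desired_perf": (val >> 16) & 0xff,  # bits 23:16
        "epp":          (val >> 24) & 0xff,  # bits 31:24
    }

before = decode_hwp_request(0x80002704)  # without the patches
after  = decode_hwp_request(0x2704)      # with the patches

print(before)  # {'min_perf': 4, 'max_perf': 39, 'desired_perf': 0, 'epp': 128}
print(after)   # {'min_perf': 4, 'max_perf': 39, 'desired_perf': 0, 'epp': 0}
```

Only the EPP field differs: 0x80 (the balanced default) before, 0
(maximum-performance bias) after, which is exactly the unintended
change discussed above.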
>
> I added some prints; basically you change the EPP at startup, before
> the regular HWP request update path, and the update is applied on
> top. So the boot-up EPP is overwritten.
>
>
> [ 5.867476] intel_pstate_reset_vlp hwp_req cached:0
> [ 5.872426] intel_pstate_reset_vlp hwp_req:404
> [ 5.881645] intel_pstate_reset_vlp hwp_req cached:0
> [ 5.886634] intel_pstate_reset_vlp hwp_req:404
> [ 5.895819] intel_pstate_reset_vlp hwp_req cached:0
> [ 5.900958] intel_pstate_reset_vlp hwp_req:404
> [ 5.910321] intel_pstate_reset_vlp hwp_req cached:0
> [ 5.915406] intel_pstate_reset_vlp hwp_req:404
> [ 5.924623] intel_pstate_reset_vlp hwp_req cached:0
> [ 5.929564] intel_pstate_reset_vlp hwp_req:404
> [ 5.944039] intel_pstate_reset_vlp hwp_req cached:0
> [ 5.951672] intel_pstate_reset_vlp hwp_req:404
> [ 5.966157] intel_pstate_reset_vlp hwp_req cached:0
> [ 5.973808] intel_pstate_reset_vlp hwp_req:404
> [ 5.988223] intel_pstate_reset_vlp hwp_req cached:0
> [ 5.995823] intel_pstate_reset_vlp hwp_req:404
> [ 6.010062] intel_pstate: HWP enabled
>
> Thanks,
> Srinivas
>
>
>
>> One of the main differences relative to my previous version is that
>> the trade-off between energy efficiency and frequency ramp-up
>> latency is now exposed to device drivers through a new PM QoS class
>> [it would make sense to expose it to userspace too eventually, but
>> that's beyond the scope of this series]. The new PM QoS class
>> provides a latency target to CPUFREQ governors, which gives them
>> permission to filter out CPU frequency oscillations with a period
>> significantly shorter than the specified target, whenever doing so
>> leads to improved energy efficiency.
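As a rough sketch of what such filtering could look like -- a
first-order low-pass with a time constant tied to the latency target,
purely illustrative and not the actual VLP controller algorithm:

```python
import itertools

def filtered_freq(samples, dt, latency_target):
    """First-order IIR low-pass: oscillations with a period much
    shorter than latency_target are smoothed out of the request."""
    alpha = dt / (dt + latency_target)
    y = samples[0]
    out = []
    for f in samples:
        y += alpha * (f - y)
        out.append(y)
    return out

# A load whose frequency demand flips every 2 ms between 1 and 4 GHz,
# sampled at 1 ms, filtered against a 20 ms ramp-up latency target:
square = list(itertools.islice(itertools.cycle([1.0, 1.0, 4.0, 4.0]), 40))
smooth = filtered_freq(square, dt=1e-3, latency_target=20e-3)

# The filtered request hovers near the 2.5 GHz mean instead of
# chasing the 4 GHz peaks, which is where the energy savings come from.
```

A larger latency target widens the band of oscillations that may be
averaged away; a small one forces the governor to track the load
closely, recovering the current ramp-up behavior.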
>>
>> This series takes advantage of the new PM QoS class from the i915
>> driver whenever the driver determines that the GPU has become a
>> bottleneck for an extended period of time. At that point it places
>> a PM QoS ramp-up latency target, which causes CPUFREQ to limit the
>> CPU to a reasonably energy-efficient frequency that can still
>> achieve the required amount of work in a time window approximately
>> equal to the ramp-up latency target (since any longer-term energy
>> efficiency optimization would potentially violate the latency
>> target). This seems more effective than clamping the CPU frequency
>> to a fixed value directly from various subsystems: since the CPU is
>> a shared resource, the frequency bound needs to consider the load
>> and latency requirements of all independent workloads running on
>> the same CPU core in order to avoid performance degradation in a
>> multitasking, possibly virtualized environment.
>>
>> The main limitation of this PM QoS approach is that whenever
>> multiple clients request different ramp-up latency targets, only
>> the strictest (lowest-latency) one applies system-wide, potentially
>> leading to suboptimal energy efficiency for the less
>> latency-sensitive clients (though it won't artificially limit the
>> CPU throughput of the most latency-sensitive clients as a result of
>> the PM QoS requests placed by less latency-sensitive ones). In
>> order to address this limitation I'm working on a more complicated
>> solution which integrates with the task scheduler in order to
>> provide response-latency control with process granularity (pretty
>> much in the spirit of PELT). One of the alternatives Rafael and I
>> were discussing was to expose that through a third cgroup clamp on
>> top of the MIN and MAX utilization clamps, but I'm open to any
>> other possibilities regarding what the interface should look like.
>> Either way, the current (scheduling-unaware) PM QoS-based interface
>> should provide most of the benefit except in heavily multitasking
>> environments.
>>
>> A branch with this series in testable form can be found here [2],
>> based on linux-next from a few days ago. Another important
>> difference with respect to my previous revision is that the present
>> one targets HWP systems (for the moment it's only enabled by
>> default on ICL, though that can be overridden through the kernel
>> command line). I have WIP code that uses the same governor in
>> order to provide a similar benefit on non-HWP systems (like my
>> previous revision), which can be found in this branch for reference
>> [3] -- I'm planning to finish that up and send it as a follow-up to
>> this series, assuming people are happy with the overall approach.
>>
>> Thanks in advance for any review feedback and test reports.
>>
>> [PATCH 01/10] PM: QoS: Add CPU_RESPONSE_FREQUENCY global PM QoS limit.
>> [PATCH 02/10] drm/i915: Adjust PM QoS response frequency based on GPU load.
>> [PATCH 03/10] OPTIONAL: drm/i915: Expose PM QoS control parameters via debugfs.
>> [PATCH 04/10] Revert "cpufreq: intel_pstate: Drop ->update_util from pstate_funcs"
>> [PATCH 05/10] cpufreq: intel_pstate: Implement VLP controller statistics and status calculation.
>> [PATCH 06/10] cpufreq: intel_pstate: Implement VLP controller target P-state range estimation.
>> [PATCH 07/10] cpufreq: intel_pstate: Implement VLP controller for HWP parts.
>> [PATCH 08/10] cpufreq: intel_pstate: Enable VLP controller based on ACPI FADT profile and CPUID.
>> [PATCH 09/10] OPTIONAL: cpufreq: intel_pstate: Add tracing of VLP controller status.
>> [PATCH 10/10] OPTIONAL: cpufreq: intel_pstate: Expose VLP controller parameters via debugfs.
>>
>> [1] https://marc.info/?l=linux-pm&m=152221943320908&w=2
>> [2]
>> https://github.com/curro/linux/commits/intel_pstate-vlp-v2-hwp-only
>> [3] https://github.com/curro/linux/commits/intel_pstate-vlp-v2
>> [4]
>> http://people.freedesktop.org/~currojerez/intel_pstate-vlp-v2/benchmark-comparison-ICL.log
>>