linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Doug Smythies" <dsmythies@telus.net>
To: "'Rafael J. Wysocki'" <rjw@rjwysocki.net>,
	"'Linux PM'" <linux-pm@vger.kernel.org>
Cc: "'LKML'" <linux-kernel@vger.kernel.org>,
	"'Len Brown'" <len.brown@intel.com>,
	"'Srinivas Pandruvada'" <srinivas.pandruvada@linux.intel.com>,
	"'Peter Zijlstra'" <peterz@infradead.org>,
	"'Giovanni Gherdovich'" <ggherdovich@suse.cz>,
	"'Francisco Jerez'" <francisco.jerez.plata@intel.com>
Subject: RE: [RFC/RFT][PATCH] cpufreq: intel_pstate: Work in passive mode with HWP enabled
Date: Sun, 14 Jun 2020 23:18:16 -0700	[thread overview]
Message-ID: <002801d642dc$c5225c10$4f671430$@net> (raw)
In-Reply-To: <3169564.ZRsPWhXyMD@kreacher>

On 2020.05.21 10:16 Rafael J. Wysocki wrote:
> 
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Allow intel_pstate to work in the passive mode with HWP enabled and
> make it translate the target frequency supplied by the cpufreq
> governor in use into an EPP value to be written to the HWP request
> MSR (high frequencies are mapped to low EPP values that mean more
> performance-oriented HWP operation) as a hint for the HWP algorithm
> in the processor, so as to prevent it and the CPU scheduler from
> working against each other at least when the schedutil governor is
> in use.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> This is a prototype not intended for production use (based on linux-next).
> 
> Please test it if you can (on HWP systems, of course) and let me know the
> results.
> 
> The INTEL_CPUFREQ_TRANSITION_DELAY_HWP value has been guessed and it very well
> may turn out to be either too high or too low for the general use, which is one
> reason why getting as much testing coverage as possible is key here.
> 
> If you can play with different INTEL_CPUFREQ_TRANSITION_DELAY_HWP values,
> please do so and let me know the conclusions.
> 
> Cheers,
> Rafael

To anyone trying this patch:

You will need to monitor EPP (Energy Performance Preference) carefully.
It changes as a function of passive/active, and if you booted active
or passive or no-hwp and changed later.

Originally, I was not specifically monitoring EPP, or paths taken since boot
towards driver, intel_pstate or intel_cpufreq, and governor, and will now have
to set aside test results.

@Rafael: I am still having problems with my test computer and HWP. However, I can
observe the energy saving potential of this "passive-yet-active HWP mode".

At this point, I am actually trying to make my newer test computer
simply behave and do what it is told with respect to CPU frequency scaling,
because even acpi-cpufreq misbehaves for performance governor under
some conditions [1].

[1] https://marc.info/?l=linux-pm&m=159155067328641&w=2

To my way of thinking:

1.) it is imperative that we be able to decouple the governor servo
from the processor servo. At a minimum this is needed for system testing,
debugging and reference baselines. At a maximum users could, perhaps, decide
for themselves. Myself, I would prefer "passive" to mean "do what you
have been told", and that is now what I am testing.

2.) I have always thought, indeed relied on, performance mode as being
more than a hint. For my older i7-2600K it never disobeyed orders, except
for the most minuscule of workloads.
This newer i5-9600K seems to have a mind of its own which I would like
to be able to turn off, yet still be able to use intel_pstate trace
with schedutil.

Recall last week I said

> moving forward the typical CPU frequency scaling
> configuration for my test system will be:
>
> driver: intel-cpufreq, forced at boot.
> governor: schedutil
> hwp: forced off at boot.

The problem is that baseline references are still needed
and performance mode is unreliable. Maybe other stuff also,
I simply don't know at this point.

Example of EPP changing (no need to read on) (from fresh boot):

Current EPP:

root@s18:/home/doug# rdmsr --bitfield 31:24 -u -a 0x774
128
128
128
128
128
128
root@s18:/home/doug# grep . /sys/devices/system/cpu/cpu3/cpufreq/*
/sys/devices/system/cpu/cpu3/cpufreq/affected_cpus:3
/sys/devices/system/cpu/cpu3/cpufreq/base_frequency:3700000
/sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_max_freq:4600000
/sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_min_freq:800000
/sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_transition_latency:0
/sys/devices/system/cpu/cpu3/cpufreq/energy_performance_available_preferences:default performance balance_performance balance_power
power
/sys/devices/system/cpu/cpu3/cpufreq/energy_performance_preference:balance_performance
/sys/devices/system/cpu/cpu3/cpufreq/related_cpus:3
/sys/devices/system/cpu/cpu3/cpufreq/scaling_available_governors:performance powersave
/sys/devices/system/cpu/cpu3/cpufreq/scaling_cur_freq:800102
/sys/devices/system/cpu/cpu3/cpufreq/scaling_driver:intel_pstate
/sys/devices/system/cpu/cpu3/cpufreq/scaling_governor:powersave
/sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq:4600000
/sys/devices/system/cpu/cpu3/cpufreq/scaling_min_freq:800000
/sys/devices/system/cpu/cpu3/cpufreq/scaling_setspeed:<unsupported>

Now, switch to passive mode:

echo passive > /sys/devices/system/cpu/intel_pstate/status

And observe EPP:

root@s18:/home/doug# rdmsr --bitfield 31:24 -u -a 0x774
255
255
255
255
255
255
root@s18:/home/doug# grep . /sys/devices/system/cpu/cpu3/cpufreq/*
/sys/devices/system/cpu/cpu3/cpufreq/affected_cpus:3
/sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_max_freq:4600000
/sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_min_freq:800000
/sys/devices/system/cpu/cpu3/cpufreq/cpuinfo_transition_latency:20000
/sys/devices/system/cpu/cpu3/cpufreq/related_cpus:3
/sys/devices/system/cpu/cpu3/cpufreq/scaling_available_governors:conservative ondemand userspace powersave performance schedutil

Hey, where did the ability to adjust the energy_performance_preference setting go?

/sys/devices/system/cpu/cpu3/cpufreq/scaling_cur_freq:3400313
/sys/devices/system/cpu/cpu3/cpufreq/scaling_driver:intel_cpufreq
/sys/devices/system/cpu/cpu3/cpufreq/scaling_governor:performance
/sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq:4600000
/sys/devices/system/cpu/cpu3/cpufreq/scaling_min_freq:800000
/sys/devices/system/cpu/cpu3/cpufreq/scaling_setspeed:<unsupported>

Kernel is 5.7 +plus this patch:

root@s18:/home/doug# uname -a
Linux s18 5.7.0-hwp10 #786 SMP PREEMPT Tue Jun 9 20:15:18 PDT 2020 x86_64 x86_64 x86_64 GNU/Linux

223e5c33f927 (HEAD -> k57-doug-hwp) cpufreq: intel_pstate: Accept passive mode with HWP enabled
5d890a14763d cpufreq: intel_pstate: Use passive mode by default without HWP
3d77e6a8804a (tag: v5.7) Linux 5.7

The below is on top of this patch, and is how I am attempting to move forward:

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 4ab8bc1476c9..6c28ec49b192 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2331,33 +2331,32 @@ static void intel_cpufreq_update_hwp_request(struct cpudata *cpu, u32 min_perf)
        value |= HWP_MIN_PERF(min_perf);

        /*
-        * The entire MSR needs to be updated in order to update the HWP min
-        * field in it, so opportunistically update the max too if needed.
+        * the max also...
         */
        value &= ~HWP_MAX_PERF(~0L);
-       value |= HWP_MAX_PERF(cpu->max_perf_ratio);
+       value |= HWP_MAX_PERF(min_perf);

        if (value != prev)
                wrmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, value);
 }

 /**
- * intel_cpufreq_adjust_hwp - Adjust the HWP reuqest register.
+ * intel_cpufreq_adjust_hwp - Adjust the HWP request register.
  * @cpu: Target CPU.
  * @target_pstate: P-state corresponding to the target frequency.
  *
- * Set the HWP minimum performance limit to 75% of @target_pstate taking the
+ * Set the HWP minimum performance limit to @target_pstate taking the
  * global min and max policy limits into account.
  *
- * The purpose of this is to avoid situations in which the kernel and the HWP
- * algorithm work against each other by giving a hint about the expectations of
- * the former to the latter.
+ * The purpose of this is to force the slave (passive) servo to do what
+ * it has been told, not what ever it wants.
+ * This NOT a hint. EPP (responsiveness) is managed from elsewhere.
  */
 static void intel_cpufreq_adjust_hwp(struct cpudata *cpu, u32 target_pstate)
 {
        u32 min_perf;

-       min_perf = max_t(u32, (3 * target_pstate) / 4, cpu->min_perf_ratio);
+       min_perf = max_t(u32, target_pstate, cpu->min_perf_ratio);
        min_perf = min_t(u32, min_perf, cpu->max_perf_ratio);
        if (min_perf != cpu->pstate.current_pstate) {
                cpu->pstate.current_pstate = min_perf;

... Doug

... [deleted] ...



  parent reply	other threads:[~2020-06-15  6:18 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-21 17:15 [RFC/RFT][PATCH] cpufreq: intel_pstate: Work in passive mode with HWP enabled Rafael J. Wysocki
2020-05-25  1:36 ` Francisco Jerez
2020-05-25 13:23   ` Rafael J. Wysocki
2020-05-25 20:57     ` Francisco Jerez
2020-05-26  8:19       ` Rafael J. Wysocki
2020-05-26 15:51         ` Doug Smythies
2020-05-26 17:42           ` Rafael J. Wysocki
2020-05-26 18:29         ` Francisco Jerez
2020-05-25 15:30 ` Doug Smythies
2020-05-25 21:09   ` Doug Smythies
2020-06-15  6:18 ` Doug Smythies [this message]
2020-08-24 14:54 ` Doug Smythies

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='002801d642dc$c5225c10$4f671430$@net' \
    --to=dsmythies@telus.net \
    --cc=francisco.jerez.plata@intel.com \
    --cc=ggherdovich@suse.cz \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rjw@rjwysocki.net \
    --cc=srinivas.pandruvada@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).