linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Doug Smythies" <dsmythies@telus.net>
To: "'Rafael J. Wysocki'" <rjw@rjwysocki.net>
Cc: "'Srinivas Pandruvada'" <srinivas.pandruvada@linux.intel.com>,
	"'LKML'" <linux-kernel@vger.kernel.org>,
	"'Linux PM'" <linux-pm@vger.kernel.org>,
	"Doug Smythies" <dsmythies@telus.net>
Subject: RE: [PATCH] cpufreq: intel_pstate: Rework iowait boosting to be less aggressive
Date: Mon, 11 Feb 2019 12:14:17 -0800	[thread overview]
Message-ID: <002701d4c246$5fdd60b0$1f982210$@net> (raw)
In-Reply-To: qzTngvRgpCLmLqzTsgj72v

On 2019.02.05 04:04 Rafael J. Wysocki wrote:
> On Friday, February 1, 2019 5:54:37 PM CET Doug Smythies wrote:
>> On 2019.01.30 16:05 Rafael J. Wysocki wrote:
>> 
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> The current iowait boosting mechanism in intel_pstate_update_util()
>>> is quite aggressive, as it goes to the maximum P-state right away,
>>> and may cause excessive amounts of energy to be used, which is not
>>> desirable and arguably isn't necessary too.
>>>
>>> Follow commit a5a0809bc58e ("cpufreq: schedutil: Make iowait boost
>>> more energy efficient") that reworked the analogous iowait boost
>>> mechanism in the schedutil governor and make the iowait boosting
>>> in intel_pstate_update_util() work along the same lines.
>>> 
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> ---
>>>  drivers/cpufreq/intel_pstate.c |   46 +++++++++++++++++++++++++----------------
>>>  1 file changed, 29 insertions(+), 17 deletions(-)
>>>
>>> Index: linux-pm/drivers/cpufreq/intel_pstate.c
>>> ===================================================================
>>> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
>>> +++ linux-pm/drivers/cpufreq/intel_pstate.c
>>> @@ -50,6 +50,8 @@
>>>  #define int_tofp(X) ((int64_t)(X) << FRAC_BITS)
>>>  #define fp_toint(X) ((X) >> FRAC_BITS)
>>> 
>>> +#define ONE_EIGHTH_FP ((int64_t)1 << (FRAC_BITS - 3))
>>> +
>>>  #define EXT_BITS 6
>>>  #define EXT_FRAC_BITS (EXT_BITS + FRAC_BITS)
>>>  #define fp_ext_toint(X) ((X) >> EXT_FRAC_BITS)
>>> @@ -1678,17 +1680,14 @@ static inline int32_t get_avg_pstate(str
>>>  static inline int32_t get_target_pstate(struct cpudata *cpu)
>>>  {
>>> 	struct sample *sample = &cpu->sample;
>>> -	int32_t busy_frac, boost;
>>> +	int32_t busy_frac;
>>> 	int target, avg_pstate;
>>> 
>>> 	busy_frac = div_fp(sample->mperf << cpu->aperf_mperf_shift,
>>> 			   sample->tsc);
>>> 
>>> -	boost = cpu->iowait_boost;
>>> -	cpu->iowait_boost >>= 1;
>>> -
>>> -	if (busy_frac < boost)
>>> -		busy_frac = boost;
>>> +	if (busy_frac < cpu->iowait_boost)
>>> +		busy_frac = cpu->iowait_boost;
>>> 
>>> 	sample->busy_scaled = busy_frac * 100;
>>> 
>>> @@ -1767,22 +1766,35 @@ static void intel_pstate_update_util(str
>>> 	if (smp_processor_id() != cpu->cpu)
>>>  		return;
>>>
>>> +	delta_ns = time - cpu->last_update;
>>> 	if (flags & SCHED_CPUFREQ_IOWAIT) {
>>> -		cpu->iowait_boost = int_tofp(1);
>>> -		cpu->last_update = time;
>>> -		/*
>>> -		 * The last time the busy was 100% so P-state was max anyway
>>> -		 * so avoid overhead of computation.
>>> -		 */
>>> -		if (fp_toint(cpu->sample.busy_scaled) == 100)
>>> -			return;
>>> -
>>> -		goto set_pstate;
>>> +		/* Start over if the CPU may have been idle. */
>>> +		if (delta_ns > TICK_NSEC) {
>>> +			cpu->iowait_boost = ONE_EIGHTH_FP;
>>> +		} else if (cpu->iowait_boost) {
>>> +			cpu->iowait_boost <<= 1;
>>> +			if (cpu->iowait_boost >= int_tofp(1)) {
>>> +				cpu->iowait_boost = int_tofp(1);
>>> +				cpu->last_update = time;
>>> +				/*
>>> +				 * The last time the busy was 100% so P-state
>>> +				 * was max anyway, so avoid the overhead of
>>> +				 * computation.
>>> +				 */
>>> +				if (fp_toint(cpu->sample.busy_scaled) == 100)
>>> +					return;
>> 
>> Hi Rafael,
>> 
>> By exiting here, the trace, if enabled, is also bypassed.
>> We want the trace sample.
>
> Fair enough, but the return is there regardless of this patch.
>
> Maybe it should be fixed separately?

O.K.

>> Also, there is a generic:
>> "If the target ptstate is the same as before, then don't set it"
>> later on.
>> Suggest to delete this test and exit condition. (I see that this early
>> exit was done before also.)
>
> Well, exactly.
>
> It is not unreasonable to boost the frequency right away for an IO-waiter
> without waiting for the next sample time IMO.

I agree, but am just saying that it should include a trace sample, otherwise
it is difficult to understand what happened.

By the way, I forgot to mention before, I tried the patch and it does prevent
CPU frequency spikes to maximum every few seconds in a very idle system.

... Doug



  parent reply	other threads:[~2019-02-11 20:14 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-01 16:54 [PATCH] cpufreq: intel_pstate: Rework iowait boosting to be less aggressive Doug Smythies
2019-02-05 12:04 ` Rafael J. Wysocki
2019-02-11 20:14 ` Doug Smythies [this message]
  -- strict thread matches above, loose matches on Subject: below --
2019-01-31  0:04 Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='002701d4c246$5fdd60b0$1f982210$@net' \
    --to=dsmythies@telus.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=rjw@rjwysocki.net \
    --cc=srinivas.pandruvada@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).