From: Vincent Donnefort <vincent.donnefort@arm.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: peterz@infradead.org, rjw@rjwysocki.net,
	vincent.guittot@linaro.org, qperret@google.com,
	linux-kernel@vger.kernel.org, ionela.voinescu@arm.com,
	lukasz.luba@arm.com, dietmar.eggemann@arm.com
Subject: Re: [PATCH v2 0/3] EM / PM: Inefficient OPPs
Date: Wed, 26 May 2021 10:01:42 +0100
Message-ID: <20210526090141.GA408481@e120877-lin.cambridge.arm.com>
In-Reply-To: <20210526034751.5fl4kekq73gqy2wq@vireshk-i7>

On Wed, May 26, 2021 at 09:17:51AM +0530, Viresh Kumar wrote:
> On 21-05-21, 17:54, Vincent Donnefort wrote:
> > We (Power team in Arm) are working with an experimental kernel for the
> > Google's Pixel4 to evaluate and improve the current mainline performance
> > and energy consumption on a real life device with Android.
> > 
> > The SD855 SoC found in this phone has several OPPs that are inefficient.
> > I.e. despite a lower frequency, they have a greater cost. (That cost being 
> > fmax * OPP power / OPP freq). This issue is twofold. First of course,
> > running a specific workload at an inefficient OPP is counterproductive
> > since it wastes energy. But also, inefficient OPPs make a
> > performance domain less appealing for task placement than it really is.
> > 
> > We evaluated the change presented here by running 30 iterations of Android
> > PCMark "Work 2.0 Performance". While we did not see any statistically
> > significant performance impact, this change drastically improved idle time
> > residency.
> > 
> >                            |   Running   |  WFI [1]  |    Idle   |
> >    ------------------------+-------------+-----------+-----------+
> >    Little cluster (4 CPUs) |    -0.35%   |   +0.35%  |   +0.79%  |
> >    ------------------------+-------------+-----------+-----------+
> >    Medium cluster (3 CPUs) |    -6.3%    |    -18%   |    +12%   |
> >    ------------------------+-------------+-----------+-----------+
> >    Big cluster    (1 CPU)  |    -6.4%    |    -6.5%  |    +2.8%  |
> >    ------------------------+-------------+-----------+-----------+
> > 
> > On the SD855, the inefficient OPPs are found on the little cluster. By
> > removing them from the Energy Model, we make the most efficient CPUs more
> > appealing for task placement, helping to reduce the running time for the
> > medium and big CPUs. Increasing idle time is crucial for this platform due 
> > to the substantial energy cost differences among the clusters. Also,
> > despite not appearing in the statistics (the idle driver used here doesn't 
> > report it), we can speculate that we also improve the cluster idle time.
> 
> First of all, sorry about not replying earlier. I saw this earlier but held
> off, hoping to receive some feedback from Rafael/Peter first :(

No worries at all, thanks for your comments!

> 
> I think the problem you mention is genuine, I have realized it in the past,
> discussed with Vincent Guittot (cc'd) but never was able to get to a proper
> solution as the EM model wasn't there then.
> 
> I have seen your approach (from top level) and I feel maybe we can improve upon
> the whole idea a bit, lemme know what you think. The problem I see with this
> approach is the unnecessary updates to schedutil that this series makes, which
> IMHO is the wrong thing to do. Schedutil isn't the only governor and such
> changes will end up widening the performance delta between ondemand and schedutil
> even further (a difference based on their core design philosophy is fine, but these
> are improvements which each of them should enjoy). And if another governor wants
> these smart decisions to be added there, then it is trouble again.

I originally considered adding the knowledge of inefficient OPPs to the CPUFreq
table, but I gave up on the idea for two reasons:

  * The EM depends on having schedutil enabled, so I don't think any other
    governor would be able to make use of the inefficient OPPs information.
    (Also, I believe Peter had a plan to keep schedutil as the one and only
    governor.)

  * A CPUFreq driver doesn't have to rely on the CPUFreq table. If the
    knowledge about inefficient OPPs lives in that table, some drivers might
    not be able to use the feature (you might say 'their loss' though :))

For those reasons, I thought that adding support for inefficient OPPs to the
CPUFreq table would add a lot of complexity to the patchset for no functional
gain.

> 
> Since the whole thing depends on EM and OPPs, I think we can actually do this.
> 
> When the cpufreq driver registers with the EM core, let's find all the
> inefficient OPPs and disable them once and for all. Of course, this must be done
> on a voluntary basis, a flag from the drivers will do. With this, we won't be
> required to update anything at any of the governors' end.

We still need to keep the inefficient OPPs for thermal reasons. But if we go
with inefficiency support in the CPUFreq table, we could enable or disable
them depending on the thermal pressure. Or add a flag to read the table with or
without inefficient OPPs?
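
To make the flag idea concrete, here is a minimal sketch in plain userspace C.
It is not kernel code and not what the series implements: struct opp,
mark_inefficient() and find_opp() are hypothetical names, not existing EM or
CPUFreq API. It uses the cost from the cover letter (fmax * OPP power / OPP
freq) and assumes an OPP is inefficient when some higher-frequency OPP has a
strictly lower cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one entry of a frequency table. */
struct opp {
	unsigned long freq_khz;	/* OPP frequency */
	unsigned long power_mw;	/* power drawn at that frequency */
	uint64_t cost;		/* fmax * power / freq, set by mark_inefficient() */
	bool inefficient;	/* set by mark_inefficient() */
};

/*
 * Compute the cost of every OPP and flag the inefficient ones.
 * The table must be sorted by ascending frequency.
 * Assumption: an OPP is inefficient when a higher-frequency OPP has a
 * strictly lower cost.
 */
static void mark_inefficient(struct opp *table, size_t n)
{
	uint64_t fmax, min_cost_above = UINT64_MAX;
	size_t i;

	if (n == 0)
		return;
	fmax = table[n - 1].freq_khz;

	/* Walk from the highest OPP down, tracking the lowest cost seen. */
	for (i = n; i-- > 0; ) {
		assert(table[i].freq_khz != 0);
		table[i].cost = fmax * table[i].power_mw / table[i].freq_khz;
		table[i].inefficient = table[i].cost > min_cost_above;
		if (table[i].cost < min_cost_above)
			min_cost_above = table[i].cost;
	}
}

/*
 * Return the lowest OPP whose frequency is >= target_khz.
 * With skip_inefficient set, inefficient OPPs are passed over in favour of
 * the next efficient one. A caller that needs the full table (thermal
 * capping, for instance) would pass false.
 * Falls back to the highest OPP; returns NULL for an empty table.
 */
static const struct opp *find_opp(const struct opp *table, size_t n,
				  unsigned long target_khz,
				  bool skip_inefficient)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i].freq_khz < target_khz)
			continue;
		if (skip_inefficient && table[i].inefficient)
			continue;
		return &table[i];
	}
	return n ? &table[n - 1] : NULL;
}
```

With a made-up table of (300 MHz, 90 mW), (600 MHz, 250 mW), (900 MHz, 300 mW)
and (1200 MHz, 480 mW), only the 600 MHz entry ends up flagged. A request for
500 MHz then resolves to 900 MHz when inefficient OPPs are skipped, and to
600 MHz when they are not. The highest OPP can never be flagged, so the
fallback is always valid.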


> 
> Will that work ?
> 
> -- 
> viresh


