linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier
@ 2018-09-25  9:38 Daniel Lezcano
  2018-10-03  8:58 ` Rafael J. Wysocki
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Lezcano @ 2018-09-25  9:38 UTC (permalink / raw)
  To: rafael
  Cc: linux-pm, Todd Kjos, Joel Fernandes, Colin Cross,
	Rafael J. Wysocki, Peter Zijlstra (Intel),
	Ramesh Thomas, Alex Shi, open list

The function get_loadavg() returns almost always zero. To be more
precise, statistically speaking for a total of 1023379 times passing
to the function, the load is equal to zero 1020728 times, greater than
100, 610 times, the remaining is between 0 and 5.

I'm putting in question this metric. Is it worth to keep it?

Cc: Todd Kjos <tkjos@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Colin Cross <ccross@android.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/cpuidle/governors/menu.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index e26a409..d939b8e 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -173,18 +173,10 @@ static inline int which_bucket(unsigned int duration, unsigned long nr_iowaiters
  * to be, the higher this multiplier, and thus the higher
  * the barrier to go to an expensive C state.
  */
-static inline int performance_multiplier(unsigned long nr_iowaiters, unsigned long load)
+static inline int performance_multiplier(unsigned long nr_iowaiters)
 {
-	int mult = 1;
-
-	/* for higher loadavg, we are more reluctant */
-
-	mult += 2 * get_loadavg(load);
-
 	/* for IO wait tasks (per cpu!) we add 5x each */
-	mult += 10 * nr_iowaiters;
-
-	return mult;
+	return 1 + 10 * nr_iowaiters;
 }
 
 static DEFINE_PER_CPU(struct menu_device, menu_devices);
@@ -359,7 +351,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		 * Use the performance multiplier and the user-configurable
 		 * latency_req to determine the maximum exit latency.
 		 */
-		interactivity_req = data->predicted_us / performance_multiplier(nr_iowaiters, cpu_load);
+		interactivity_req = data->predicted_us /
+			performance_multiplier(nr_iowaiters);
 		if (latency_req > interactivity_req)
 			latency_req = interactivity_req;
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier
  2018-09-25  9:38 [PATCH RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier Daniel Lezcano
@ 2018-10-03  8:58 ` Rafael J. Wysocki
  2018-10-03 10:25   ` Daniel Lezcano
  0 siblings, 1 reply; 5+ messages in thread
From: Rafael J. Wysocki @ 2018-10-03  8:58 UTC (permalink / raw)
  To: Daniel Lezcano, Peter Zijlstra (Intel)
  Cc: linux-pm, Todd Kjos, Joel Fernandes, Colin Cross, Ramesh Thomas,
	Alex Shi, LKML

On Tuesday, September 25, 2018 11:38:44 AM CEST Daniel Lezcano wrote:
> The function get_loadavg() returns almost always zero. To be more
> precise, statistically speaking for a total of 1023379 times passing
> to the function, the load is equal to zero 1020728 times, greater than
> 100, 610 times, the remaining is between 0 and 5.
> 
> I'm putting in question this metric. Is it worth to keep it?

The honest answer is that I'm not sure.

Using the CPU load for driving idle state selection is questionable in general
(there need not be any connection between the two in principle) and this
particular function is questionable either.  If it really makes no difference
in addition to that, then yes it would be good to get rid of it.

That said it looks like that may be quite workload-dependent, so maybe there
are workloads where it does make a difference?  Like networking or similar?

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier
  2018-10-03  8:58 ` Rafael J. Wysocki
@ 2018-10-03 10:25   ` Daniel Lezcano
  2018-10-03 11:09     ` Rafael J. Wysocki
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Lezcano @ 2018-10-03 10:25 UTC (permalink / raw)
  To: Rafael J. Wysocki, Peter Zijlstra (Intel)
  Cc: linux-pm, Todd Kjos, Joel Fernandes, Colin Cross, Ramesh Thomas,
	Alex Shi, LKML

On 03/10/2018 10:58, Rafael J. Wysocki wrote:
> On Tuesday, September 25, 2018 11:38:44 AM CEST Daniel Lezcano wrote:
>> The function get_loadavg() returns almost always zero. To be more
>> precise, statistically speaking for a total of 1023379 times passing
>> to the function, the load is equal to zero 1020728 times, greater than
>> 100, 610 times, the remaining is between 0 and 5.
>>
>> I'm putting in question this metric. Is it worth to keep it?
> 
> The honest answer is that I'm not sure.
> 
> Using the CPU load for driving idle state selection is questionable in general
> (there need not be any connection between the two in principle) and this
> particular function is questionable either.  If it really makes no difference
> in addition to that, then yes it would be good to get rid of it.
> 
> That said it looks like that may be quite workload-dependent, so maybe there
> are workloads where it does make a difference?  Like networking or similar?

Perhaps initially the load was not the same:


commit 69d25870f20c4b2563304f2b79c5300dd60a067e
Author: Arjan van de Ven <arjan@infradead.org>
Date:   Mon Sep 21 17:04:08 2009 -0700

    cpuidle: fix the menu governor to boost IO performance


The function get_loadavg() was:

static int get_loadavg(void)
{
       unsigned long this = this_cpu_load();

       return LOAD_INT(this) * 10 + LOAD_FRAC(this) / 10;
}

With:

unsigned long this_cpu_load(void)
{
        struct rq *this = this_rq();
        return this->cpu_load[0];
}


Now we pass the load we got from get_iowait_load() which is:

void get_iowait_load(unsigned long *nr_waiters, unsigned long *load)
{
        struct rq *rq = this_rq();
        *nr_waiters = atomic_read(&rq->nr_iowait);
        *load = rq->load.weight;
}

Colin Cross did some measurements a few years ago and concluded it is
zero most of the time, otherwise the number is so high it has no
meaning. The get_loadavg() call is removed from Android since 2011 [1].

I can hack my server (bi Xeon - 6 cores), add statistics regarding this
metric and let it run during several days, sure a lot of different
workloads will happen on it (compilation, network, throughput computing,
...) in a real use case manner. That is what I did with an ARM64
evaluation board.

However I won't be able to do that right now and that will take a bit of
time. If you think it is really useful, I can plan to do that in the
next weeks.

 -- Daniel

[1]
https://android.googlesource.com/kernel/common/+/4dedd9f124703207895777ac6e91dacde0f7cc17



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier
  2018-10-03 10:25   ` Daniel Lezcano
@ 2018-10-03 11:09     ` Rafael J. Wysocki
  2018-10-03 13:05       ` Daniel Lezcano
  0 siblings, 1 reply; 5+ messages in thread
From: Rafael J. Wysocki @ 2018-10-03 11:09 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J. Wysocki, Peter Zijlstra, Linux PM, Todd Kjos,
	Joel Fernandes, Colin Cross, Ramesh Thomas, Alex Shi,
	Linux Kernel Mailing List, Mel Gorman

On Wed, Oct 3, 2018 at 12:25 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 03/10/2018 10:58, Rafael J. Wysocki wrote:
> > On Tuesday, September 25, 2018 11:38:44 AM CEST Daniel Lezcano wrote:
> >> The function get_loadavg() returns almost always zero. To be more
> >> precise, statistically speaking for a total of 1023379 times passing
> >> to the function, the load is equal to zero 1020728 times, greater than
> >> 100, 610 times, the remaining is between 0 and 5.
> >>
> >> I'm putting in question this metric. Is it worth to keep it?
> >
> > The honest answer is that I'm not sure.
> >
> > Using the CPU load for driving idle state selection is questionable in general
> > (there need not be any connection between the two in principle) and this
> > particular function is questionable either.  If it really makes no difference
> > in addition to that, then yes it would be good to get rid of it.
> >
> > That said it looks like that may be quite workload-dependent, so maybe there
> > are workloads where it does make a difference?  Like networking or similar?
>
> Perhaps initially the load was not the same:
>
>
> commit 69d25870f20c4b2563304f2b79c5300dd60a067e
> Author: Arjan van de Ven <arjan@infradead.org>
> Date:   Mon Sep 21 17:04:08 2009 -0700
>
>     cpuidle: fix the menu governor to boost IO performance
>
>
> The function get_loadavg() was:
>
> static int get_loadavg(void)
> {
>        unsigned long this = this_cpu_load();
>
>        return LOAD_INT(this) * 10 + LOAD_FRAC(this) / 10;
> }
>
> With:
>
> unsigned long this_cpu_load(void)
> {
>         struct rq *this = this_rq();
>         return this->cpu_load[0];
> }
>
>
> Now we pass the load we got from get_iowait_load() which is:
>
> void get_iowait_load(unsigned long *nr_waiters, unsigned long *load)
> {
>         struct rq *rq = this_rq();
>         *nr_waiters = atomic_read(&rq->nr_iowait);
>         *load = rq->load.weight;
> }

This would mean that commit 372ba8cb46b2 (cpuidle: menu: Lookup CPU
runqueues less) changed the behavior more than expected, though (CC
Mel).

In any case that change was made in 2014 which is after Android
stopped using get_loadavg().

> Colin Cross did some measurements a few years ago and concluded it is
> zero most of the time, otherwise the number is so high it has no
> meaning. The get_loadavg() call is removed from Android since 2011 [1].

This is a good indication that it doesn't really matter (any more perhaps).

> I can hack my server (bi Xeon - 6 cores), add statistics regarding this
> metric and let it run during several days, sure a lot of different
> workloads will happen on it (compilation, network, throughput computing,
> ...) in a real use case manner. That is what I did with an ARM64
> evaluation board.
>
> However I won't be able to do that right now and that will take a bit of
> time. If you think it is really useful, I can plan to do that in the
> next weeks.

Well, we may as well just make this change to follow Android and let
it go through all of the CI we have for one cycle.

> [1] https://android.googlesource.com/kernel/common/+/4dedd9f124703207895777ac6e91dacde0f7cc17

If you send a non-RFC patch to drop get_loadavg() (but you can drop it
altogether then, there are no other callers of it AFAICS), I'll queue
it up after the 4.20 (or whatever it turns out to be in the end) merge
window so all of the CIs out there have a chance to report issues (if
any).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier
  2018-10-03 11:09     ` Rafael J. Wysocki
@ 2018-10-03 13:05       ` Daniel Lezcano
  0 siblings, 0 replies; 5+ messages in thread
From: Daniel Lezcano @ 2018-10-03 13:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Peter Zijlstra, Linux PM, Todd Kjos,
	Joel Fernandes, Colin Cross, Ramesh Thomas, Alex Shi,
	Linux Kernel Mailing List, Mel Gorman

On 03/10/2018 13:09, Rafael J. Wysocki wrote:
> On Wed, Oct 3, 2018 at 12:25 PM Daniel Lezcano

[ ... ]

> If you send a non-RFC patch to drop get_loadavg() (but you can drop it
> altogether then, there are no other callers of it AFAICS), I'll queue
> it up after the 4.20 (or whatever it turns out to be in the end) merge
> window so all of the CIs out there have a chance to report issues (if
> any).

Sure, I can do that. There were some changes in this file, shall I base
the patch on v4.19-rc6 or linux-pm/linux-next ?


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-10-03 13:05 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-25  9:38 [PATCH RFC] cpuidle/drivers/menu: Remove get_loadavg in the performance multiplier Daniel Lezcano
2018-10-03  8:58 ` Rafael J. Wysocki
2018-10-03 10:25   ` Daniel Lezcano
2018-10-03 11:09     ` Rafael J. Wysocki
2018-10-03 13:05       ` Daniel Lezcano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).