linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] sched/fair: use signed long when compute energy delta in eas
@ 2021-03-30  5:21 Xuewen Yan
  2021-03-30  9:45 ` Quentin Perret
  0 siblings, 1 reply; 14+ messages in thread
From: Xuewen Yan @ 2021-03-30  5:21 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann
  Cc: rostedt, bsegall, mgorman, bristot, linux-kernel, quentin.perret,
	zhang.lyra, xuewyan

From: Xuewen Yan <xuewen.yan@unisoc.com>

now the energy delta compute as follow:

base_energy_pd = compute_energy(p, -1, pd);
	--->Traverse all CPUs in pd
	--->em_pd_energy()
----------------------------------------------------- \
search for the max_sapre_cap_cpu                       \
---------------------------------                       search time
cur_delta = compute_energy(p, max_spare_cap_cpu, pd);  /
	--->Traverse all CPUs in pd                   /
---------------------------------------------------- /
	--->em_pd_energy()
cur_delta -= base_energy_pd;

During the search_time, or when calculate the cpu_util in
compute_energy(), there may occurred task dequeue or cpu_util change,
it may cause the cur_energy < base_energy_pd, so the cur_delta
would be negative. But the cur_delta is unsigned long, at this time,
the cur_delta would always bigger than best_delta of last pd.

Change the vars to signed long.

Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 kernel/sched/fair.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 794c2cb945f8..310d9d215cd7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6586,7 +6586,7 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
  */
 static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 {
-	unsigned long prev_delta = ULONG_MAX, best_delta = ULONG_MAX;
+	long prev_delta = LONG_MAX, best_delta = LONG_MAX;
 	struct root_domain *rd = cpu_rq(smp_processor_id())->rd;
 	unsigned long cpu_cap, util, base_energy = 0;
 	int cpu, best_energy_cpu = prev_cpu;
@@ -6613,13 +6613,11 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		goto unlock;
 
 	for (; pd; pd = pd->next) {
-		unsigned long cur_delta, spare_cap, max_spare_cap = 0;
+		long cur_delta;
+		unsigned long spare_cap, max_spare_cap = 0;
 		unsigned long base_energy_pd;
 		int max_spare_cap_cpu = -1;
-
-		/* Compute the 'base' energy of the pd, without @p */
-		base_energy_pd = compute_energy(p, -1, pd);
-		base_energy += base_energy_pd;
+		bool prev_in_pd = false;
 
 		for_each_cpu_and(cpu, perf_domain_span(pd), sched_domain_span(sd)) {
 			if (!cpumask_test_cpu(cpu, p->cpus_ptr))
@@ -6641,13 +6639,8 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 			if (!fits_capacity(util, cpu_cap))
 				continue;
 
-			/* Always use prev_cpu as a candidate. */
-			if (cpu == prev_cpu) {
-				prev_delta = compute_energy(p, prev_cpu, pd);
-				prev_delta -= base_energy_pd;
-				best_delta = min(best_delta, prev_delta);
-			}
-
+			if (cpu == prev_cpu)
+				prev_in_pd = true;
 			/*
 			 * Find the CPU with the maximum spare capacity in
 			 * the performance domain
@@ -6658,6 +6651,16 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 			}
 		}
 
+		/* Compute the 'base' energy of the pd, without @p */
+		base_energy_pd = compute_energy(p, -1, pd);
+		base_energy += base_energy_pd;
+
+		/* Always use prev_cpu as a candidate. */
+		if (prev_in_pd) {
+			prev_delta = compute_energy(p, prev_cpu, pd);
+			prev_delta -= base_energy_pd;
+			best_delta = min(best_delta, prev_delta);
+		}
 		/* Evaluate the energy impact of using this CPU. */
 		if (max_spare_cap_cpu >= 0 && max_spare_cap_cpu != prev_cpu) {
 			cur_delta = compute_energy(p, max_spare_cap_cpu, pd);
@@ -6675,7 +6678,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 	 * Pick the best CPU if prev_cpu cannot be used, or if it saves at
 	 * least 6% of the energy used by prev_cpu.
 	 */
-	if (prev_delta == ULONG_MAX)
+	if (prev_delta == LONG_MAX)
 		return best_energy_cpu;
 
 	if ((prev_delta - best_delta) > ((prev_delta + base_energy) >> 4))
-- 
2.29.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-03-30  5:21 [PATCH] sched/fair: use signed long when compute energy delta in eas Xuewen Yan
@ 2021-03-30  9:45 ` Quentin Perret
  2021-04-01 18:07   ` Dietmar Eggemann
  0 siblings, 1 reply; 14+ messages in thread
From: Quentin Perret @ 2021-03-30  9:45 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot, linux-kernel, quentin.perret,
	zhang.lyra, xuewyan

Hi,

On Tuesday 30 Mar 2021 at 13:21:54 (+0800), Xuewen Yan wrote:
> From: Xuewen Yan <xuewen.yan@unisoc.com>
> 
> now the energy delta compute as follow:
> 
> base_energy_pd = compute_energy(p, -1, pd);
> 	--->Traverse all CPUs in pd
> 	--->em_pd_energy()
> ----------------------------------------------------- \
> search for the max_sapre_cap_cpu                       \
> ---------------------------------                       search time
> cur_delta = compute_energy(p, max_spare_cap_cpu, pd);  /
> 	--->Traverse all CPUs in pd                   /
> ---------------------------------------------------- /
> 	--->em_pd_energy()
> cur_delta -= base_energy_pd;
> 
> During the search_time, or when calculate the cpu_util in
> compute_energy(), there may occurred task dequeue or cpu_util change,
> it may cause the cur_energy < base_energy_pd, so the cur_delta
> would be negative. But the cur_delta is unsigned long, at this time,
> the cur_delta would always bigger than best_delta of last pd.
> 
> Change the vars to signed long.

Is that really helping though?

Yes you will not overflow, but the decision is still 'wrong' if the util
values are not stable for the entire wake-up. I think folks on the Arm
side had patches to try and cache the util values upfront, and then use
them throughout feec() and compute_energy(), which I think would be a
better fix.

Dietmar, wdyt?

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-03-30  9:45 ` Quentin Perret
@ 2021-04-01 18:07   ` Dietmar Eggemann
  2021-04-06 10:59     ` Xuewen Yan
  0 siblings, 1 reply; 14+ messages in thread
From: Dietmar Eggemann @ 2021-04-01 18:07 UTC (permalink / raw)
  To: Quentin Perret, Xuewen Yan
  Cc: mingo, peterz, juri.lelli, vincent.guittot, rostedt, bsegall,
	mgorman, bristot, linux-kernel, quentin.perret, zhang.lyra,
	xuewyan, Pierre Gondois

+cc: Pierre Gondois <pierre.gondois@arm.com>

On 30/03/2021 11:45, Quentin Perret wrote:
> Hi,
> 
> On Tuesday 30 Mar 2021 at 13:21:54 (+0800), Xuewen Yan wrote:
>> From: Xuewen Yan <xuewen.yan@unisoc.com>
>>
>> now the energy delta compute as follow:
>>
>> base_energy_pd = compute_energy(p, -1, pd);
>> 	--->Traverse all CPUs in pd
>> 	--->em_pd_energy()
>> ----------------------------------------------------- \
>> search for the max_sapre_cap_cpu                       \
>> ---------------------------------                       search time
>> cur_delta = compute_energy(p, max_spare_cap_cpu, pd);  /
>> 	--->Traverse all CPUs in pd                   /
>> ---------------------------------------------------- /
>> 	--->em_pd_energy()
>> cur_delta -= base_energy_pd;
>>
>> During the search_time, or when calculate the cpu_util in
>> compute_energy(), there may occurred task dequeue or cpu_util change,
>> it may cause the cur_energy < base_energy_pd, so the cur_delta
>> would be negative. But the cur_delta is unsigned long, at this time,
>> the cur_delta would always bigger than best_delta of last pd.
>>
>> Change the vars to signed long.
> 
> Is that really helping though?
> 
> Yes you will not overflow, but the decision is still 'wrong' if the util
> values are not stable for the entire wake-up. I think folks on the Arm
> side had patches to try and cache the util values upfront, and then use
> them throughout feec() and compute_energy(), which I think would be a
> better fix.
> 
> Dietmar, wdyt?

Yes, we have some patches from Pierre Gondois which introduce a pd cache
to store the CPU utilization values so they can be reused for 'cpu !=
dst_cpu' calculations within find_energy_efficient_cpu() (feec()).

We did run them in our Jan 2021 EAS integration:

https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129

  sched: Allocate pd_cache when EAS is enabled
  sched/fair: Use pd_cache to speed up find_energy_efficient_cpu()

We haven't posted them since we're still looking for a story to justify
the extra complexity. The experiments on Arm64 Juno (2 big, 4 little
CPUs) showed 1-2% failure due to changes of CPU utilization values
during feec(). There was a 5% (big CPU)-10% (little CPU) runtime
reduction for feec() with the patches.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-01 18:07   ` Dietmar Eggemann
@ 2021-04-06 10:59     ` Xuewen Yan
  2021-04-07 14:11       ` Pierre
  0 siblings, 1 reply; 14+ messages in thread
From: Xuewen Yan @ 2021-04-06 10:59 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Quentin Perret, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Steven Rostedt, Benjamin Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, quentin.perret,
	Chunyan Zhang, Ryan Y, Pierre Gondois

Hi Dietmar

On Fri, Apr 2, 2021 at 2:07 AM Dietmar Eggemann
<dietmar.eggemann@arm.com> wrote:
>
> +cc: Pierre Gondois <pierre.gondois@arm.com>
>
> On 30/03/2021 11:45, Quentin Perret wrote:
> > Hi,
> >
> > On Tuesday 30 Mar 2021 at 13:21:54 (+0800), Xuewen Yan wrote:
> >> From: Xuewen Yan <xuewen.yan@unisoc.com>
> >>
> >> now the energy delta compute as follow:
> >>
> >> base_energy_pd = compute_energy(p, -1, pd);
> >>      --->Traverse all CPUs in pd
> >>      --->em_pd_energy()
> >> ----------------------------------------------------- \
> >> search for the max_sapre_cap_cpu                       \
> >> ---------------------------------                       search time
> >> cur_delta = compute_energy(p, max_spare_cap_cpu, pd);  /
> >>      --->Traverse all CPUs in pd                   /
> >> ---------------------------------------------------- /
> >>      --->em_pd_energy()
> >> cur_delta -= base_energy_pd;
> >>
> >> During the search_time, or when calculate the cpu_util in
> >> compute_energy(), there may occurred task dequeue or cpu_util change,
> >> it may cause the cur_energy < base_energy_pd, so the cur_delta
> >> would be negative. But the cur_delta is unsigned long, at this time,
> >> the cur_delta would always bigger than best_delta of last pd.
> >>
> >> Change the vars to signed long.
> >
> > Is that really helping though?
> >
> > Yes you will not overflow, but the decision is still 'wrong' if the util
> > values are not stable for the entire wake-up. I think folks on the Arm
> > side had patches to try and cache the util values upfront, and then use
> > them throughout feec() and compute_energy(), which I think would be a
> > better fix.
> >
> > Dietmar, wdyt?
>
> Yes, we have some patches from Pierre Gondois which introduce a pd cache
> to store the CPU utilization values so they can be reused for 'cpu !=
> dst_cpu' calculations within find_energy_efficient_cpu() (feec()).
>
> We did run them in our Jan 2021 EAS integration:
>
> https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129
>
>   sched: Allocate pd_cache when EAS is enabled
>   sched/fair: Use pd_cache to speed up find_energy_efficient_cpu()
>
> We haven't posted them since we're still looking for a story to justify
> the extra complexity. The experiments on Arm64 Juno (2 big, 4 little
> CPUs) showed 1-2% failure due to changes of CPU utilization values
> during feec(). There was a 5% (big CPU)-10% (little CPU) runtime
> reduction for feec() with the patches.

I test the patch, but the overflow still exists.
In the "sched/fair: Use pd_cache to speed up find_energy_efficient_cpu()"
I wonder why recompute the cpu util when cpu==dst_cpu in compute_energy(),
when the dst_cpu's util change, it also would cause the overflow.

Thanks

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-06 10:59     ` Xuewen Yan
@ 2021-04-07 14:11       ` Pierre
  2021-04-08  5:41         ` Xuewen Yan
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre @ 2021-04-07 14:11 UTC (permalink / raw)
  To: Xuewen Yan, Dietmar Eggemann, qperret
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Steven Rostedt, Benjamin Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Chunyan Zhang, Ryan Y

Hi,
> I test the patch, but the overflow still exists.
> In the "sched/fair: Use pd_cache to speed up find_energy_efficient_cpu()"
> I wonder why recompute the cpu util when cpu==dst_cpu in compute_energy(),
> when the dst_cpu's util change, it also would cause the overflow.

The patches aim to cache the energy values for the CPUs whose 
utilization is not modified (so we don't have to compute it multiple 
times). The values cached are the 'base values' of the CPUs, i.e. when 
the task is not placed on the CPU. When (cpu==dst_cpu) in 
compute_energy(), it means the energy values need to be updated instead 
of using the cached ones.

You are right, there is still a possibility to have a negative delta 
with the patches at:
https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129
Adding a check before subtracting the values, and bailing out in such 
case would avoid this, such as at:
https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/

I think a similar modification should be done in your patch. Even though 
this is a good idea to group the calls to compute_energy() to reduce the 
chances of having updates of utilization values in between the 
compute_energy() calls,
there is still a chance to have updates. I think it happened when I 
applied your patch.

About changing the delta(s) from 'unsigned long' to 'long', I am not 
sure of the meaning of having a negative delta. I thing it would be 
better to check and fail before it happens instead.

Regards


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-07 14:11       ` Pierre
@ 2021-04-08  5:41         ` Xuewen Yan
  2021-04-08  9:33           ` Pierre
  0 siblings, 1 reply; 14+ messages in thread
From: Xuewen Yan @ 2021-04-08  5:41 UTC (permalink / raw)
  To: Pierre
  Cc: Dietmar Eggemann, Quentin Perret, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Steven Rostedt, Benjamin Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Chunyan Zhang, Ryan Y

Hi

On Wed, Apr 7, 2021 at 10:11 PM Pierre <pierre.gondois@arm.com> wrote:
>
> Hi,
> > I test the patch, but the overflow still exists.
> > In the "sched/fair: Use pd_cache to speed up find_energy_efficient_cpu()"
> > I wonder why recompute the cpu util when cpu==dst_cpu in compute_energy(),
> > when the dst_cpu's util change, it also would cause the overflow.
>
> The patches aim to cache the energy values for the CPUs whose
> utilization is not modified (so we don't have to compute it multiple
> times). The values cached are the 'base values' of the CPUs, i.e. when
> the task is not placed on the CPU. When (cpu==dst_cpu) in
> compute_energy(), it means the energy values need to be updated instead
> of using the cached ones.
>
well, is it better to use the task_util(p) + cache values ? but in
this case, the cache
values may need more parameters.

> You are right, there is still a possibility to have a negative delta
> with the patches at:
> https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129
> Adding a check before subtracting the values, and bailing out in such
> case would avoid this, such as at:
> https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/
>
In your patch, you bail out the case by "go to fail", that means you
don't use eas in such
case. However, in the actual scene, the case often occurr when select
cpu for small task.
As a result, the small task would not select cpu according to the eas,
it may affect
power consumption?

> I think a similar modification should be done in your patch. Even though
> this is a good idea to group the calls to compute_energy() to reduce the
> chances of having updates of utilization values in between the
> compute_energy() calls,
> there is still a chance to have updates. I think it happened when I
> applied your patch.
>
> About changing the delta(s) from 'unsigned long' to 'long', I am not
> sure of the meaning of having a negative delta. I thing it would be
> better to check and fail before it happens instead.
>
> Regards
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-08  5:41         ` Xuewen Yan
@ 2021-04-08  9:33           ` Pierre
  2021-04-09  2:20             ` Xuewen Yan
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre @ 2021-04-08  9:33 UTC (permalink / raw)
  To: Xuewen Yan, Dietmar Eggemann
  Cc: Quentin Perret, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Steven Rostedt, Benjamin Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Chunyan Zhang, Ryan Y

Hi,
> Hi
>
> On Wed, Apr 7, 2021 at 10:11 PM Pierre <pierre.gondois@arm.com> wrote:
> >
> > Hi,
> > > I test the patch, but the overflow still exists.
> > > In the "sched/fair: Use pd_cache to speed up 
> find_energy_efficient_cpu()"
> > > I wonder why recompute the cpu util when cpu==dst_cpu in 
> compute_energy(),
> > > when the dst_cpu's util change, it also would cause the overflow.
> >
> > The patches aim to cache the energy values for the CPUs whose
> > utilization is not modified (so we don't have to compute it multiple
> > times). The values cached are the 'base values' of the CPUs, i.e. when
> > the task is not placed on the CPU. When (cpu==dst_cpu) in
> > compute_energy(), it means the energy values need to be updated instead
> > of using the cached ones.
> >
> well, is it better to use the task_util(p) + cache values ? but in
> this case, the cache
> values may need more parameters.

This patch-set is not significantly improving the execution time of 
feec(). The results we have so far are an improvement of 5-10% in 
execution time, with feec() being executed in < 10us. So the gain is not 
spectacular.

>
> > You are right, there is still a possibility to have a negative delta
> > with the patches at:
> > 
> https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129 
> <https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129>
> > Adding a check before subtracting the values, and bailing out in such
> > case would avoid this, such as at:
> > https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/ 
> <https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/>
> >
> In your patch, you bail out the case by "go to fail", that means you
> don't use eas in such
> case. However, in the actual scene, the case often occurr when select
> cpu for small task.
> As a result, the small task would not select cpu according to the eas,
> it may affect
> power consumption?
With this patch (bailing out), the percentage of feec() returning due to 
a negative delta I get are:
on a Juno-r2, with 2 big CPUs and 4 CPUs (capacity of 383), with a 
workload running during 5s with task having a period of 16 ms and:
  - 50 tasks at 1%:   0.14%
  - 30 tasks at 1%:   0.54%
  - 10 tasks at 1%: < 0.1%
  - 30 tasks at 5%: < 0.1%
  - 10 tasks at 5%: < 0.1%
It doesn't happen so often to me.If we bail out of feec(), the task will 
still have another opportunity in the next call. However I agree this 
can lead to a bad placement when this happens.
>
> > I think a similar modification should be done in your patch. Even though
> > this is a good idea to group the calls to compute_energy() to reduce the
> > chances of having updates of utilization values in between the
> > compute_energy() calls,
> > there is still a chance to have updates. I think it happened when I
> > applied your patch.
> >
> > About changing the delta(s) from 'unsigned long' to 'long', I am not
> > sure of the meaning of having a negative delta. I thing it would be
> > better to check and fail before it happens instead.
> >
> > Regards
> >




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-08  9:33           ` Pierre
@ 2021-04-09  2:20             ` Xuewen Yan
  2021-04-12 10:26               ` Pierre Gondois
  0 siblings, 1 reply; 14+ messages in thread
From: Xuewen Yan @ 2021-04-09  2:20 UTC (permalink / raw)
  To: Pierre
  Cc: Dietmar Eggemann, Quentin Perret, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Steven Rostedt, Benjamin Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Chunyan Zhang, Ryan Y

Hi
>
> Hi,
> > Hi
> >
> > On Wed, Apr 7, 2021 at 10:11 PM Pierre <pierre.gondois@arm.com> wrote:
> > >
> > > Hi,
> > > > I test the patch, but the overflow still exists.
> > > > In the "sched/fair: Use pd_cache to speed up
> > find_energy_efficient_cpu()"
> > > > I wonder why recompute the cpu util when cpu==dst_cpu in
> > compute_energy(),
> > > > when the dst_cpu's util change, it also would cause the overflow.
> > >
> > > The patches aim to cache the energy values for the CPUs whose
> > > utilization is not modified (so we don't have to compute it multiple
> > > times). The values cached are the 'base values' of the CPUs, i.e. when
> > > the task is not placed on the CPU. When (cpu==dst_cpu) in
> > > compute_energy(), it means the energy values need to be updated instead
> > > of using the cached ones.
> > >
> > well, is it better to use the task_util(p) + cache values ? but in
> > this case, the cache
> > values may need more parameters.
>
> This patch-set is not significantly improving the execution time of
> feec(). The results we have so far are an improvement of 5-10% in
> execution time, with feec() being executed in < 10us. So the gain is not
> spectacular.

well, I meaned to cache all util value and compute energy with caches, when
(cpu==dst_cpu), use caches instead of updating util, and do not get
util with function:
 "effective_cpu_util()", to compute util with cache.
I add more parameters into pd_cache:
struct pd_cache {
        unsigned long util;
        unsigned long util_est;
        unsigned long util_cfs;
        unsigned long util_irq;
        unsigned long util_rt;
        unsigned long util_dl;
        unsigned long bw_dl;
        unsigned long freq_util;
        unsigned long nrg_util;
};
In this way, it can avoid util update while feec. I tested with it,
and the negative delta disappeared.
Maybe this is not a good method, but it does work.
>
> >
> > > You are right, there is still a possibility to have a negative delta
> > > with the patches at:
> > >
> > https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129
> > <https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129>
> > > Adding a check before subtracting the values, and bailing out in such
> > > case would avoid this, such as at:
> > > https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/
> > <https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/>
> > >
> > In your patch, you bail out the case by "go to fail", that means you
> > don't use eas in such
> > case. However, in the actual scene, the case often occurr when select
> > cpu for small task.
> > As a result, the small task would not select cpu according to the eas,
> > it may affect
> > power consumption?
> With this patch (bailing out), the percentage of feec() returning due to
> a negative delta I get are:
> on a Juno-r2, with 2 big CPUs and 4 CPUs (capacity of 383), with a
> workload running during 5s with task having a period of 16 ms and:
>   - 50 tasks at 1%:   0.14%
>   - 30 tasks at 1%:   0.54%
>   - 10 tasks at 1%: < 0.1%
>   - 30 tasks at 5%: < 0.1%
>   - 10 tasks at 5%: < 0.1%
> It doesn't happen so often to me.If we bail out of feec(), the task will
> still have another opportunity in the next call. However I agree this
> can lead to a bad placement when this happens.
> >
> > > I think a similar modification should be done in your patch. Even though
> > > this is a good idea to group the calls to compute_energy() to reduce the
> > > chances of having updates of utilization values in between the
> > > compute_energy() calls,
> > > there is still a chance to have updates. I think it happened when I
> > > applied your patch.
> > >
> > > About changing the delta(s) from 'unsigned long' to 'long', I am not
> > > sure of the meaning of having a negative delta. I thing it would be
> > > better to check and fail before it happens instead.
> > >
> > > Regards
> > >
>
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-09  2:20             ` Xuewen Yan
@ 2021-04-12 10:26               ` Pierre Gondois
  2021-04-12 10:52                 ` Xuewen Yan
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre Gondois @ 2021-04-12 10:26 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: Dietmar Eggemann, Quentin Perret, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Steven Rostedt, Benjamin Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Chunyan Zhang, Ryan Y

Hi,
> > > > Hi,
> > > > > I test the patch, but the overflow still exists.
> > > > > In the "sched/fair: Use pd_cache to speed up
> > > find_energy_efficient_cpu()"
> > > > > I wonder why recompute the cpu util when cpu==dst_cpu in
> > > compute_energy(),
> > > > > when the dst_cpu's util change, it also would cause the overflow.
> > > >
> > > > The patches aim to cache the energy values for the CPUs whose
> > > > utilization is not modified (so we don't have to compute it multiple
> > > > times). The values cached are the 'base values' of the CPUs, 
> i.e. when
> > > > the task is not placed on the CPU. When (cpu==dst_cpu) in
> > > > compute_energy(), it means the energy values need to be updated 
> instead
> > > > of using the cached ones.
> > > >
> > > well, is it better to use the task_util(p) + cache values ? but in
> > > this case, the cache
> > > values may need more parameters.
> >
> > This patch-set is not significantly improving the execution time of
> > feec(). The results we have so far are an improvement of 5-10% in
> > execution time, with feec() being executed in < 10us. So the gain is not
> > spectacular.
>
> well, I meaned to cache all util value and compute energy with caches, 
> when
> (cpu==dst_cpu), use caches instead of updating util, and do not get
> util with function:
>  "effective_cpu_util()", to compute util with cache.
> I add more parameters into pd_cache:
> struct pd_cache {
>         unsigned long util;
>         unsigned long util_est;
>         unsigned long util_cfs;
>         unsigned long util_irq;
>         unsigned long util_rt;
>         unsigned long util_dl;
>         unsigned long bw_dl;
>         unsigned long freq_util;
>         unsigned long nrg_util;
> };
> In this way, it can avoid util update while feec. I tested with it,
> and the negative delta disappeared.
> Maybe this is not a good method, but it does work.
If I understand correctly, you put all the fields used by 
core.c:effective_cpu_util() in the caches, allowing to have values not 
subject to updates.
core.c:effective_cpu_util() isn't only called from 
fair.c:compute_energy(). It is used in the cpufreq_schedutil.c and 
cpufreq_cooling.c (through core.c:sched_cpu_util()).
Did you have to duplicate core.c:effective_cpu_util() to have a second 
version using the caches ? If yes, I think the function was meant to be 
unique so that all the utilization estimations go through the same path.

If your concern is to avoid negative delta, I think just bailing out 
when this happens should be sufficient. As shown in the last message, 
having a wrong placement should not happen that often, plus the prev_cpu 
should be used which should be ok.
If you want to cache the values, I think a stronger justification will 
be asked: this seems to be a big modification compared to the initial 
issue, knowing that another simpler solution is available (i.e. bailing 
out). I was not able to prove there was a significant gain in the 
find_energy_efficient_cpu() execution time, but I would be happy if you 
can, or if you find other arguments.

> >
> > >
> > > > You are right, there is still a possibility to have a negative delta
> > > > with the patches at:
> > > >
> > > 
> https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129 
> <https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129>
> > > 
> <https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129 
> <https://gitlab.arm.com/linux-arm/linux-power/-/commits/eas/next/integration-20210129>>
> > > > Adding a check before subtracting the values, and bailing out in 
> such
> > > > case would avoid this, such as at:
> > > > 
> https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/ 
> <https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/>
> > > 
> <https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/ 
> <https://gitlab.arm.com/linux-arm/linux-pg/-/commits/feec_bail_out/>>
> > > >
> > > In your patch, you bail out the case by "go to fail", that means you
> > > don't use eas in such
> > > case. However, in the actual scene, the case often occurr when select
> > > cpu for small task.
> > > As a result, the small task would not select cpu according to the eas,
> > > it may affect
> > > power consumption?
> > With this patch (bailing out), the percentage of feec() returning due to
> > a negative delta I get are:
> > on a Juno-r2, with 2 big CPUs and 4 CPUs (capacity of 383), with a
> > workload running during 5s with task having a period of 16 ms and:
> >   - 50 tasks at 1%:   0.14%
> >   - 30 tasks at 1%:   0.54%
> >   - 10 tasks at 1%: < 0.1%
> >   - 30 tasks at 5%: < 0.1%
> >   - 10 tasks at 5%: < 0.1%
> > It doesn't happen so often to me.If we bail out of feec(), the task will
> > still have another opportunity in the next call. However I agree this
> > can lead to a bad placement when this happens.
> > >
> > > > I think a similar modification should be done in your patch. 
> Even though
> > > > this is a good idea to group the calls to compute_energy() to 
> reduce the
> > > > chances of having updates of utilization values in between the
> > > > compute_energy() calls,
> > > > there is still a chance to have updates. I think it happened when I
> > > > applied your patch.
> > > >
> > > > About changing the delta(s) from 'unsigned long' to 'long', I am not
> > > > sure of the meaning of having a negative delta. I thing it would be
> > > > better to check and fail before it happens instead.
> > > >

Regards



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-12 10:26               ` Pierre Gondois
@ 2021-04-12 10:52                 ` Xuewen Yan
  2021-04-12 17:14                   ` Pierre Gondois
  0 siblings, 1 reply; 14+ messages in thread
From: Xuewen Yan @ 2021-04-12 10:52 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Dietmar Eggemann, Quentin Perret, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Steven Rostedt, Benjamin Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Chunyan Zhang, Ryan Y

Hi
>
> Hi,
> > >
> > > This patch-set is not significantly improving the execution time of
> > > feec(). The results we have so far are an improvement of 5-10% in
> > > execution time, with feec() being executed in < 10us. So the gain is not
> > > spectacular.
> >
> > well, I meaned to cache all util value and compute energy with caches,
> > when
> > (cpu==dst_cpu), use caches instead of updating util, and do not get
> > util with function:
> >  "effective_cpu_util()", to compute util with cache.
> > I add more parameters into pd_cache:
> > struct pd_cache {
> >         unsigned long util;
> >         unsigned long util_est;
> >         unsigned long util_cfs;
> >         unsigned long util_irq;
> >         unsigned long util_rt;
> >         unsigned long util_dl;
> >         unsigned long bw_dl;
> >         unsigned long freq_util;
> >         unsigned long nrg_util;
> > };
> > In this way, it can avoid util update while feec. I tested with it,
> > and the negative delta disappeared.
> > Maybe this is not a good method, but it does work.
> If I understand correctly, you put all the fields used by
> core.c:effective_cpu_util() in the caches, allowing to have values not
> subject to updates.
Yes.
> core.c:effective_cpu_util() isn't only called from
> fair.c:compute_energy(). It is used in the cpufreq_schedutil.c and
> cpufreq_cooling.c (through core.c:sched_cpu_util()).
> Did you have to duplicate core.c:effective_cpu_util() to have a second
> version using the caches ? If yes, I think the function was meant to be
> unique so that all the utilization estimations go through the same path.
>
I defined a new function to distinguish it from the effective_cpu_util.

> If your concern is to avoid negative delta, I think just bailing out
> when this happens should be sufficient. As shown in the last message,
> having a wrong placement should not happen that often, plus the prev_cpu
> should be used which should be ok.
In your patch, you didn't actually choose the prev_cpu. you return (-1);

> If you want to cache the values, I think a stronger justification will
> be asked: this seems to be a big modification compared to the initial
> issue, knowing that another simpler solution is available (i.e. bailing
> out). I was not able to prove there was a significant gain in the
> find_energy_efficient_cpu() execution time, but I would be happy if you
> can, or if you find other arguments.
Yes, you are right, perhaps there is indeed no need for such a big modification.

Regards

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-12 10:52                 ` Xuewen Yan
@ 2021-04-12 17:14                   ` Pierre Gondois
  2021-04-13  1:50                     ` Xuewen Yan
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre Gondois @ 2021-04-12 17:14 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: Dietmar Eggemann, Quentin Perret, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Steven Rostedt, Benjamin Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Chunyan Zhang, Ryan Y

Hi
> > > >
> > > > This patch-set is not significantly improving the execution time of
> > > > feec(). The results we have so far are an improvement of 5-10% in
> > > > execution time, with feec() being executed in < 10us. So the 
> gain is not
> > > > spectacular.
> > >
> > > well, I meaned to cache all util value and compute energy with 
> caches,
> > > when
> > > (cpu==dst_cpu), use caches instead of updating util, and do not get
> > > util with function:
> > >  "effective_cpu_util()", to compute util with cache.
> > > I add more parameters into pd_cache:
> > > struct pd_cache {
> > >         unsigned long util;
> > >         unsigned long util_est;
> > >         unsigned long util_cfs;
> > >         unsigned long util_irq;
> > >         unsigned long util_rt;
> > >         unsigned long util_dl;
> > >         unsigned long bw_dl;
> > >         unsigned long freq_util;
> > >         unsigned long nrg_util;
> > > };
> > > In this way, it can avoid util update while feec. I tested with it,
> > > and the negative delta disappeared.
> > > Maybe this is not a good method, but it does work.
> > If I understand correctly, you put all the fields used by
> > core.c:effective_cpu_util() in the caches, allowing to have values not
> > subject to updates.
> Yes.
> > core.c:effective_cpu_util() isn't only called from
> > fair.c:compute_energy(). It is used in the cpufreq_schedutil.c and
> > cpufreq_cooling.c (through core.c:sched_cpu_util()).
> > Did you have to duplicate core.c:effective_cpu_util() to have a second
> > version using the caches ? If yes, I think the function was meant to be
> > unique so that all the utilization estimations go through the same path.
> >
> I defined a new function to distinguish it from the effective_cpu_util.
>
> > If your concern is to avoid negative delta, I think just bailing out
> > when this happens should be sufficient. As shown in the last message,
> > having a wrong placement should not happen that often, plus the prev_cpu
> > should be used which should be ok.
> In your patch, you didn't actually choose the prev_cpu. you return (-1);
>
> > If you want to cache the values, I think a stronger justification will
> > be asked: this seems to be a big modification compared to the initial
> > issue, knowing that another simpler solution is available (i.e. bailing
> > out). I was not able to prove there was a significant gain in the
> > find_energy_efficient_cpu() execution time, but I would be happy if you
> > can, or if you find other arguments.
> Yes, you are right, perhaps there is indeed no need for such a big 
> modification.
>
> Regards

In fair.c:select_task_rq_fair(), if feec() returns a error (< 0), then 
prev_cpu is selected. I think it's better to still let feec() signal 
that something happened and let select_task_rq_fair() select prev_cpu by 
itself.
Are you planning to submit a V2 with the bail out mechanism ?

Regards


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-12 17:14                   ` Pierre Gondois
@ 2021-04-13  1:50                     ` Xuewen Yan
  2021-04-13 10:58                       ` Pierre Gondois
  0 siblings, 1 reply; 14+ messages in thread
From: Xuewen Yan @ 2021-04-13  1:50 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Dietmar Eggemann, Quentin Perret, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Steven Rostedt, Benjamin Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Chunyan Zhang, Ryan Y

Hi

On Tue, Apr 13, 2021 at 1:15 AM Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> Hi
> > > > >
> > > > > This patch-set is not significantly improving the execution time of
> > > > > feec(). The results we have so far are an improvement of 5-10% in
> > > > > execution time, with feec() being executed in < 10us. So the
> > gain is not
> > > > > spectacular.
> > > >
> > > > well, I meaned to cache all util value and compute energy with
> > caches,
> > > > when
> > > > (cpu==dst_cpu), use caches instead of updating util, and do not get
> > > > util with function:
> > > >  "effective_cpu_util()", to compute util with cache.
> > > > I add more parameters into pd_cache:
> > > > struct pd_cache {
> > > >         unsigned long util;
> > > >         unsigned long util_est;
> > > >         unsigned long util_cfs;
> > > >         unsigned long util_irq;
> > > >         unsigned long util_rt;
> > > >         unsigned long util_dl;
> > > >         unsigned long bw_dl;
> > > >         unsigned long freq_util;
> > > >         unsigned long nrg_util;
> > > > };
> > > > In this way, it can avoid util update while feec. I tested with it,
> > > > and the negative delta disappeared.
> > > > Maybe this is not a good method, but it does work.
> > > If I understand correctly, you put all the fields used by
> > > core.c:effective_cpu_util() in the caches, allowing to have values not
> > > subject to updates.
> > Yes.
> > > core.c:effective_cpu_util() isn't only called from
> > > fair.c:compute_energy(). It is used in the cpufreq_schedutil.c and
> > > cpufreq_cooling.c (through core.c:sched_cpu_util()).
> > > Did you have to duplicate core.c:effective_cpu_util() to have a second
> > > version using the caches ? If yes, I think the function was meant to be
> > > unique so that all the utilization estimations go through the same path.
> > >
> > I defined a new function to distinguish it from the effective_cpu_util.
> >
> > > If your concern is to avoid negative delta, I think just bailing out
> > > when this happens should be sufficient. As shown in the last message,
> > > having a wrong placement should not happen that often, plus the prev_cpu
> > > should be used which should be ok.
> > In your patch, you didn't actually choose the prev_cpu. you return (-1);
> >
> > > If you want to cache the values, I think a stronger justification will
> > > be asked: this seems to be a big modification compared to the initial
> > > issue, knowing that another simpler solution is available (i.e. bailing
> > > out). I was not able to prove there was a significant gain in the
> > > find_energy_efficient_cpu() execution time, but I would be happy if you
> > > can, or if you find other arguments.
> > Yes, you are right, perhaps there is indeed no need for such a big
> > modification.
> >
> > Regards
>
> In fair.c:select_task_rq_fair(), if feec() returns a error (< 0), then
> prev_cpu is selected. I think it's better to still let feec() signal
> that something happened and let select_task_rq_fair() select prev_cpu by
> itself.
In fair.c:select_task_rq_fair(), when feec() returns a error (< 0),
the new_cpu = find_idlest_cpu() or select_idle_sibling().
In your patch, you should return prev_cpu instead of -1 if you want to
select the prev_cpu.

> Are you planning to submit a V2 with the bail out mechanism ?
Maybe it would be better if you submitted the V2 ? I just check the
patch and raised some questions.
>
Regards

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-13  1:50                     ` Xuewen Yan
@ 2021-04-13 10:58                       ` Pierre Gondois
  2021-04-13 11:59                         ` Xuewen Yan
  0 siblings, 1 reply; 14+ messages in thread
From: Pierre Gondois @ 2021-04-13 10:58 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: Dietmar Eggemann, Quentin Perret, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Steven Rostedt, Benjamin Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Chunyan Zhang, Ryan Y

Hi,
> >
> > In fair.c:select_task_rq_fair(), if feec() returns a error (< 0), then
> > prev_cpu is selected. I think it's better to still let feec() signal
> > that something happened and let select_task_rq_fair() select prev_cpu by
> > itself.
> In fair.c:select_task_rq_fair(), when feec() returns a error (< 0),
> the new_cpu = find_idlest_cpu() or select_idle_sibling().
> In your patch, you should return prev_cpu instead of -1 if you want to
> select the prev_cpu.
Having a negative delta doesn't imply that prev_cpu is still available.
E.g.: If prev_cpu cannot receive the task (and is skipped), and a  
negative delta appears when evaluating the other CPUs. In such case  
feec() cannot select prev_cpu. dst_cpu must be selected by another function.

> > Are you planning to submit a V2 with the bail out mechanism ?
> Maybe it would be better if you submitted the V2 ? I just check the
> patch and raised some questions.

I m ok to send a V2. I will submit it once we agreed.

Regards


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/fair: use signed long when compute energy delta in eas
  2021-04-13 10:58                       ` Pierre Gondois
@ 2021-04-13 11:59                         ` Xuewen Yan
  0 siblings, 0 replies; 14+ messages in thread
From: Xuewen Yan @ 2021-04-13 11:59 UTC (permalink / raw)
  To: Pierre Gondois
  Cc: Dietmar Eggemann, Quentin Perret, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Steven Rostedt, Benjamin Segall,
	Mel Gorman, Daniel Bristot de Oliveira, linux-kernel,
	Chunyan Zhang, Ryan Y

Hi
> > >
> > > In fair.c:select_task_rq_fair(), if feec() returns a error (< 0), then
> > > prev_cpu is selected. I think it's better to still let feec() signal
> > > that something happened and let select_task_rq_fair() select prev_cpu by
> > > itself.
> > In fair.c:select_task_rq_fair(), when feec() returns a error (< 0),
> > the new_cpu = find_idlest_cpu() or select_idle_sibling().
> > In your patch, you should return prev_cpu instead of -1 if you want to
> > select the prev_cpu.
> Having a negative delta doesn't imply that prev_cpu is still available.
> E.g.: If prev_cpu cannot receive the task (and is skipped), and a
> negative delta appears when evaluating the other CPUs. In such case
> feec() cannot select prev_cpu. dst_cpu must be selected by another function.
In this case, would it be better to add a condition "prev_delta == ULONG_MAX" ?
Returnig(-1)  could avoid the negative delta, but I still think this
is not the fundamental way to solve the problem.
But I think you can send a V2 with the bail out mechanism.

Regards

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-04-13 12:00 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-30  5:21 [PATCH] sched/fair: use signed long when compute energy delta in eas Xuewen Yan
2021-03-30  9:45 ` Quentin Perret
2021-04-01 18:07   ` Dietmar Eggemann
2021-04-06 10:59     ` Xuewen Yan
2021-04-07 14:11       ` Pierre
2021-04-08  5:41         ` Xuewen Yan
2021-04-08  9:33           ` Pierre
2021-04-09  2:20             ` Xuewen Yan
2021-04-12 10:26               ` Pierre Gondois
2021-04-12 10:52                 ` Xuewen Yan
2021-04-12 17:14                   ` Pierre Gondois
2021-04-13  1:50                     ` Xuewen Yan
2021-04-13 10:58                       ` Pierre Gondois
2021-04-13 11:59                         ` Xuewen Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).