linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] cupfreq: schedutil: Minor fix and cleanups
@ 2017-03-02  8:33 Viresh Kumar
  2017-03-02  8:33 ` [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy Viresh Kumar
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Viresh Kumar @ 2017-03-02  8:33 UTC (permalink / raw)
  To: Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

Hi,

The first patch fixes an issue and the other two do cleanups. I have run
hackbench with these patches and no regressions were noticed.

--
viresh

Viresh Kumar (3):
  cpufreq: schedutil: move cached_raw_freq to struct sugov_policy
  cpufreq: schedutil: Pass sg_policy to get_next_freq()
  cpufreq: schedutil: remove redundant code from
    sugov_next_freq_shared()

 kernel/sched/cpufreq_schedutil.c | 37 +++++++++++++------------------------
 1 file changed, 13 insertions(+), 24 deletions(-)

-- 
2.7.1.410.g6faf27b

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy
  2017-03-02  8:33 [PATCH 0/3] cupfreq: schedutil: Minor fix and cleanups Viresh Kumar
@ 2017-03-02  8:33 ` Viresh Kumar
  2017-03-02 22:05   ` Rafael J. Wysocki
  2017-03-02  8:33 ` [PATCH 2/3] cpufreq: schedutil: Pass sg_policy to get_next_freq() Viresh Kumar
  2017-03-02  8:33 ` [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared() Viresh Kumar
  2 siblings, 1 reply; 16+ messages in thread
From: Viresh Kumar @ 2017-03-02  8:33 UTC (permalink / raw)
  To: Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

cached_raw_freq applies to the entire cpufreq policy and not individual
CPUs. Apart from wasting per-cpu memory, it is actually wrong to keep it
in struct sugov_cpu as we may end up comparing next_freq with a stale
cached_raw_freq of a random CPU.

Move cached_raw_freq to struct sugov_policy.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 306d97e7b57c..721f4e011366 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -35,6 +35,7 @@ struct sugov_policy {
 	u64 last_freq_update_time;
 	s64 freq_update_delay_ns;
 	unsigned int next_freq;
+	unsigned int cached_raw_freq;
 
 	/* The next fields are only needed if fast switch cannot be used. */
 	struct irq_work irq_work;
@@ -51,7 +52,6 @@ struct sugov_cpu {
 	struct update_util_data update_util;
 	struct sugov_policy *sg_policy;
 
-	unsigned int cached_raw_freq;
 	unsigned long iowait_boost;
 	unsigned long iowait_boost_max;
 	u64 last_update;
@@ -145,9 +145,9 @@ static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long util,
 
 	freq = (freq + (freq >> 2)) * util / max;
 
-	if (freq == sg_cpu->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
+	if (freq == sg_policy->cached_raw_freq && sg_policy->next_freq != UINT_MAX)
 		return sg_policy->next_freq;
-	sg_cpu->cached_raw_freq = freq;
+	sg_policy->cached_raw_freq = freq;
 	return cpufreq_driver_resolve_freq(policy, freq);
 }
 
@@ -579,6 +579,7 @@ static int sugov_start(struct cpufreq_policy *policy)
 	sg_policy->next_freq = UINT_MAX;
 	sg_policy->work_in_progress = false;
 	sg_policy->need_freq_update = false;
+	sg_policy->cached_raw_freq = 0;
 
 	for_each_cpu(cpu, policy->cpus) {
 		struct sugov_cpu *sg_cpu = &per_cpu(sugov_cpu, cpu);
@@ -589,7 +590,6 @@ static int sugov_start(struct cpufreq_policy *policy)
 			sg_cpu->max = 0;
 			sg_cpu->flags = SCHED_CPUFREQ_RT;
 			sg_cpu->last_update = 0;
-			sg_cpu->cached_raw_freq = 0;
 			sg_cpu->iowait_boost = 0;
 			sg_cpu->iowait_boost_max = policy->cpuinfo.max_freq;
 			cpufreq_add_update_util_hook(cpu, &sg_cpu->update_util,
-- 
2.7.1.410.g6faf27b

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] cpufreq: schedutil: Pass sg_policy to get_next_freq()
  2017-03-02  8:33 [PATCH 0/3] cupfreq: schedutil: Minor fix and cleanups Viresh Kumar
  2017-03-02  8:33 ` [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy Viresh Kumar
@ 2017-03-02  8:33 ` Viresh Kumar
  2017-03-02  8:33 ` [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared() Viresh Kumar
  2 siblings, 0 replies; 16+ messages in thread
From: Viresh Kumar @ 2017-03-02  8:33 UTC (permalink / raw)
  To: Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

get_next_freq() uses sg_cpu only to get sg_policy, which the callers of
get_next_freq() already have. Pass sg_policy instead of sg_cpu to
get_next_freq(), to make it more efficient.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 721f4e011366..570925ea7253 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -115,7 +115,7 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
 
 /**
  * get_next_freq - Compute a new frequency for a given cpufreq policy.
- * @sg_cpu: schedutil cpu object to compute the new frequency for.
+ * @sg_policy: schedutil policy object to compute the new frequency for.
  * @util: Current CPU utilization.
  * @max: CPU capacity.
  *
@@ -135,10 +135,9 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
  * next_freq (as calculated above) is returned, subject to policy min/max and
  * cpufreq driver limitations.
  */
-static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long util,
-				  unsigned long max)
+static unsigned int get_next_freq(struct sugov_policy *sg_policy,
+				  unsigned long util, unsigned long max)
 {
-	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
 	struct cpufreq_policy *policy = sg_policy->policy;
 	unsigned int freq = arch_scale_freq_invariant() ?
 				policy->cpuinfo.max_freq : policy->cur;
@@ -212,7 +211,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 	} else {
 		sugov_get_util(&util, &max);
 		sugov_iowait_boost(sg_cpu, &util, &max);
-		next_f = get_next_freq(sg_cpu, util, max);
+		next_f = get_next_freq(sg_policy, util, max);
 	}
 	sugov_update_commit(sg_policy, time, next_f);
 }
@@ -266,7 +265,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
 		sugov_iowait_boost(j_sg_cpu, &util, &max);
 	}
 
-	return get_next_freq(sg_cpu, util, max);
+	return get_next_freq(sg_policy, util, max);
 }
 
 static void sugov_update_shared(struct update_util_data *hook, u64 time,
-- 
2.7.1.410.g6faf27b

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-02  8:33 [PATCH 0/3] cupfreq: schedutil: Minor fix and cleanups Viresh Kumar
  2017-03-02  8:33 ` [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy Viresh Kumar
  2017-03-02  8:33 ` [PATCH 2/3] cpufreq: schedutil: Pass sg_policy to get_next_freq() Viresh Kumar
@ 2017-03-02  8:33 ` Viresh Kumar
  2017-03-04  0:03   ` Rafael J. Wysocki
  2 siblings, 1 reply; 16+ messages in thread
From: Viresh Kumar @ 2017-03-02  8:33 UTC (permalink / raw)
  To: Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot, Viresh Kumar

The same code is present both within and outside the loop and it doesn't
look like it provides any additional benefit. Remove the special
handling of sg_cpu and let it happen within the loop.

With this change we will do two extra comparisons for the sg_cpu in the
loop, but the loop will do one less comparison for every other CPU in
the policy.

While at it, also remove the excess parameters of sugov_next_freq_shared().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 570925ea7253..ec56537429a8 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -216,30 +216,20 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 	sugov_update_commit(sg_policy, time, next_f);
 }
 
-static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
-					   unsigned long util, unsigned long max,
-					   unsigned int flags)
+static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
 {
 	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
 	struct cpufreq_policy *policy = sg_policy->policy;
 	unsigned int max_f = policy->cpuinfo.max_freq;
 	u64 last_freq_update_time = sg_policy->last_freq_update_time;
+	unsigned long util = 0, max = 1;
 	unsigned int j;
 
-	if (flags & SCHED_CPUFREQ_RT_DL)
-		return max_f;
-
-	sugov_iowait_boost(sg_cpu, &util, &max);
-
 	for_each_cpu(j, policy->cpus) {
-		struct sugov_cpu *j_sg_cpu;
+		struct sugov_cpu *j_sg_cpu = &per_cpu(sugov_cpu, j);
 		unsigned long j_util, j_max;
 		s64 delta_ns;
 
-		if (j == smp_processor_id())
-			continue;
-
-		j_sg_cpu = &per_cpu(sugov_cpu, j);
 		/*
 		 * If the CPU utilization was last updated before the previous
 		 * frequency update and the time elapsed between the last update
@@ -288,7 +278,7 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 	sg_cpu->last_update = time;
 
 	if (sugov_should_update_freq(sg_policy, time)) {
-		next_f = sugov_next_freq_shared(sg_cpu, util, max, flags);
+		next_f = sugov_next_freq_shared(sg_cpu);
 		sugov_update_commit(sg_policy, time, next_f);
 	}
 
-- 
2.7.1.410.g6faf27b

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy
  2017-03-02  8:33 ` [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy Viresh Kumar
@ 2017-03-02 22:05   ` Rafael J. Wysocki
  2017-03-03  3:07     ` Viresh Kumar
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2017-03-02 22:05 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Ingo Molnar, Peter Zijlstra, linaro-kernel, linux-pm,
	linux-kernel, Vincent Guittot

On Thursday, March 02, 2017 02:03:20 PM Viresh Kumar wrote:
> cached_raw_freq applies to the entire cpufreq policy and not individual
> CPUs. Apart from wasting per-cpu memory, it is actually wrong to keep it
> in struct sugov_cpu as we may end up comparing next_freq with a stale
> cached_raw_freq of a random CPU.
> 
> Move cached_raw_freq to struct sugov_policy.
> 
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Any chance for a Fixes: tag?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy
  2017-03-02 22:05   ` Rafael J. Wysocki
@ 2017-03-03  3:07     ` Viresh Kumar
  0 siblings, 0 replies; 16+ messages in thread
From: Viresh Kumar @ 2017-03-03  3:07 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ingo Molnar, Peter Zijlstra, linaro-kernel, linux-pm,
	linux-kernel, Vincent Guittot

On 02-03-17, 23:05, Rafael J. Wysocki wrote:
> On Thursday, March 02, 2017 02:03:20 PM Viresh Kumar wrote:
> > cached_raw_freq applies to the entire cpufreq policy and not individual
> > CPUs. Apart from wasting per-cpu memory, it is actually wrong to keep it
> > in struct sugov_cpu as we may end up comparing next_freq with a stale
> > cached_raw_freq of a random CPU.
> > 
> > Move cached_raw_freq to struct sugov_policy.
> > 
> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> 
> Any chance for a Fixes: tag?

Fixes: 5cbea46984d6 ("cpufreq: schedutil: map raw required frequency
to driver frequency")

Sorry to miss that in the first place.

-- 
viresh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-02  8:33 ` [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared() Viresh Kumar
@ 2017-03-04  0:03   ` Rafael J. Wysocki
  2017-03-04  0:11     ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2017-03-04  0:03 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Ingo Molnar, Peter Zijlstra, linaro-kernel, linux-pm,
	linux-kernel, Vincent Guittot

On Thursday, March 02, 2017 02:03:22 PM Viresh Kumar wrote:
> The same code is present both within and outside the loop and it doesn't
> look like it provides any additional benefit.

Well, not quite.  This is on purpose.

Note the "if (j == smp_processor_id())" condition within the loop and think
about how the current CPU is taken into account. :-)

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-04  0:03   ` Rafael J. Wysocki
@ 2017-03-04  0:11     ` Rafael J. Wysocki
  2017-03-06  4:45       ` Viresh Kumar
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2017-03-04  0:11 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Ingo Molnar, Peter Zijlstra, linaro-kernel, linux-pm,
	linux-kernel, Vincent Guittot

On Saturday, March 04, 2017 01:03:17 AM Rafael J. Wysocki wrote:
> On Thursday, March 02, 2017 02:03:22 PM Viresh Kumar wrote:
> > The same code is present both within and outside the loop and it doesn't
> > look like it provides any additional benefit.
> 
> Well, not quite.  This is on purpose.
> 
> Note the "if (j == smp_processor_id())" condition within the loop and think
> about how the current CPU is taken into account. :-)

Ah OK, you did that, sorry.

So one idea is that if SCHED_CPUFREQ_RT_DL is set in flags, we don't even
need to start the loop which is quite a cost to simply notice that there's
nothing to do.

Also I don't quite agree with adding an extra pair of integer multiplications
to that loop just to get rid of the extra args.  That aside from chasing extra
pointers, of course.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-04  0:11     ` Rafael J. Wysocki
@ 2017-03-06  4:45       ` Viresh Kumar
  2017-03-06 12:24         ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Viresh Kumar @ 2017-03-06  4:45 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Ingo Molnar, Peter Zijlstra, linaro-kernel, linux-pm,
	linux-kernel, Vincent Guittot

On 04-03-17, 01:11, Rafael J. Wysocki wrote:
> So one idea is that if SCHED_CPUFREQ_RT_DL is set in flags, we don't even
> need to start the loop which is quite a cost to simply notice that there's
> nothing to do.

Hmm. Isn't the probability of this flag being set, same for all CPUs in the
policy? If yes, then why do we need to handle the current CPU specially?

> Also I don't quite agree with adding an extra pair of integer multiplications
> to that loop just to get rid of the extra args.

But that should be cheap enough as we would be multiplying with 1 in one of them
and with 0 on the other.

Isn't that better then keeping same code at two places?

Also as I mentioned in the commit log, the number of extra comparisons for the
current CPU will be balanced if we have three CPUs in the policy and with every
other CPU in the policy, we will end up doing one comparison less. With
Quad-core policies, we reduce the number of comparisons by 1 and for octa-core
ones, we reduce it by 5.

-- 
viresh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-06  4:45       ` Viresh Kumar
@ 2017-03-06 12:24         ` Rafael J. Wysocki
  2017-03-07 10:31           ` Viresh Kumar
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2017-03-06 12:24 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot

On Mon, Mar 6, 2017 at 5:45 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 04-03-17, 01:11, Rafael J. Wysocki wrote:
>> So one idea is that if SCHED_CPUFREQ_RT_DL is set in flags, we don't even
>> need to start the loop which is quite a cost to simply notice that there's
>> nothing to do.
>
> Hmm. Isn't the probability of this flag being set, same for all CPUs in the
> policy?

No, I don't think so.

> If yes, then why do we need to handle the current CPU specially?

We don't need to chase a pointer to get to the flags for the current
CPU (and same goes for util and max) and what if it is the last one in
the policy->cpus mask?

>> Also I don't quite agree with adding an extra pair of integer multiplications
>> to that loop just to get rid of the extra args.
>
> But that should be cheap enough as we would be multiplying with 1 in one of them
> and with 0 on the other.

I'm not sure it will be really that cheap.

> Isn't that better then keeping same code at two places?

Well, it isn't IMO, unless you have numbers to support your point.

> Also as I mentioned in the commit log, the number of extra comparisons for the
> current CPU will be balanced if we have three CPUs in the policy and with every
> other CPU in the policy, we will end up doing one comparison less. With
> Quad-core policies, we reduce the number of comparisons by 1 and for octa-core
> ones, we reduce it by 5.

So to the point, the code was written this way on purpose and not just
by accident as your changelog suggests and if you want to change it,
you need numbers.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-06 12:24         ` Rafael J. Wysocki
@ 2017-03-07 10:31           ` Viresh Kumar
  2017-03-07 13:19             ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Viresh Kumar @ 2017-03-07 10:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot

On 06-03-17, 13:24, Rafael J. Wysocki wrote:
> On Mon, Mar 6, 2017 at 5:45 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > On 04-03-17, 01:11, Rafael J. Wysocki wrote:
> >> So one idea is that if SCHED_CPUFREQ_RT_DL is set in flags, we don't even
> >> need to start the loop which is quite a cost to simply notice that there's
> >> nothing to do.
> >
> > Hmm. Isn't the probability of this flag being set, same for all CPUs in the
> > policy?
> 
> No, I don't think so.

Why do you think so? I thought all CPU in the policy can have the RT/DL flag set
and the probability of all of them is just the same.

> So to the point, the code was written this way on purpose and not just
> by accident as your changelog suggests and

I didn't wanted to convey that really and I knew that it was written on purpose.

> if you want to change it, you need numbers.

What kind of numbers can we get for such a change ? I tried to take the running
average of the time it takes to execute this routine over 10000 samples, but it
varies a lot even with the same build. Any tests like hackbench, etc wouldn't be
of any help as well.

-- 
viresh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-07 10:31           ` Viresh Kumar
@ 2017-03-07 13:19             ` Rafael J. Wysocki
  2017-03-08  4:18               ` Viresh Kumar
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2017-03-07 13:19 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Ingo Molnar,
	Peter Zijlstra, Lists linaro-kernel, Linux PM,
	Linux Kernel Mailing List, Vincent Guittot

On Tue, Mar 7, 2017 at 11:31 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 06-03-17, 13:24, Rafael J. Wysocki wrote:
>> On Mon, Mar 6, 2017 at 5:45 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> > On 04-03-17, 01:11, Rafael J. Wysocki wrote:
>> >> So one idea is that if SCHED_CPUFREQ_RT_DL is set in flags, we don't even
>> >> need to start the loop which is quite a cost to simply notice that there's
>> >> nothing to do.
>> >
>> > Hmm. Isn't the probability of this flag being set, same for all CPUs in the
>> > policy?
>>
>> No, I don't think so.
>
> Why do you think so? I thought all CPU in the policy can have the RT/DL flag set
> and the probability of all of them is just the same.

Well, yes, but if the current CPU has that flag set already, we surely
don't need to check the other ones in the policy?

>> So to the point, the code was written this way on purpose and not just
>> by accident as your changelog suggests and
>
> I didn't wanted to convey that really and I knew that it was written on purpose.
>
>> if you want to change it, you need numbers.
>
> What kind of numbers can we get for such a change ? I tried to take the running
> average of the time it takes to execute this routine over 10000 samples, but it
> varies a lot even with the same build. Any tests like hackbench, etc wouldn't be
> of any help as well.

So why do you think it needs to be changed, but really?

Is that because it is particularly hard to follow or similar?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-07 13:19             ` Rafael J. Wysocki
@ 2017-03-08  4:18               ` Viresh Kumar
  2017-03-08 10:50                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Viresh Kumar @ 2017-03-08  4:18 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot

On 07-03-17, 14:19, Rafael J. Wysocki wrote:
> On Tue, Mar 7, 2017 at 11:31 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > Why do you think so? I thought all CPU in the policy can have the RT/DL flag set
> > and the probability of all of them is just the same.
> 
> Well, yes, but if the current CPU has that flag set already, we surely
> don't need to check the other ones in the policy?

That's true for every other CPU in policy too..

> >> So to the point, the code was written this way on purpose and not just
> >> by accident as your changelog suggests and
> >
> > I didn't wanted to convey that really and I knew that it was written on purpose.
> >
> >> if you want to change it, you need numbers.
> >
> > What kind of numbers can we get for such a change ? I tried to take the running
> > average of the time it takes to execute this routine over 10000 samples, but it
> > varies a lot even with the same build. Any tests like hackbench, etc wouldn't be
> > of any help as well.
> 
> So why do you think it needs to be changed, but really?
> 
> Is that because it is particularly hard to follow or similar?

Just that I didn't like keeping the same code at two places (outside
and inside the loop) and the benefit it has.

Anyway, its not straight forward to get any numbers supporting my
argument. I can claim improvement only theoretically by comparing the
number of comparisons that we may end up doing for quad or octa core
policies. Lets abandon this patch as I failed to convince you :)

Thanks for applying the other two patches though.

Cheers.

-- 
viresh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-08  4:18               ` Viresh Kumar
@ 2017-03-08 10:50                 ` Rafael J. Wysocki
  2017-03-08 11:15                   ` Viresh Kumar
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2017-03-08 10:50 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Ingo Molnar,
	Peter Zijlstra, Lists linaro-kernel, Linux PM,
	Linux Kernel Mailing List, Vincent Guittot

On Wed, Mar 8, 2017 at 5:18 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 07-03-17, 14:19, Rafael J. Wysocki wrote:
>> On Tue, Mar 7, 2017 at 11:31 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> > Why do you think so? I thought all CPU in the policy can have the RT/DL flag set
>> > and the probability of all of them is just the same.
>>
>> Well, yes, but if the current CPU has that flag set already, we surely
>> don't need to check the other ones in the policy?
>
> That's true for every other CPU in policy too..

Not exactly.

The flags value for the current CPU is in a hot cache line already (if
not in a register) and it is not necessary to chase a pointer (and
possibly fetch a new cache line) to get to it.

That also applies to util and max for the current CPU, but the benefit
here is debatable.

>> >> So to the point, the code was written this way on purpose and not just
>> >> by accident as your changelog suggests and
>> >
>> > I didn't wanted to convey that really and I knew that it was written on purpose.
>> >
>> >> if you want to change it, you need numbers.
>> >
>> > What kind of numbers can we get for such a change ? I tried to take the running
>> > average of the time it takes to execute this routine over 10000 samples, but it
>> > varies a lot even with the same build. Any tests like hackbench, etc wouldn't be
>> > of any help as well.
>>
>> So why do you think it needs to be changed, but really?
>>
>> Is that because it is particularly hard to follow or similar?
>
> Just that I didn't like keeping the same code at two places (outside
> and inside the loop) and the benefit it has.

So there are two things here, the flags check and the invocation of
sugov_iowait_boost() for the current CPU.

I claim that the flags check is a clear benefit due to what I said above.

The other thing is a way to initialize util and max to sensible
values.  It also can be done the way you did it and that change should
not affect the execution time.

So overall, maybe you can move the flags check to
sugov_update_shared(), so that you don't need to pass flags to
sugov_next_freq_shared(), and then do what you did to util and max.

But that would be a 4.12 change anyway.

> Anyway, its not straight forward to get any numbers supporting my
> argument. I can claim improvement only theoretically by comparing the
> number of comparisons that we may end up doing for quad or octa core
> policies. Lets abandon this patch as I failed to convince you :)
>
> Thanks for applying the other two patches though.

No problem.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-08 10:50                 ` Rafael J. Wysocki
@ 2017-03-08 11:15                   ` Viresh Kumar
  2017-03-08 12:54                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Viresh Kumar @ 2017-03-08 11:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot

On 08-03-17, 11:50, Rafael J. Wysocki wrote:
> So overall, maybe you can move the flags check to
> sugov_update_shared(), so that you don't need to pass flags to
> sugov_next_freq_shared(), and then do what you did to util and max.

Just to confirm, below is what you are suggesting ?

-------------------------8<-------------------------
 
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 78468aa051ab..f5ffe241812e 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -217,30 +217,19 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
        sugov_update_commit(sg_policy, time, next_f);
 }
 
-static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
-                                          unsigned long util, unsigned long max,
-                                          unsigned int flags)
+static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
 {
        struct sugov_policy *sg_policy = sg_cpu->sg_policy;
        struct cpufreq_policy *policy = sg_policy->policy;
-       unsigned int max_f = policy->cpuinfo.max_freq;
        u64 last_freq_update_time = sg_policy->last_freq_update_time;
+       unsigned long util = 0, max = 1;
        unsigned int j;
 
-       if (flags & SCHED_CPUFREQ_RT_DL)
-               return max_f;
-
-       sugov_iowait_boost(sg_cpu, &util, &max);
-
        for_each_cpu(j, policy->cpus) {
-               struct sugov_cpu *j_sg_cpu;
+               struct sugov_cpu *j_sg_cpu = &per_cpu(sugov_cpu, j);
                unsigned long j_util, j_max;
                s64 delta_ns;
 
-               if (j == smp_processor_id())
-                       continue;
-
-               j_sg_cpu = &per_cpu(sugov_cpu, j);
                /*
                 * If the CPU utilization was last updated before the previous
                 * frequency update and the time elapsed between the last update
@@ -254,7 +243,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
                        continue;
                }
                if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
-                       return max_f;
+                       return policy->cpuinfo.max_freq;
 
                j_util = j_sg_cpu->util;
                j_max = j_sg_cpu->max;
@@ -289,7 +278,11 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
        sg_cpu->last_update = time;
 
        if (sugov_should_update_freq(sg_policy, time)) {
-               next_f = sugov_next_freq_shared(sg_cpu, util, max, flags);
+               if (flags & SCHED_CPUFREQ_RT_DL)
+                       next_f = sg_policy->policy->cpuinfo.max_freq;
+               else
+                       next_f = sugov_next_freq_shared(sg_cpu);
+
                sugov_update_commit(sg_policy, time, next_f);
        }
 
> But that would be a 4.12 change anyway.

Sure.

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()
  2017-03-08 11:15                   ` Viresh Kumar
@ 2017-03-08 12:54                     ` Rafael J. Wysocki
  0 siblings, 0 replies; 16+ messages in thread
From: Rafael J. Wysocki @ 2017-03-08 12:54 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Ingo Molnar,
	Peter Zijlstra, Lists linaro-kernel, Linux PM,
	Linux Kernel Mailing List, Vincent Guittot

On Wed, Mar 8, 2017 at 12:15 PM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 08-03-17, 11:50, Rafael J. Wysocki wrote:
>> So overall, maybe you can move the flags check to
>> sugov_update_shared(), so that you don't need to pass flags to
>> sugov_next_freq_shared(), and then do what you did to util and max.
>
> Just to confirm, below is what you are suggesting ?

Yes, it is.

> -------------------------8<-------------------------
>
>  1 file changed, 9 insertions(+), 16 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 78468aa051ab..f5ffe241812e 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -217,30 +217,19 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>         sugov_update_commit(sg_policy, time, next_f);
>  }
>
> -static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
> -                                          unsigned long util, unsigned long max,
> -                                          unsigned int flags)
> +static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu)
>  {
>         struct sugov_policy *sg_policy = sg_cpu->sg_policy;
>         struct cpufreq_policy *policy = sg_policy->policy;
> -       unsigned int max_f = policy->cpuinfo.max_freq;
>         u64 last_freq_update_time = sg_policy->last_freq_update_time;
> +       unsigned long util = 0, max = 1;
>         unsigned int j;
>
> -       if (flags & SCHED_CPUFREQ_RT_DL)
> -               return max_f;
> -
> -       sugov_iowait_boost(sg_cpu, &util, &max);
> -
>         for_each_cpu(j, policy->cpus) {
> -               struct sugov_cpu *j_sg_cpu;
> +               struct sugov_cpu *j_sg_cpu = &per_cpu(sugov_cpu, j);
>                 unsigned long j_util, j_max;
>                 s64 delta_ns;
>
> -               if (j == smp_processor_id())
> -                       continue;
> -
> -               j_sg_cpu = &per_cpu(sugov_cpu, j);
>                 /*
>                  * If the CPU utilization was last updated before the previous
>                  * frequency update and the time elapsed between the last update
> @@ -254,7 +243,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu,
>                         continue;
>                 }
>                 if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
> -                       return max_f;
> +                       return policy->cpuinfo.max_freq;
>
>                 j_util = j_sg_cpu->util;
>                 j_max = j_sg_cpu->max;
> @@ -289,7 +278,11 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
>         sg_cpu->last_update = time;
>
>         if (sugov_should_update_freq(sg_policy, time)) {
> -               next_f = sugov_next_freq_shared(sg_cpu, util, max, flags);
> +               if (flags & SCHED_CPUFREQ_RT_DL)
> +                       next_f = sg_policy->policy->cpuinfo.max_freq;
> +               else
> +                       next_f = sugov_next_freq_shared(sg_cpu);
> +
>                 sugov_update_commit(sg_policy, time, next_f);
>         }
>
>> But that would be a 4.12 change anyway.
>
> Sure.

And IMO the subject/changelog should not talk about "redundant code",
because the code in question is not in fact redundant, but about
refactoring the code to eliminate one conditional from the
for_each_cpu() loop and to make that loop treat all CPUs in the same
way (more symmetrically).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-03-08 13:03 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-02  8:33 [PATCH 0/3] cupfreq: schedutil: Minor fix and cleanups Viresh Kumar
2017-03-02  8:33 ` [PATCH 1/3] cpufreq: schedutil: move cached_raw_freq to struct sugov_policy Viresh Kumar
2017-03-02 22:05   ` Rafael J. Wysocki
2017-03-03  3:07     ` Viresh Kumar
2017-03-02  8:33 ` [PATCH 2/3] cpufreq: schedutil: Pass sg_policy to get_next_freq() Viresh Kumar
2017-03-02  8:33 ` [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared() Viresh Kumar
2017-03-04  0:03   ` Rafael J. Wysocki
2017-03-04  0:11     ` Rafael J. Wysocki
2017-03-06  4:45       ` Viresh Kumar
2017-03-06 12:24         ` Rafael J. Wysocki
2017-03-07 10:31           ` Viresh Kumar
2017-03-07 13:19             ` Rafael J. Wysocki
2017-03-08  4:18               ` Viresh Kumar
2017-03-08 10:50                 ` Rafael J. Wysocki
2017-03-08 11:15                   ` Viresh Kumar
2017-03-08 12:54                     ` Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).