linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
@ 2016-11-11 10:22 Viresh Kumar
  2016-11-11 10:22 ` [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier Viresh Kumar
                   ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-11 10:22 UTC (permalink / raw)
  To: Rafael Wysocki
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Robin Randhawa,
	Viresh Kumar

Hi,

If slow path frequency changes are conducted in a SCHED_OTHER context
then they may be delayed for some amount of time, including
indefinitely, when real time or deadline activity is taking place.

Move the slow path to a real time kernel thread using the kthread worker
infrastructure. In the future the thread should be made SCHED_DEADLINE.
The RT priority is arbitrarily set to 50 for now.

This was tested with hackbench on an ARM Exynos dual-core A15 platform and
no regressions were seen. The second patch has more details.

This work was started by Steve Muckle, who used a simple kthread instead
of the kthread-worker infrastructure; that wasn't sufficient as some
guarantees weren't met.

I was wondering if the same should be done for the ondemand/conservative
governors as well?

--
viresh

Viresh Kumar (3):
  cpufreq: schedutil: enable fast switch earlier
  cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  cpufreq: schedutil: irq-work is used only in slow path

 kernel/sched/cpufreq_schedutil.c | 85 ++++++++++++++++++++++++++++++----------
 1 file changed, 65 insertions(+), 20 deletions(-)

-- 
2.7.1.410.g6faf27b

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-11 10:22 [PATCH 0/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
@ 2016-11-11 10:22 ` Viresh Kumar
  2016-11-11 14:19   ` Peter Zijlstra
  2016-11-11 21:58   ` Rafael J. Wysocki
  2016-11-11 10:22 ` [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
  2016-11-11 10:22 ` [PATCH 3/3] cpufreq: schedutil: irq-work is used only in slow path Viresh Kumar
  2 siblings, 2 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-11 10:22 UTC (permalink / raw)
  To: Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Juri Lelli, Robin Randhawa, Viresh Kumar

The fast_switch_enabled flag will be used a bit earlier while converting
the schedutil governor to use the kthread-worker infrastructure.

Prepare for that by moving the call to enable it to the beginning of
sugov_init().

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 69e06898997d..ccb2ab89affb 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -416,9 +416,13 @@ static int sugov_init(struct cpufreq_policy *policy)
 	if (policy->governor_data)
 		return -EBUSY;
 
+	cpufreq_enable_fast_switch(policy);
+
 	sg_policy = sugov_policy_alloc(policy);
-	if (!sg_policy)
-		return -ENOMEM;
+	if (!sg_policy) {
+		ret = -ENOMEM;
+		goto disable_fast_switch;
+	}
 
 	mutex_lock(&global_tunables_lock);
 
@@ -456,8 +460,6 @@ static int sugov_init(struct cpufreq_policy *policy)
 
  out:
 	mutex_unlock(&global_tunables_lock);
-
-	cpufreq_enable_fast_switch(policy);
 	return 0;
 
  fail:
@@ -468,6 +470,10 @@ static int sugov_init(struct cpufreq_policy *policy)
 	mutex_unlock(&global_tunables_lock);
 
 	sugov_policy_free(sg_policy);
+
+ disable_fast_switch:
+	cpufreq_disable_fast_switch(policy);
+
 	pr_err("initialization failed (error %d)\n", ret);
 	return ret;
 }
@@ -478,8 +484,6 @@ static void sugov_exit(struct cpufreq_policy *policy)
 	struct sugov_tunables *tunables = sg_policy->tunables;
 	unsigned int count;
 
-	cpufreq_disable_fast_switch(policy);
-
 	mutex_lock(&global_tunables_lock);
 
 	count = gov_attr_set_put(&tunables->attr_set, &sg_policy->tunables_hook);
@@ -490,6 +494,7 @@ static void sugov_exit(struct cpufreq_policy *policy)
 	mutex_unlock(&global_tunables_lock);
 
 	sugov_policy_free(sg_policy);
+	cpufreq_disable_fast_switch(policy);
 }
 
 static int sugov_start(struct cpufreq_policy *policy)
-- 
2.7.1.410.g6faf27b


* [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 10:22 [PATCH 0/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
  2016-11-11 10:22 ` [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier Viresh Kumar
@ 2016-11-11 10:22 ` Viresh Kumar
  2016-11-11 14:32   ` Tommaso Cucinotta
  2016-11-11 22:16   ` Rafael J. Wysocki
  2016-11-11 10:22 ` [PATCH 3/3] cpufreq: schedutil: irq-work is used only in slow path Viresh Kumar
  2 siblings, 2 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-11 10:22 UTC (permalink / raw)
  To: Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Juri Lelli, Robin Randhawa, Viresh Kumar, Steve Muckle

If slow path frequency changes are conducted in a SCHED_OTHER context
then they may be delayed for some amount of time, including
indefinitely, when real time or deadline activity is taking place.

Move the slow path to a real time kernel thread. In the future the
thread should be made SCHED_DEADLINE. The RT priority is arbitrarily set
to 50 for now.

Hackbench results on an ARM Exynos dual-core A15 platform, 10
iterations:

$ hackbench -s 100 -l 100 -g 10 -f 20

Before			After
---------------------------------
1.808			1.603
1.847			1.251
2.229			1.590
1.952			1.600
1.947			1.257
1.925			1.627
2.694			1.620
1.258			1.621
1.919			1.632
1.250			1.240

Average:

1.8829			1.5041

Based on initial work by Steve Muckle.

Signed-off-by: Steve Muckle <smuckle.linux@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 62 ++++++++++++++++++++++++++++++++--------
 1 file changed, 50 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index ccb2ab89affb..045ce0a4e6d1 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -12,6 +12,7 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/cpufreq.h>
+#include <linux/kthread.h>
 #include <linux/slab.h>
 #include <trace/events/power.h>
 
@@ -35,8 +36,10 @@ struct sugov_policy {
 
 	/* The next fields are only needed if fast switch cannot be used. */
 	struct irq_work irq_work;
-	struct work_struct work;
+	struct kthread_work work;
 	struct mutex work_lock;
+	struct kthread_worker worker;
+	struct task_struct *thread;
 	bool work_in_progress;
 
 	bool need_freq_update;
@@ -291,9 +294,10 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 	raw_spin_unlock(&sg_policy->update_lock);
 }
 
-static void sugov_work(struct work_struct *work)
+static void sugov_work(struct kthread_work *work)
 {
-	struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
+	struct sugov_policy *sg_policy =
+		container_of(work, struct sugov_policy, work);
 
 	mutex_lock(&sg_policy->work_lock);
 	__cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
@@ -308,7 +312,7 @@ static void sugov_irq_work(struct irq_work *irq_work)
 	struct sugov_policy *sg_policy;
 
 	sg_policy = container_of(irq_work, struct sugov_policy, irq_work);
-	schedule_work_on(smp_processor_id(), &sg_policy->work);
+	kthread_queue_work(&sg_policy->worker, &sg_policy->work);
 }
 
 /************************** sysfs interface ************************/
@@ -362,9 +366,23 @@ static struct kobj_type sugov_tunables_ktype = {
 
 static struct cpufreq_governor schedutil_gov;
 
+static void sugov_policy_free(struct sugov_policy *sg_policy)
+{
+	if (!sg_policy->policy->fast_switch_enabled) {
+		kthread_flush_worker(&sg_policy->worker);
+		kthread_stop(sg_policy->thread);
+	}
+
+	mutex_destroy(&sg_policy->work_lock);
+	kfree(sg_policy);
+}
+
 static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
 {
 	struct sugov_policy *sg_policy;
+	struct task_struct *thread;
+	struct sched_param param = { .sched_priority = 50 };
+	int ret;
 
 	sg_policy = kzalloc(sizeof(*sg_policy), GFP_KERNEL);
 	if (!sg_policy)
@@ -372,16 +390,36 @@ static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
 
 	sg_policy->policy = policy;
 	init_irq_work(&sg_policy->irq_work, sugov_irq_work);
-	INIT_WORK(&sg_policy->work, sugov_work);
 	mutex_init(&sg_policy->work_lock);
 	raw_spin_lock_init(&sg_policy->update_lock);
-	return sg_policy;
-}
 
-static void sugov_policy_free(struct sugov_policy *sg_policy)
-{
-	mutex_destroy(&sg_policy->work_lock);
-	kfree(sg_policy);
+	/* kthread only required for slow path */
+	if (policy->fast_switch_enabled)
+		return sg_policy;
+
+	kthread_init_work(&sg_policy->work, sugov_work);
+	kthread_init_worker(&sg_policy->worker);
+	thread = kthread_create(kthread_worker_fn, &sg_policy->worker,
+			"sugov:%d", cpumask_first(policy->related_cpus));
+	if (IS_ERR(thread)) {
+		mutex_destroy(&sg_policy->work_lock);
+		kfree(sg_policy);
+		pr_err("failed to create sugov thread: %ld\n", PTR_ERR(thread));
+		return NULL;
+	}
+	sg_policy->thread = thread;
+
+	ret = sched_setscheduler_nocheck(thread, SCHED_FIFO, &param);
+	if (ret) {
+		sugov_policy_free(sg_policy);
+		pr_warn("%s: failed to set SCHED_FIFO\n", __func__);
+		return NULL;
+	}
+
+	kthread_bind_mask(thread, policy->related_cpus);
+	wake_up_process(thread);
+
+	return sg_policy;
 }
 
 static struct sugov_tunables *sugov_tunables_alloc(struct sugov_policy *sg_policy)
@@ -541,7 +579,7 @@ static void sugov_stop(struct cpufreq_policy *policy)
 	synchronize_sched();
 
 	irq_work_sync(&sg_policy->irq_work);
-	cancel_work_sync(&sg_policy->work);
+	kthread_cancel_work_sync(&sg_policy->work);
 }
 
 static void sugov_limits(struct cpufreq_policy *policy)
-- 
2.7.1.410.g6faf27b


* [PATCH 3/3] cpufreq: schedutil: irq-work is used only in slow path
  2016-11-11 10:22 [PATCH 0/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
  2016-11-11 10:22 ` [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier Viresh Kumar
  2016-11-11 10:22 ` [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
@ 2016-11-11 10:22 ` Viresh Kumar
  2 siblings, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-11 10:22 UTC (permalink / raw)
  To: Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Juri Lelli, Robin Randhawa, Viresh Kumar

Execute the irq-work specific initialization/exit code only when the fast
path isn't available.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 045ce0a4e6d1..219e717bcd15 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -371,9 +371,9 @@ static void sugov_policy_free(struct sugov_policy *sg_policy)
 	if (!sg_policy->policy->fast_switch_enabled) {
 		kthread_flush_worker(&sg_policy->worker);
 		kthread_stop(sg_policy->thread);
+		mutex_destroy(&sg_policy->work_lock);
 	}
 
-	mutex_destroy(&sg_policy->work_lock);
 	kfree(sg_policy);
 }
 
@@ -389,8 +389,6 @@ static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
 		return NULL;
 
 	sg_policy->policy = policy;
-	init_irq_work(&sg_policy->irq_work, sugov_irq_work);
-	mutex_init(&sg_policy->work_lock);
 	raw_spin_lock_init(&sg_policy->update_lock);
 
 	/* kthread only required for slow path */
@@ -402,7 +400,6 @@ static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
 	thread = kthread_create(kthread_worker_fn, &sg_policy->worker,
 			"sugov:%d", cpumask_first(policy->related_cpus));
 	if (IS_ERR(thread)) {
-		mutex_destroy(&sg_policy->work_lock);
 		kfree(sg_policy);
 		pr_err("failed to create sugov thread: %ld\n", PTR_ERR(thread));
 		return NULL;
@@ -416,6 +413,9 @@ static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
 		return NULL;
 	}
 
+	init_irq_work(&sg_policy->irq_work, sugov_irq_work);
+	mutex_init(&sg_policy->work_lock);
+
 	kthread_bind_mask(thread, policy->related_cpus);
 	wake_up_process(thread);
 
@@ -578,8 +578,10 @@ static void sugov_stop(struct cpufreq_policy *policy)
 
 	synchronize_sched();
 
-	irq_work_sync(&sg_policy->irq_work);
-	kthread_cancel_work_sync(&sg_policy->work);
+	if (!policy->fast_switch_enabled) {
+		irq_work_sync(&sg_policy->irq_work);
+		kthread_cancel_work_sync(&sg_policy->work);
+	}
 }
 
 static void sugov_limits(struct cpufreq_policy *policy)
-- 
2.7.1.410.g6faf27b


* Re: [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-11 10:22 ` [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier Viresh Kumar
@ 2016-11-11 14:19   ` Peter Zijlstra
  2016-11-11 21:52     ` Rafael J. Wysocki
  2016-11-11 21:58   ` Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2016-11-11 14:19 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Ingo Molnar, linaro-kernel, linux-pm,
	linux-kernel, Vincent Guittot, Juri Lelli, Robin Randhawa

On Fri, Nov 11, 2016 at 03:52:21PM +0530, Viresh Kumar wrote:
> @@ -456,8 +460,6 @@ static int sugov_init(struct cpufreq_policy *policy)
>  
>   out:
>  	mutex_unlock(&global_tunables_lock);
> -
> -	cpufreq_enable_fast_switch(policy);
>  	return 0;
>  
>   fail:
> @@ -468,6 +470,10 @@ static int sugov_init(struct cpufreq_policy *policy)
>  	mutex_unlock(&global_tunables_lock);
>  
>  	sugov_policy_free(sg_policy);
> +
> + disable_fast_switch:
> +	cpufreq_disable_fast_switch(policy);
> +
>  	pr_err("initialization failed (error %d)\n", ret);
>  	return ret;
>  }

Argh, no indented labels please. Please fix the 3 that snuck in while
you're there.


* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 10:22 ` [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
@ 2016-11-11 14:32   ` Tommaso Cucinotta
  2016-11-11 14:39     ` Peter Zijlstra
  2016-11-12  5:21     ` Viresh Kumar
  2016-11-11 22:16   ` Rafael J. Wysocki
  1 sibling, 2 replies; 28+ messages in thread
From: Tommaso Cucinotta @ 2016-11-11 14:32 UTC (permalink / raw)
  To: Viresh Kumar, Rafael Wysocki, Ingo Molnar, Peter Zijlstra
  Cc: linaro-kernel, linux-pm, linux-kernel, Vincent Guittot,
	Juri Lelli, Robin Randhawa, Steve Muckle

Hi,

On 11/11/2016 11:22, Viresh Kumar wrote:
> If slow path frequency changes are conducted in a SCHED_OTHER context
> then they may be delayed for some amount of time, including
> indefinitely, when real time or deadline activity is taking place.
>
> Move the slow path to a real time kernel thread. In the future the
> thread should be made SCHED_DEADLINE.

would you have an insight as to what runtime/deadline/period to set,
and/or what specific timing constraints you'd like, just for this cpufreq
slow-path work?

> The RT priority is arbitrarily set
> to 50 for now.
[...]
> +	struct sched_param param = { .sched_priority = 50 };

won't you have a tunable here? (sysctl?)

Thanks,

	T.
-- 
Tommaso Cucinotta, Computer Engineering PhD
Associate Professor at the Real-Time Systems Laboratory (ReTiS)
Scuola Superiore Sant'Anna, Pisa, Italy
http://retis.sssup.it/people/tommaso


* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 14:32   ` Tommaso Cucinotta
@ 2016-11-11 14:39     ` Peter Zijlstra
  2016-11-12  5:22       ` Viresh Kumar
  2016-11-12  5:21     ` Viresh Kumar
  1 sibling, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2016-11-11 14:39 UTC (permalink / raw)
  To: Tommaso Cucinotta
  Cc: Viresh Kumar, Rafael Wysocki, Ingo Molnar, linaro-kernel,
	linux-pm, linux-kernel, Vincent Guittot, Juri Lelli,
	Robin Randhawa, Steve Muckle

On Fri, Nov 11, 2016 at 03:32:04PM +0100, Tommaso Cucinotta wrote:
> >+	struct sched_param param = { .sched_priority = 50 };
> 
> won't you have a tunable here? (sysctl?)

You can use the regular userspace tools, like schedtool and chrt to set
priorities.
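For example, assuming util-linux's chrt, root privileges, and the "sugov:%d" thread name created by patch 2 (the priority value 60 here is arbitrary):

```shell
# Find the sugov kthread for policy 0 and move it to SCHED_FIFO prio 60.
# Guarded so the snippet is a no-op on systems without the kthread.
pid=$(pgrep -x 'sugov:0' || true)
if [ -n "$pid" ]; then
	chrt --fifo -p 60 "$pid"    # set SCHED_FIFO, priority 60
	chrt -p "$pid"              # verify the new policy/priority
else
	echo "no sugov:0 kthread on this system"
fi
```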


* Re: [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-11 14:19   ` Peter Zijlstra
@ 2016-11-11 21:52     ` Rafael J. Wysocki
  2016-11-11 22:55       ` Peter Zijlstra
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2016-11-11 21:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Viresh Kumar, Rafael Wysocki, Ingo Molnar, Lists linaro-kernel,
	Linux PM, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On Fri, Nov 11, 2016 at 3:19 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Nov 11, 2016 at 03:52:21PM +0530, Viresh Kumar wrote:
>> @@ -456,8 +460,6 @@ static int sugov_init(struct cpufreq_policy *policy)
>>
>>   out:
>>       mutex_unlock(&global_tunables_lock);
>> -
>> -     cpufreq_enable_fast_switch(policy);
>>       return 0;
>>
>>   fail:
>> @@ -468,6 +470,10 @@ static int sugov_init(struct cpufreq_policy *policy)
>>       mutex_unlock(&global_tunables_lock);
>>
>>       sugov_policy_free(sg_policy);
>> +
>> + disable_fast_switch:
>> +     cpufreq_disable_fast_switch(policy);
>> +
>>       pr_err("initialization failed (error %d)\n", ret);
>>       return ret;
>>  }
>
> Argh, no indented labels please. Please fix the 3 that snuck in while
> you're there.

Well, you didn't tell me you didn't like them. :-)

Anyway, I can fix this up easily enough.

Any other concerns regarding the patch?

Thanks,
Rafael


* Re: [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-11 10:22 ` [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier Viresh Kumar
  2016-11-11 14:19   ` Peter Zijlstra
@ 2016-11-11 21:58   ` Rafael J. Wysocki
  2016-11-12  5:19     ` Viresh Kumar
  1 sibling, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2016-11-11 21:58 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Ingo Molnar, Peter Zijlstra, Lists linaro-kernel,
	Linux PM, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On Fri, Nov 11, 2016 at 11:22 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> The fast_switch_enabled flag will be used a bit earlier while converting
> the schedutil governor to use kthread worker.
>
> Prepare for that by moving the call to enable it to the beginning of
> sugov_init().

Fair enough ->

> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  kernel/sched/cpufreq_schedutil.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 69e06898997d..ccb2ab89affb 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -416,9 +416,13 @@ static int sugov_init(struct cpufreq_policy *policy)
>         if (policy->governor_data)
>                 return -EBUSY;
>
> +       cpufreq_enable_fast_switch(policy);
> +
>         sg_policy = sugov_policy_alloc(policy);
> -       if (!sg_policy)
> -               return -ENOMEM;
> +       if (!sg_policy) {
> +               ret = -ENOMEM;
> +               goto disable_fast_switch;
> +       }
>
>         mutex_lock(&global_tunables_lock);
>
> @@ -456,8 +460,6 @@ static int sugov_init(struct cpufreq_policy *policy)
>
>   out:
>         mutex_unlock(&global_tunables_lock);
> -
> -       cpufreq_enable_fast_switch(policy);
>         return 0;
>
>   fail:
> @@ -468,6 +470,10 @@ static int sugov_init(struct cpufreq_policy *policy)
>         mutex_unlock(&global_tunables_lock);
>
>         sugov_policy_free(sg_policy);
> +
> + disable_fast_switch:
> +       cpufreq_disable_fast_switch(policy);
> +
>         pr_err("initialization failed (error %d)\n", ret);
>         return ret;
>  }
> @@ -478,8 +484,6 @@ static void sugov_exit(struct cpufreq_policy *policy)
>         struct sugov_tunables *tunables = sg_policy->tunables;
>         unsigned int count;
>
> -       cpufreq_disable_fast_switch(policy);
> -

->but why is this change necessary?

sugov_stop() has been called already, so the ordering here shouldn't matter.

>         mutex_lock(&global_tunables_lock);
>
>         count = gov_attr_set_put(&tunables->attr_set, &sg_policy->tunables_hook);
> @@ -490,6 +494,7 @@ static void sugov_exit(struct cpufreq_policy *policy)
>         mutex_unlock(&global_tunables_lock);
>
>         sugov_policy_free(sg_policy);
> +       cpufreq_disable_fast_switch(policy);
>  }
>
>  static int sugov_start(struct cpufreq_policy *policy)
> --

Thanks,
Rafael


* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 10:22 ` [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
  2016-11-11 14:32   ` Tommaso Cucinotta
@ 2016-11-11 22:16   ` Rafael J. Wysocki
  2016-11-12  1:31     ` Saravana Kannan
                       ` (2 more replies)
  1 sibling, 3 replies; 28+ messages in thread
From: Rafael J. Wysocki @ 2016-11-11 22:16 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael Wysocki, Ingo Molnar, Peter Zijlstra, Lists linaro-kernel,
	Linux PM, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa, Steve Muckle

On Fri, Nov 11, 2016 at 11:22 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> If slow path frequency changes are conducted in a SCHED_OTHER context
> then they may be delayed for some amount of time, including
> indefinitely, when real time or deadline activity is taking place.
>
> Move the slow path to a real time kernel thread. In the future the
> thread should be made SCHED_DEADLINE. The RT priority is arbitrarily set
> to 50 for now.
>
> Hackbench results on ARM Exynos, dual core A15 platform for 10
> iterations:
>
> $ hackbench -s 100 -l 100 -g 10 -f 20
>
> Before                  After
> ---------------------------------
> 1.808                   1.603
> 1.847                   1.251
> 2.229                   1.590
> 1.952                   1.600
> 1.947                   1.257
> 1.925                   1.627
> 2.694                   1.620
> 1.258                   1.621
> 1.919                   1.632
> 1.250                   1.240
>
> Average:
>
> 1.8829                  1.5041
>
> Based on initial work by Steve Muckle.
>
> Signed-off-by: Steve Muckle <smuckle.linux@gmail.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  kernel/sched/cpufreq_schedutil.c | 62 ++++++++++++++++++++++++++++++++--------
>  1 file changed, 50 insertions(+), 12 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index ccb2ab89affb..045ce0a4e6d1 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -12,6 +12,7 @@
>  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
>  #include <linux/cpufreq.h>
> +#include <linux/kthread.h>
>  #include <linux/slab.h>
>  #include <trace/events/power.h>
>
> @@ -35,8 +36,10 @@ struct sugov_policy {
>
>         /* The next fields are only needed if fast switch cannot be used. */
>         struct irq_work irq_work;
> -       struct work_struct work;
> +       struct kthread_work work;
>         struct mutex work_lock;
> +       struct kthread_worker worker;
> +       struct task_struct *thread;
>         bool work_in_progress;
>
>         bool need_freq_update;
> @@ -291,9 +294,10 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
>         raw_spin_unlock(&sg_policy->update_lock);
>  }
>
> -static void sugov_work(struct work_struct *work)
> +static void sugov_work(struct kthread_work *work)
>  {
> -       struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> +       struct sugov_policy *sg_policy =
> +               container_of(work, struct sugov_policy, work);

Why this change?

>
>         mutex_lock(&sg_policy->work_lock);
>         __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> @@ -308,7 +312,7 @@ static void sugov_irq_work(struct irq_work *irq_work)
>         struct sugov_policy *sg_policy;
>
>         sg_policy = container_of(irq_work, struct sugov_policy, irq_work);
> -       schedule_work_on(smp_processor_id(), &sg_policy->work);
> +       kthread_queue_work(&sg_policy->worker, &sg_policy->work);
>  }
>
>  /************************** sysfs interface ************************/
> @@ -362,9 +366,23 @@ static struct kobj_type sugov_tunables_ktype = {
>
>  static struct cpufreq_governor schedutil_gov;
>
> +static void sugov_policy_free(struct sugov_policy *sg_policy)
> +{
> +       if (!sg_policy->policy->fast_switch_enabled) {
> +               kthread_flush_worker(&sg_policy->worker);
> +               kthread_stop(sg_policy->thread);
> +       }
> +
> +       mutex_destroy(&sg_policy->work_lock);
> +       kfree(sg_policy);
> +}
> +
>  static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
>  {
>         struct sugov_policy *sg_policy;
> +       struct task_struct *thread;
> +       struct sched_param param = { .sched_priority = 50 };

I'd define a symbol for the 50.  It's just one extra line of code ...

Thanks,
Rafael


* Re: [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-11 21:52     ` Rafael J. Wysocki
@ 2016-11-11 22:55       ` Peter Zijlstra
  0 siblings, 0 replies; 28+ messages in thread
From: Peter Zijlstra @ 2016-11-11 22:55 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Rafael Wysocki, Ingo Molnar, Lists linaro-kernel,
	Linux PM, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On Fri, Nov 11, 2016 at 10:52:27PM +0100, Rafael J. Wysocki wrote:
> On Fri, Nov 11, 2016 at 3:19 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, Nov 11, 2016 at 03:52:21PM +0530, Viresh Kumar wrote:
> >> @@ -456,8 +460,6 @@ static int sugov_init(struct cpufreq_policy *policy)
> >>
> >>   out:
> >>       mutex_unlock(&global_tunables_lock);
> >> -
> >> -     cpufreq_enable_fast_switch(policy);
> >>       return 0;
> >>
> >>   fail:
> >> @@ -468,6 +470,10 @@ static int sugov_init(struct cpufreq_policy *policy)
> >>       mutex_unlock(&global_tunables_lock);
> >>
> >>       sugov_policy_free(sg_policy);
> >> +
> >> + disable_fast_switch:
> >> +     cpufreq_disable_fast_switch(policy);
> >> +
> >>       pr_err("initialization failed (error %d)\n", ret);
> >>       return ret;
> >>  }
> >
> > Argh, no indented labels please. Please fix the 3 that snuck in while
> > you're there.
> 
> Well, you didn't tell me you didn't like them. :-)
> 
> Anyway, I can fix this up easily enough.
> 
> Any other concerns regarding the patch?

No, looked fine I think, as did the others.


* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 22:16   ` Rafael J. Wysocki
@ 2016-11-12  1:31     ` Saravana Kannan
  2016-11-12  5:27       ` Viresh Kumar
  2016-11-13 14:37       ` Rafael J. Wysocki
  2016-11-12  5:24     ` Viresh Kumar
  2016-11-13 19:31     ` Steve Muckle
  2 siblings, 2 replies; 28+ messages in thread
From: Saravana Kannan @ 2016-11-12  1:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Rafael Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot, Juri Lelli, Robin Randhawa, Steve Muckle

On 11/11/2016 02:16 PM, Rafael J. Wysocki wrote:
> On Fri, Nov 11, 2016 at 11:22 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>> If slow path frequency changes are conducted in a SCHED_OTHER context
>> then they may be delayed for some amount of time, including
>> indefinitely, when real time or deadline activity is taking place.
>>
>> Move the slow path to a real time kernel thread. In the future the
>> thread should be made SCHED_DEADLINE. The RT priority is arbitrarily set
>> to 50 for now.
>>
>> Hackbench results on ARM Exynos, dual core A15 platform for 10
>> iterations:
>>
>> $ hackbench -s 100 -l 100 -g 10 -f 20
>>
>> Before                  After
>> ---------------------------------
>> 1.808                   1.603
>> 1.847                   1.251
>> 2.229                   1.590
>> 1.952                   1.600
>> 1.947                   1.257
>> 1.925                   1.627
>> 2.694                   1.620
>> 1.258                   1.621
>> 1.919                   1.632
>> 1.250                   1.240
>>
>> Average:
>>
>> 1.8829                  1.5041
>>
>> Based on initial work by Steve Muckle.
>>
>> Signed-off-by: Steve Muckle <smuckle.linux@gmail.com>
>> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
>> ---
>>   kernel/sched/cpufreq_schedutil.c | 62 ++++++++++++++++++++++++++++++++--------
>>   1 file changed, 50 insertions(+), 12 deletions(-)
>>
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index ccb2ab89affb..045ce0a4e6d1 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -12,6 +12,7 @@
>>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>
>>   #include <linux/cpufreq.h>
>> +#include <linux/kthread.h>
>>   #include <linux/slab.h>
>>   #include <trace/events/power.h>
>>
>> @@ -35,8 +36,10 @@ struct sugov_policy {
>>
>>          /* The next fields are only needed if fast switch cannot be used. */
>>          struct irq_work irq_work;
>> -       struct work_struct work;
>> +       struct kthread_work work;
>>          struct mutex work_lock;
>> +       struct kthread_worker worker;
>> +       struct task_struct *thread;
>>          bool work_in_progress;
>>
>>          bool need_freq_update;
>> @@ -291,9 +294,10 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
>>          raw_spin_unlock(&sg_policy->update_lock);
>>   }
>>
>> -static void sugov_work(struct work_struct *work)
>> +static void sugov_work(struct kthread_work *work)
>>   {
>> -       struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
>> +       struct sugov_policy *sg_policy =
>> +               container_of(work, struct sugov_policy, work);
>
> Why this change?
>
>>
>>          mutex_lock(&sg_policy->work_lock);
>>          __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
>> @@ -308,7 +312,7 @@ static void sugov_irq_work(struct irq_work *irq_work)
>>          struct sugov_policy *sg_policy;
>>
>>          sg_policy = container_of(irq_work, struct sugov_policy, irq_work);
>> -       schedule_work_on(smp_processor_id(), &sg_policy->work);
>> +       kthread_queue_work(&sg_policy->worker, &sg_policy->work);
>>   }
>>
>>   /************************** sysfs interface ************************/
>> @@ -362,9 +366,23 @@ static struct kobj_type sugov_tunables_ktype = {
>>
>>   static struct cpufreq_governor schedutil_gov;
>>
>> +static void sugov_policy_free(struct sugov_policy *sg_policy)
>> +{
>> +       if (!sg_policy->policy->fast_switch_enabled) {
>> +               kthread_flush_worker(&sg_policy->worker);
>> +               kthread_stop(sg_policy->thread);
>> +       }
>> +
>> +       mutex_destroy(&sg_policy->work_lock);
>> +       kfree(sg_policy);
>> +}
>> +
>>   static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
>>   {
>>          struct sugov_policy *sg_policy;
>> +       struct task_struct *thread;
>> +       struct sched_param param = { .sched_priority = 50 };
>
> I'd define a symbol for the 50.  It's just one extra line of code ...
>

Hold on a sec. I thought during LPC someone (Peter?) made a point that 
when an RT thread runs, we should bump the frequency to max? So, schedutil
is going to trigger schedutil to bump up the frequency to max, right?

-Saravana


-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-11 21:58   ` Rafael J. Wysocki
@ 2016-11-12  5:19     ` Viresh Kumar
  2016-11-13 14:46       ` Rafael J. Wysocki
  0 siblings, 1 reply; 28+ messages in thread
From: Viresh Kumar @ 2016-11-12  5:19 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Ingo Molnar, Peter Zijlstra, Lists linaro-kernel,
	Linux PM, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On 12 November 2016 at 03:28, Rafael J. Wysocki <rafael@kernel.org> wrote:

>> @@ -478,8 +484,6 @@ static void sugov_exit(struct cpufreq_policy *policy)
>>         struct sugov_tunables *tunables = sg_policy->tunables;
>>         unsigned int count;
>>
>> -       cpufreq_disable_fast_switch(policy);
>> -
>
> ->but why is this change necessary?
>
> sugov_stop() has been called already, so the ordering here shouldn't matter.

Because sugov_policy_free() would be using the flag fast_switch_enabled.

--
viresh

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 14:32   ` Tommaso Cucinotta
  2016-11-11 14:39     ` Peter Zijlstra
@ 2016-11-12  5:21     ` Viresh Kumar
  1 sibling, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-12  5:21 UTC (permalink / raw)
  To: Tommaso Cucinotta
  Cc: Rafael Wysocki, Ingo Molnar, Peter Zijlstra, Lists linaro-kernel,
	linux-pm, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa, Steve Muckle

On 11 November 2016 at 20:02, Tommaso Cucinotta
<tommaso.cucinotta@sssup.it> wrote:

> would you have an insight, as to what runtime/deadline/period to set, and/or
> what specific timing constraints you'd like to set, just for this cpufreq
> slowpath work?

I don't have any such figures for now, at least :(

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 14:39     ` Peter Zijlstra
@ 2016-11-12  5:22       ` Viresh Kumar
  2016-11-14  9:22         ` Peter Zijlstra
  0 siblings, 1 reply; 28+ messages in thread
From: Viresh Kumar @ 2016-11-12  5:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tommaso Cucinotta, Rafael Wysocki, Ingo Molnar,
	Lists linaro-kernel, linux-pm, Linux Kernel Mailing List,
	Vincent Guittot, Juri Lelli, Robin Randhawa, Steve Muckle

On 11 November 2016 at 20:09, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Nov 11, 2016 at 03:32:04PM +0100, Tommaso Cucinotta wrote:
>> >+    struct sched_param param = { .sched_priority = 50 };
>>
>> won't you have a tunable here? (sysctl?)
>
> You can use the regular userspace tools, like schedtool and chrt to set
> priorities.

I wanted to get some help from you on this Peter. The out of tree Interactive
governor has always used MAX_RT_PRIORITY - 1 here instead of 50.

But Steve started with 50. What do you think should the value be ?

--
viresh

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 22:16   ` Rafael J. Wysocki
  2016-11-12  1:31     ` Saravana Kannan
@ 2016-11-12  5:24     ` Viresh Kumar
  2016-11-13 19:31     ` Steve Muckle
  2 siblings, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-12  5:24 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Ingo Molnar, Peter Zijlstra, Lists linaro-kernel,
	Linux PM, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa, Steve Muckle

On 12 November 2016 at 03:46, Rafael J. Wysocki <rafael@kernel.org> wrote:

>> +static void sugov_work(struct kthread_work *work)
>>  {
>> -       struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
>> +       struct sugov_policy *sg_policy =
>> +               container_of(work, struct sugov_policy, work);
>
> Why this change?

Mistake ..

>>  static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy *policy)
>>  {
>>         struct sugov_policy *sg_policy;
>> +       struct task_struct *thread;
>> +       struct sched_param param = { .sched_priority = 50 };
>
> I'd define a symbol for the 50.  It's just one extra line of code ...

Sure.

As I asked in the cover letter, will you be fine if I send the same patch
for ondemand/conservative governors ?

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-12  1:31     ` Saravana Kannan
@ 2016-11-12  5:27       ` Viresh Kumar
  2016-11-14  5:37         ` Viresh Kumar
  2016-11-13 14:37       ` Rafael J. Wysocki
  1 sibling, 1 reply; 28+ messages in thread
From: Viresh Kumar @ 2016-11-12  5:27 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Rafael Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot, Juri Lelli, Robin Randhawa, Steve Muckle

On 12 November 2016 at 07:01, Saravana Kannan <skannan@codeaurora.org> wrote:
> Hold on a sec. I thought during LPC someone (Peter?) made a point that when
> RT thread run, we should bump the frequency to max?

I wasn't there but AFAIU, this is the case we have currently for the schedutil
governor. And we (mobile world, Linaro) want to change that, as it doesn't work
that well for us. So perhaps it is just the opposite of what you stated.

> So, schedutil is going
> to trigger schedutil to bump up the frequency to max, right?

How is that question related to this patch ?

--
viresh

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-12  1:31     ` Saravana Kannan
  2016-11-12  5:27       ` Viresh Kumar
@ 2016-11-13 14:37       ` Rafael J. Wysocki
  2016-11-13 19:47         ` Steve Muckle
  1 sibling, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2016-11-13 14:37 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Viresh Kumar, Rafael Wysocki, Ingo Molnar,
	Peter Zijlstra, Lists linaro-kernel, Linux PM,
	Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa, Steve Muckle

On Sat, Nov 12, 2016 at 2:31 AM, Saravana Kannan <skannan@codeaurora.org> wrote:
> On 11/11/2016 02:16 PM, Rafael J. Wysocki wrote:
>>
>> On Fri, Nov 11, 2016 at 11:22 AM, Viresh Kumar <viresh.kumar@linaro.org>
>> wrote:
>>>
>>> If slow path frequency changes are conducted in a SCHED_OTHER context
>>> then they may be delayed for some amount of time, including
>>> indefinitely, when real time or deadline activity is taking place.
>>>
>>> Move the slow path to a real time kernel thread. In the future the
>>> thread should be made SCHED_DEADLINE. The RT priority is arbitrarily set
>>> to 50 for now.
>>>
>>> Hackbench results on ARM Exynos, dual core A15 platform for 10
>>> iterations:
>>>
>>> $ hackbench -s 100 -l 100 -g 10 -f 20
>>>
>>> Before                  After
>>> ---------------------------------
>>> 1.808                   1.603
>>> 1.847                   1.251
>>> 2.229                   1.590
>>> 1.952                   1.600
>>> 1.947                   1.257
>>> 1.925                   1.627
>>> 2.694                   1.620
>>> 1.258                   1.621
>>> 1.919                   1.632
>>> 1.250                   1.240
>>>
>>> Average:
>>>
>>> 1.8829                  1.5041
>>>
>>> Based on initial work by Steve Muckle.
>>>
>>> Signed-off-by: Steve Muckle <smuckle.linux@gmail.com>
>>> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
>>> ---
>>>   kernel/sched/cpufreq_schedutil.c | 62
>>> ++++++++++++++++++++++++++++++++--------
>>>   1 file changed, 50 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/kernel/sched/cpufreq_schedutil.c
>>> b/kernel/sched/cpufreq_schedutil.c
>>> index ccb2ab89affb..045ce0a4e6d1 100644
>>> --- a/kernel/sched/cpufreq_schedutil.c
>>> +++ b/kernel/sched/cpufreq_schedutil.c
>>> @@ -12,6 +12,7 @@
>>>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>>
>>>   #include <linux/cpufreq.h>
>>> +#include <linux/kthread.h>
>>>   #include <linux/slab.h>
>>>   #include <trace/events/power.h>
>>>
>>> @@ -35,8 +36,10 @@ struct sugov_policy {
>>>
>>>          /* The next fields are only needed if fast switch cannot be
>>> used. */
>>>          struct irq_work irq_work;
>>> -       struct work_struct work;
>>> +       struct kthread_work work;
>>>          struct mutex work_lock;
>>> +       struct kthread_worker worker;
>>> +       struct task_struct *thread;
>>>          bool work_in_progress;
>>>
>>>          bool need_freq_update;
>>> @@ -291,9 +294,10 @@ static void sugov_update_shared(struct
>>> update_util_data *hook, u64 time,
>>>          raw_spin_unlock(&sg_policy->update_lock);
>>>   }
>>>
>>> -static void sugov_work(struct work_struct *work)
>>> +static void sugov_work(struct kthread_work *work)
>>>   {
>>> -       struct sugov_policy *sg_policy = container_of(work, struct
>>> sugov_policy, work);
>>> +       struct sugov_policy *sg_policy =
>>> +               container_of(work, struct sugov_policy, work);
>>
>>
>> Why this change?
>>
>>>
>>>          mutex_lock(&sg_policy->work_lock);
>>>          __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
>>> @@ -308,7 +312,7 @@ static void sugov_irq_work(struct irq_work *irq_work)
>>>          struct sugov_policy *sg_policy;
>>>
>>>          sg_policy = container_of(irq_work, struct sugov_policy,
>>> irq_work);
>>> -       schedule_work_on(smp_processor_id(), &sg_policy->work);
>>> +       kthread_queue_work(&sg_policy->worker, &sg_policy->work);
>>>   }
>>>
>>>   /************************** sysfs interface ************************/
>>> @@ -362,9 +366,23 @@ static struct kobj_type sugov_tunables_ktype = {
>>>
>>>   static struct cpufreq_governor schedutil_gov;
>>>
>>> +static void sugov_policy_free(struct sugov_policy *sg_policy)
>>> +{
>>> +       if (!sg_policy->policy->fast_switch_enabled) {
>>> +               kthread_flush_worker(&sg_policy->worker);
>>> +               kthread_stop(sg_policy->thread);
>>> +       }
>>> +
>>> +       mutex_destroy(&sg_policy->work_lock);
>>> +       kfree(sg_policy);
>>> +}
>>> +
>>>   static struct sugov_policy *sugov_policy_alloc(struct cpufreq_policy
>>> *policy)
>>>   {
>>>          struct sugov_policy *sg_policy;
>>> +       struct task_struct *thread;
>>> +       struct sched_param param = { .sched_priority = 50 };
>>
>>
>> I'd define a symbol for the 50.  It's just one extra line of code ...
>>
>
> Hold on a sec. I thought during LPC someone (Peter?) made a point that when
> RT thread run, we should bump the frequency to max? So, schedutil is going
> to trigger schedutil to bump up the frequency to max, right?

No, it isn't, or at least that is unlikely.

sugov_update_commit() sets sg_policy->work_in_progress before queuing
the IRQ work and it is not cleared until the frequency changes in
sugov_work().

OTOH, sugov_should_update_freq() checks sg_policy->work_in_progress
upfront and returns false when it is set, so the governor won't see
its own worker threads run, unless I'm overlooking something highly
non-obvious.

Thanks,
Rafael

* Re: [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-12  5:19     ` Viresh Kumar
@ 2016-11-13 14:46       ` Rafael J. Wysocki
  2016-11-14  4:06         ` Viresh Kumar
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2016-11-13 14:46 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Rafael J. Wysocki, Rafael Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot, Juri Lelli, Robin Randhawa

On Sat, Nov 12, 2016 at 6:19 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 12 November 2016 at 03:28, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
>>> @@ -478,8 +484,6 @@ static void sugov_exit(struct cpufreq_policy *policy)
>>>         struct sugov_tunables *tunables = sg_policy->tunables;
>>>         unsigned int count;
>>>
>>> -       cpufreq_disable_fast_switch(policy);
>>> -
>>
>> ->but why is this change necessary?
>>
>> sugov_stop() has been called already, so the ordering here shouldn't matter.
>
> Because sugov_policy_free() would be using the flag fast_switch_enabled.

That's only going to happen in the next patch, though, right?  It
wouldn't hurt to write that in the changelog too.

Besides, I'm not actually sure if starting/stopping the kthread in
sugov_policy_alloc/free() is a good idea.  It sort of conflates the
allocation of memory with kthread creation.  Any chance to untangle
that?

Thanks,
Rafael

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-11 22:16   ` Rafael J. Wysocki
  2016-11-12  1:31     ` Saravana Kannan
  2016-11-12  5:24     ` Viresh Kumar
@ 2016-11-13 19:31     ` Steve Muckle
  2 siblings, 0 replies; 28+ messages in thread
From: Steve Muckle @ 2016-11-13 19:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Viresh Kumar, Rafael Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot, Juri Lelli, Robin Randhawa

On Fri, Nov 11, 2016 at 11:16:59PM +0100, Rafael J. Wysocki wrote:
> > +       struct sched_param param = { .sched_priority = 50 };
> 
> I'd define a symbol for the 50.  It's just one extra line of code ...

A minor point for sure, but in general what's the motivation for
defining symbols for things which are only used once? It makes it harder
to untangle and learn a piece of code IMO, having to jump to the
definitions of them to see what they are.

thanks,
Steve

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-13 14:37       ` Rafael J. Wysocki
@ 2016-11-13 19:47         ` Steve Muckle
  2016-11-13 22:44           ` Rafael J. Wysocki
  0 siblings, 1 reply; 28+ messages in thread
From: Steve Muckle @ 2016-11-13 19:47 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Saravana Kannan, Viresh Kumar, Rafael Wysocki, Ingo Molnar,
	Peter Zijlstra, Lists linaro-kernel, Linux PM,
	Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On Sun, Nov 13, 2016 at 03:37:18PM +0100, Rafael J. Wysocki wrote:
> > Hold on a sec. I thought during LPC someone (Peter?) made a point that when
> > RT thread run, we should bump the frequency to max? So, schedutil is going
> > to trigger schedutil to bump up the frequency to max, right?
> 
> No, it isn't, or at least that is unlikely.
> 
> sugov_update_commit() sets sg_policy->work_in_progress before queuing
> the IRQ work and it is not cleared until the frequency changes in
> sugov_work().
> 
> OTOH, sugov_should_update_freq() checks sg_policy->work_in_progress
> upfront and returns false when it is set, so the governor won't see
> its own worker threads run, unless I'm overlooking something highly
> non-obvious.

FWIW my intention with the original version of this patch (which I
neglected to communicate to Viresh) was that it would depend on changing
the frequency policy for RT. I had been using rt_avg. It sounds like
during LPC there were talks of using another metric.

It does appear things would work okay without that but it also seems
a bit fragile. There's the window between when the work_in_progress
gets cleared and the RT kthread yields. I have not thought through the
various scenarios there, what is possible and tested to see if it is
significant enough to impact power-sensitive platforms.

thanks,
Steve

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-13 19:47         ` Steve Muckle
@ 2016-11-13 22:44           ` Rafael J. Wysocki
  2016-11-14  6:36             ` Viresh Kumar
  0 siblings, 1 reply; 28+ messages in thread
From: Rafael J. Wysocki @ 2016-11-13 22:44 UTC (permalink / raw)
  To: Steve Muckle
  Cc: Rafael J. Wysocki, Saravana Kannan, Viresh Kumar, Rafael Wysocki,
	Ingo Molnar, Peter Zijlstra, Lists linaro-kernel, Linux PM,
	Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On Sun, Nov 13, 2016 at 8:47 PM, Steve Muckle <smuckle.linux@gmail.com> wrote:
> On Sun, Nov 13, 2016 at 03:37:18PM +0100, Rafael J. Wysocki wrote:
>> > Hold on a sec. I thought during LPC someone (Peter?) made a point that when
>> > RT thread run, we should bump the frequency to max? So, schedutil is going
>> > to trigger schedutil to bump up the frequency to max, right?
>>
>> No, it isn't, or at least that is unlikely.
>>
>> sugov_update_commit() sets sg_policy->work_in_progress before queuing
>> the IRQ work and it is not cleared until the frequency changes in
>> sugov_work().
>>
>> OTOH, sugov_should_update_freq() checks sg_policy->work_in_progress
>> upfront and returns false when it is set, so the governor won't see
>> its own worker threads run, unless I'm overlooking something highly
>> non-obvious.
>
> FWIW my intention with the original version of this patch (which I
> neglected to communicate to Viresh) was that it would depend on changing
> the frequency policy for RT. I had been using rt_avg. It sounds like
> during LPC there were talks of using another metric.
>
> It does appear things would work okay without that but it also seems
> a bit fragile.

Yes, it does.

To a minimum, there should be a comment regarding that in the patches.

> There's the window between when the work_in_progress
> gets cleared and the RT kthread yields. I have not thought through the
> various scenarios there, what is possible and tested to see if it is
> significant enough to impact power-sensitive platforms.

Well, me neither, to be entirely honest. :-)

That said, there is a limited number of call sites for
update_curr_rt(), where SCHED_CPUFREQ_RT is passed to cpufreq
governors: dequeue_task_rt(), put_prev_task_rt(), pick_next_task_rt(),
and task_tick_rt().  I'm not sure how pick_next_task_rt() can be
relevant here at all, though, and task_tick_rt() would need to be
running exactly during the window mentioned above, so it is probably
negligible as well, at least on average.

From the quick look at the scheduler core, put_prev_task() is mostly
called for running tasks, so that case doesn't look like something to
worry about either, although it would need to be looked through in
detail.  The dequeue part I'm totally unsure about.

In any case, the clearing of work_in_progress might still be deferred
by queuing a regular (non-RT) work item to do that from the kthread
work (that will guarantee "hiding" the kthread work from the
governor), but admittedly that would be a sledgehammer of sorts (and
it might defeat the purpose of the whole exercise) ...

Thanks,
Rafael

* Re: [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-13 14:46       ` Rafael J. Wysocki
@ 2016-11-14  4:06         ` Viresh Kumar
  2016-11-14 11:30           ` Viresh Kumar
  0 siblings, 1 reply; 28+ messages in thread
From: Viresh Kumar @ 2016-11-14  4:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Ingo Molnar, Peter Zijlstra, Lists linaro-kernel,
	Linux PM, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On 13-11-16, 15:46, Rafael J. Wysocki wrote:
> That's only going to happen in the next patch, though, right?  It
> wouldn't hurt to write that in the changelog too.

Sure.

> Besides, I'm not actually sure if starting/stopping the kthread in
> sugov_policy_alloc/free() is a good idea.  It sort of conflates the
> allocation of memory with kthread creation.  Any chance to untangle
> that?

Hmm, so either I can create two new routines for the thread and call
them along with alloc/free. Or I can rename the alloc/free routines
and keep this patch as is.

-- 
viresh

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-12  5:27       ` Viresh Kumar
@ 2016-11-14  5:37         ` Viresh Kumar
  0 siblings, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-14  5:37 UTC (permalink / raw)
  To: Saravana Kannan
  Cc: Rafael J. Wysocki, Rafael Wysocki, Ingo Molnar, Peter Zijlstra,
	Lists linaro-kernel, Linux PM, Linux Kernel Mailing List,
	Vincent Guittot, Juri Lelli, Robin Randhawa, Steve Muckle

On 12-11-16, 10:57, Viresh Kumar wrote:
> On 12 November 2016 at 07:01, Saravana Kannan <skannan@codeaurora.org> wrote:
> > Hold on a sec. I thought during LPC someone (Peter?) made a point that when
> > RT thread run, we should bump the frequency to max?
> 
> I wasn't there but AFAIU, this is the case we have currently for the schedutil
> governor. And we (mobile world, Linaro) want to change that it doesn't work
> that well for us. So perhaps it is just the opposite of what you stated.
> 
> > So, schedutil is going
> > to trigger schedutil to bump up the frequency to max, right?
> 
> How is that question related to this patch ?

Trash my last email, I failed to read yours :(

-- 
viresh

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-13 22:44           ` Rafael J. Wysocki
@ 2016-11-14  6:36             ` Viresh Kumar
  0 siblings, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-14  6:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Steve Muckle, Saravana Kannan, Rafael Wysocki, Ingo Molnar,
	Peter Zijlstra, Lists linaro-kernel, Linux PM,
	Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On 13-11-16, 23:44, Rafael J. Wysocki wrote:
> To a minimum, there should be a comment regarding that in the patches.

I wanted to get the comment written properly before sending it in the patch.
Can you please rectify this based on what you are looking for?

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index c6bc60078e21..e43f4fd42fb4 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -313,6 +313,20 @@ static void sugov_irq_work(struct irq_work *irq_work)
        struct sugov_policy *sg_policy;
 
        sg_policy = container_of(irq_work, struct sugov_policy, irq_work);
+
+       /*
+        * For Real Time and Deadline tasks, schedutil governor shoots the
+        * frequency to maximum. And special care must be taken to ensure that
+        * this kthread doesn't result in that.
+        *
+        * This is (mostly) guaranteed by the work_in_progress flag. The flag is
+        * updated only at the end of the sugov_work() and before that schedutil
+        * rejects all other frequency scaling requests.
+        *
+        * Though there is a very rare case where the RT thread yields right
+        * after the work_in_progress flag is cleared. The effects of that are
+        * neglected for now.
+        */
        kthread_queue_work(&sg_policy->worker, &sg_policy->work);
 }
 
> In any case, the clearing of work_in_progress might still be deferred
> by queuing a regular (non-RT) work item to do that from the kthread
> work (that will guarantee "hiding" the kthread work from the
> governor), but admittedly that would be a sledgehammer of sorts (and
> it might defeat the purpose of the whole exercise) ...

I agree.

-- 
viresh

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-12  5:22       ` Viresh Kumar
@ 2016-11-14  9:22         ` Peter Zijlstra
  2016-11-14 10:22           ` Viresh Kumar
  0 siblings, 1 reply; 28+ messages in thread
From: Peter Zijlstra @ 2016-11-14  9:22 UTC (permalink / raw)
  To: Viresh Kumar
  Cc: Tommaso Cucinotta, Rafael Wysocki, Ingo Molnar,
	Lists linaro-kernel, linux-pm, Linux Kernel Mailing List,
	Vincent Guittot, Juri Lelli, Robin Randhawa, Steve Muckle

On Sat, Nov 12, 2016 at 10:52:35AM +0530, Viresh Kumar wrote:
> On 11 November 2016 at 20:09, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, Nov 11, 2016 at 03:32:04PM +0100, Tommaso Cucinotta wrote:
> >> >+    struct sched_param param = { .sched_priority = 50 };
> >>
> >> won't you have a tunable here? (sysctl?)
> >
> > You can use the regular userspace tools, like schedtool and chrt to set
> > priorities.
> 
> I wanted to get some help from you on this Peter. The out of tree Interactive
> governor has always used MAX_RT_PRIORITY - 1 here instead of 50.
> 
> But Steve started with 50. What do you think should the value be ?

Any static prio value is wrong (static prio assignment requires system
knowledge that the kernel doesn't and cannot have), 50 is what threaded
IRQs default to as well IIRC, so it would at least be consistent with
that.

* Re: [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
  2016-11-14  9:22         ` Peter Zijlstra
@ 2016-11-14 10:22           ` Viresh Kumar
  0 siblings, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-14 10:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Tommaso Cucinotta, Rafael Wysocki, Ingo Molnar,
	Lists linaro-kernel, linux-pm, Linux Kernel Mailing List,
	Vincent Guittot, Juri Lelli, Robin Randhawa, Steve Muckle

On 14-11-16, 10:22, Peter Zijlstra wrote:
> Any static prio value is wrong (static prio assignment requires system
> knowledge that the kernel doesn't and cannot have), 50 is what threaded
> IRQs default too as well IIRC, so it would at least be consistent with
> that.

Yes, you are correct, and I have found a better way of defining the
priority in this case than the magic number 50:

MAX_USER_RT_PRIO/2 :)

-- 
viresh

* Re: [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier
  2016-11-14  4:06         ` Viresh Kumar
@ 2016-11-14 11:30           ` Viresh Kumar
  0 siblings, 0 replies; 28+ messages in thread
From: Viresh Kumar @ 2016-11-14 11:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael Wysocki, Ingo Molnar, Peter Zijlstra, Lists linaro-kernel,
	Linux PM, Linux Kernel Mailing List, Vincent Guittot, Juri Lelli,
	Robin Randhawa

On 14-11-16, 09:36, Viresh Kumar wrote:
> On 13-11-16, 15:46, Rafael J. Wysocki wrote:
> > That's only going to happen in the next patch, though, right?  It
> > wouldn't hurt to write that in the changelog too.
> 
> Sure.
> 
> > Besides, I'm not actually sure if starting/stopping the kthread in
> > sugov_policy_alloc/free() is a good idea.  It sort of conflates the
> > allocation of memory with kthread creation.  Any chance to untangle
> > that?
> 
> Hmm, so either I can create two new routines for the thread and call
> them along with alloc/free. Or I can rename the alloc/free routines
> and keep this patch as is.

I have created separate routines in my new version (which I will send
tomorrow).

-- 
viresh

end of thread, other threads:[~2016-11-14 11:37 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-11 10:22 [PATCH 0/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
2016-11-11 10:22 ` [PATCH 1/3] cpufreq: schedutil: enable fast switch earlier Viresh Kumar
2016-11-11 14:19   ` Peter Zijlstra
2016-11-11 21:52     ` Rafael J. Wysocki
2016-11-11 22:55       ` Peter Zijlstra
2016-11-11 21:58   ` Rafael J. Wysocki
2016-11-12  5:19     ` Viresh Kumar
2016-11-13 14:46       ` Rafael J. Wysocki
2016-11-14  4:06         ` Viresh Kumar
2016-11-14 11:30           ` Viresh Kumar
2016-11-11 10:22 ` [PATCH 2/3] cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task Viresh Kumar
2016-11-11 14:32   ` Tommaso Cucinotta
2016-11-11 14:39     ` Peter Zijlstra
2016-11-12  5:22       ` Viresh Kumar
2016-11-14  9:22         ` Peter Zijlstra
2016-11-14 10:22           ` Viresh Kumar
2016-11-12  5:21     ` Viresh Kumar
2016-11-11 22:16   ` Rafael J. Wysocki
2016-11-12  1:31     ` Saravana Kannan
2016-11-12  5:27       ` Viresh Kumar
2016-11-14  5:37         ` Viresh Kumar
2016-11-13 14:37       ` Rafael J. Wysocki
2016-11-13 19:47         ` Steve Muckle
2016-11-13 22:44           ` Rafael J. Wysocki
2016-11-14  6:36             ` Viresh Kumar
2016-11-12  5:24     ` Viresh Kumar
2016-11-13 19:31     ` Steve Muckle
2016-11-11 10:22 ` [PATCH 3/3] cpufreq: schedutil: irq-work is used only in slow path Viresh Kumar
