linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
@ 2019-12-20 16:48 Qais Yousef
  2020-01-07 13:42 ` Quentin Perret
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Qais Yousef @ 2019-12-20 16:48 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, Dietmar Eggemann, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin
  Cc: Juri Lelli, Vincent Guittot, Ben Segall, Mel Gorman,
	valentin.schneider, qperret, Patrick Bellasi, linux-kernel,
	linux-fsdevel, Qais Yousef

RT tasks by default try to run at the highest capacity/performance
level. When uclamp is selected, this default behavior is retained by
enforcing the uclamp_util_min of RT tasks to be
uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE, the maximum
value.

See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").

On battery powered devices, this default behavior could consume more
power, and it is desirable to be able to tune it down. While uclamp
allows tuning this by changing the uclamp_util_min of individual tasks,
this is cumbersome and error prone.

To let system admins and device integrators control the default
behavior globally, introduce the new sysctl_sched_rt_uclamp_util_min to
change the default uclamp_util_min value of RT tasks.

Whenever the new default changes, it is applied on the next wakeup of
the RT task, assuming that the task still uses the system default value
and not a user applied one.

If the uclamp_util_min of an RT task is 0, then the RT utilization of
the rq is used to drive the frequency selection in schedutil for RT
tasks.

Tested on Juno-r2 in combination with the RT capacity awareness patches.
By default an RT task will go to the highest capacity CPU and run at the
maximum frequency. With this patch the RT task can run anywhere and
doesn't cause the frequency to be maximum all the time.

Signed-off-by: Qais Yousef <qais.yousef@arm.com>
---
 include/linux/sched/sysctl.h |  1 +
 kernel/sched/core.c          | 54 ++++++++++++++++++++++++++++++++----
 kernel/sched/rt.c            |  6 ++++
 kernel/sched/sched.h         |  4 +++
 kernel/sysctl.c              |  7 +++++
 5 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index d4f6215ee03f..ec73d8db2092 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -59,6 +59,7 @@ extern int sysctl_sched_rt_runtime;
 #ifdef CONFIG_UCLAMP_TASK
 extern unsigned int sysctl_sched_uclamp_util_min;
 extern unsigned int sysctl_sched_uclamp_util_max;
+extern unsigned int sysctl_sched_rt_uclamp_util_min;
 #endif
 
 #ifdef CONFIG_CFS_BANDWIDTH
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 90e4b00ace89..a8ab0bb7a967 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -792,6 +792,23 @@ unsigned int sysctl_sched_uclamp_util_min = SCHED_CAPACITY_SCALE;
 /* Max allowed maximum utilization */
 unsigned int sysctl_sched_uclamp_util_max = SCHED_CAPACITY_SCALE;
 
+/*
+ * By default RT tasks run at the maximum performance point/capacity of the
+ * system. Uclamp enforces this by always setting UCLAMP_MIN of RT tasks to
+ * SCHED_CAPACITY_SCALE.
+ *
+ * This knob allows admins to change the default behavior when uclamp is being
+ * used. On battery powered devices in particular, running at the maximum
+ * capacity will increase energy consumption and shorten the battery life.
+ *
+ * This knob only affects the default value RT uses when a new RT task is
+ * forked or has just changed policy to RT and no uclamp user settings were
+ * applied (i.e. the task didn't modify the default value).
+ *
+ * This knob will not override the system default values defined above.
+ */
+unsigned int sysctl_sched_rt_uclamp_util_min = SCHED_CAPACITY_SCALE;
+
 /* All clamps are required to be less or equal than these values */
 static struct uclamp_se uclamp_default[UCLAMP_CNT];
 
@@ -919,6 +936,14 @@ uclamp_eff_get(struct task_struct *p, enum uclamp_id clamp_id)
 	return uc_req;
 }
 
+void uclamp_rt_sync_default_util_min(struct task_struct *p)
+{
+	struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN];
+
+	if (!uc_se->user_defined)
+		uclamp_se_set(uc_se, sysctl_sched_rt_uclamp_util_min, false);
+}
+
 unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id)
 {
 	struct uclamp_se uc_eff;
@@ -1116,12 +1141,13 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
 				loff_t *ppos)
 {
 	bool update_root_tg = false;
-	int old_min, old_max;
+	int old_min, old_max, old_rt_min;
 	int result;
 
 	mutex_lock(&uclamp_mutex);
 	old_min = sysctl_sched_uclamp_util_min;
 	old_max = sysctl_sched_uclamp_util_max;
+	old_rt_min = sysctl_sched_rt_uclamp_util_min;
 
 	result = proc_dointvec(table, write, buffer, lenp, ppos);
 	if (result)
@@ -1129,12 +1155,23 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
 	if (!write)
 		goto done;
 
+	/*
+	 * The new value will be applied to all RT tasks the next time they
+	 * wake up, assuming the task is using the system default and not a
+	 * user specified value. In the latter case we leave the value as the
+	 * user requested.
+	 */
 	if (sysctl_sched_uclamp_util_min > sysctl_sched_uclamp_util_max ||
 	    sysctl_sched_uclamp_util_max > SCHED_CAPACITY_SCALE) {
 		result = -EINVAL;
 		goto undo;
 	}
 
+	if (sysctl_sched_rt_uclamp_util_min > SCHED_CAPACITY_SCALE) {
+		result = -EINVAL;
+		goto undo;
+	}
+
 	if (old_min != sysctl_sched_uclamp_util_min) {
 		uclamp_se_set(&uclamp_default[UCLAMP_MIN],
 			      sysctl_sched_uclamp_util_min, false);
@@ -1160,6 +1197,7 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
 undo:
 	sysctl_sched_uclamp_util_min = old_min;
 	sysctl_sched_uclamp_util_max = old_max;
+	sysctl_sched_rt_uclamp_util_min = old_rt_min;
 done:
 	mutex_unlock(&uclamp_mutex);
 
@@ -1202,9 +1240,12 @@ static void __setscheduler_uclamp(struct task_struct *p,
 		if (uc_se->user_defined)
 			continue;
 
-		/* By default, RT tasks always get 100% boost */
+		/*
+		 * By default, RT tasks always get 100% boost, which admins are
+		 * allowed to change via the sysctl_sched_rt_uclamp_util_min knob.
+		 */
 		if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
-			clamp_value = uclamp_none(UCLAMP_MAX);
+			clamp_value = sysctl_sched_rt_uclamp_util_min;
 
 		uclamp_se_set(uc_se, clamp_value, false);
 	}
@@ -1236,9 +1277,12 @@ static void uclamp_fork(struct task_struct *p)
 	for_each_clamp_id(clamp_id) {
 		unsigned int clamp_value = uclamp_none(clamp_id);
 
-		/* By default, RT tasks always get 100% boost */
+		/*
+		 * By default, RT tasks always get 100% boost, which admins are
+		 * allowed to change via the sysctl_sched_rt_uclamp_util_min knob.
+		 */
 		if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
-			clamp_value = uclamp_none(UCLAMP_MAX);
+			clamp_value = sysctl_sched_rt_uclamp_util_min;
 
 		uclamp_se_set(&p->uclamp_req[clamp_id], clamp_value, false);
 	}
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index e591d40fd645..19572dfc175b 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
  */
 static void task_woken_rt(struct rq *rq, struct task_struct *p)
 {
+	/*
+	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
+	 * we apply any new value on the next wakeup, which is here.
+	 */
+	uclamp_rt_sync_default_util_min(p);
+
 	if (!task_running(rq, p) &&
 	    !test_tsk_need_resched(rq->curr) &&
 	    p->nr_cpus_allowed > 1 &&
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 280a3c735935..337bf17b1a9d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2300,6 +2300,8 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
 #endif /* CONFIG_CPU_FREQ */
 
 #ifdef CONFIG_UCLAMP_TASK
+void uclamp_rt_sync_default_util_min(struct task_struct *p);
+
 unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
 
 static __always_inline
@@ -2330,6 +2332,8 @@ static inline unsigned int uclamp_util(struct rq *rq, unsigned int util)
 	return uclamp_util_with(rq, util, NULL);
 }
 #else /* CONFIG_UCLAMP_TASK */
+void uclamp_rt_sync_default_util_min(struct task_struct *p) {}
+
 static inline unsigned int uclamp_util_with(struct rq *rq, unsigned int util,
 					    struct task_struct *p)
 {
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 70665934d53e..06183762daac 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -465,6 +465,13 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= sysctl_sched_uclamp_handler,
 	},
+	{
+		.procname	= "sched_rt_util_clamp_min",
+		.data		= &sysctl_sched_rt_uclamp_util_min,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= sysctl_sched_uclamp_handler,
+	},
 #endif
 #ifdef CONFIG_SCHED_AUTOGROUP
 	{
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2019-12-20 16:48 [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min Qais Yousef
@ 2020-01-07 13:42 ` Quentin Perret
  2020-01-07 19:30   ` Dietmar Eggemann
  2020-01-09 11:12   ` Qais Yousef
  2020-01-08 13:44 ` Peter Zijlstra
  2020-01-08 18:56 ` Patrick Bellasi
  2 siblings, 2 replies; 25+ messages in thread
From: Quentin Perret @ 2020-01-07 13:42 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Peter Zijlstra, Dietmar Eggemann, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, Juri Lelli,
	Vincent Guittot, Ben Segall, Mel Gorman, valentin.schneider,
	Patrick Bellasi, linux-kernel, linux-fsdevel, kernel-team

Hi Qais,

On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:
> +/*
> + * By default RT tasks run at the maximum performance point/capacity of the
> + * system. Uclamp enforces this by always setting UCLAMP_MIN of RT tasks to
> + * SCHED_CAPACITY_SCALE.
> + *
> + * This knob allows admins to change the default behavior when uclamp is being
> + * used. On battery powered devices in particular, running at the maximum
> + * capacity will increase energy consumption and shorten the battery life.
> + *
> + * This knob only affects the default value RT uses when a new RT task is
> + * forked or has just changed policy to RT and no uclamp user settings were
> + * applied (i.e. the task didn't modify the default value).
> + *
> + * This knob will not override the system default values defined above.
> + */

I suppose this comment could go in the sysctl doc file instead ?

> +unsigned int sysctl_sched_rt_uclamp_util_min = SCHED_CAPACITY_SCALE;

I would suggest renaming the knob as 'sysctl_sched_rt_default_uclamp_min'
or something along those lines to make it clear it's a default value.

And for consistency with the existing code, perhaps set the default to
uclamp_none(UCLAMP_MAX) instead of an explicit SCHED_CAPACITY_SCALE?

>  /* All clamps are required to be less or equal than these values */
>  static struct uclamp_se uclamp_default[UCLAMP_CNT];
>  
> @@ -919,6 +936,14 @@ uclamp_eff_get(struct task_struct *p, enum uclamp_id clamp_id)
>  	return uc_req;
>  }
>  
> +void uclamp_rt_sync_default_util_min(struct task_struct *p)
> +{
> +	struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN];
> +
> +	if (!uc_se->user_defined)
> +		uclamp_se_set(uc_se, sysctl_sched_rt_uclamp_util_min, false);
> +}
> +
>  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id)
>  {
>  	struct uclamp_se uc_eff;
> @@ -1116,12 +1141,13 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  				loff_t *ppos)
>  {
>  	bool update_root_tg = false;
> -	int old_min, old_max;
> +	int old_min, old_max, old_rt_min;
>  	int result;
>  
>  	mutex_lock(&uclamp_mutex);
>  	old_min = sysctl_sched_uclamp_util_min;
>  	old_max = sysctl_sched_uclamp_util_max;
> +	old_rt_min = sysctl_sched_rt_uclamp_util_min;
>  
>  	result = proc_dointvec(table, write, buffer, lenp, ppos);
>  	if (result)
> @@ -1129,12 +1155,23 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  	if (!write)
>  		goto done;
>  
> +	/*
> +	 * The new value will be applied to all RT tasks the next time they
> +	 * wakeup, assuming the task is using the system default and not a user
> +	 * specified value. In the latter we shall leave the value as the user
> +	 * requested.
> +	 */
>  	if (sysctl_sched_uclamp_util_min > sysctl_sched_uclamp_util_max ||
>  	    sysctl_sched_uclamp_util_max > SCHED_CAPACITY_SCALE) {
>  		result = -EINVAL;
>  		goto undo;
>  	}
>  
> +	if (sysctl_sched_rt_uclamp_util_min > SCHED_CAPACITY_SCALE) {
> +		result = -EINVAL;
> +		goto undo;
> +	}
> +
>  	if (old_min != sysctl_sched_uclamp_util_min) {
>  		uclamp_se_set(&uclamp_default[UCLAMP_MIN],
>  			      sysctl_sched_uclamp_util_min, false);
> @@ -1160,6 +1197,7 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  undo:
>  	sysctl_sched_uclamp_util_min = old_min;
>  	sysctl_sched_uclamp_util_max = old_max;
> +	sysctl_sched_rt_uclamp_util_min = old_rt_min;
>  done:
>  	mutex_unlock(&uclamp_mutex);
>  
> @@ -1202,9 +1240,12 @@ static void __setscheduler_uclamp(struct task_struct *p,
>  		if (uc_se->user_defined)
>  			continue;
>  
> -		/* By default, RT tasks always get 100% boost */
> +		/*
> +		 * By default, RT tasks always get 100% boost, which admins are
> +		 * allowed to change via the sysctl_sched_rt_uclamp_util_min knob.
> +		 */
>  		if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> -			clamp_value = uclamp_none(UCLAMP_MAX);
> +			clamp_value = sysctl_sched_rt_uclamp_util_min;
>  
>  		uclamp_se_set(uc_se, clamp_value, false);
>  	}
> @@ -1236,9 +1277,12 @@ static void uclamp_fork(struct task_struct *p)
>  	for_each_clamp_id(clamp_id) {
>  		unsigned int clamp_value = uclamp_none(clamp_id);
>  
> -		/* By default, RT tasks always get 100% boost */
> +		/*
> +		 * By default, RT tasks always get 100% boost, which admins are
> +		 * allowed to change via the sysctl_sched_rt_uclamp_util_min knob.
> +		 */
>  		if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> -			clamp_value = uclamp_none(UCLAMP_MAX);
> +			clamp_value = sysctl_sched_rt_uclamp_util_min;
>  
>  		uclamp_se_set(&p->uclamp_req[clamp_id], clamp_value, false);
>  	}
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index e591d40fd645..19572dfc175b 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
>   */
>  static void task_woken_rt(struct rq *rq, struct task_struct *p)
>  {
> +	/*
> +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> +	 * we apply any new value on the next wakeup, which is here.
> +	 */
> +	uclamp_rt_sync_default_util_min(p);

The task has already been enqueued and sugov has been called by then I
think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?

> +
>  	if (!task_running(rq, p) &&
>  	    !test_tsk_need_resched(rq->curr) &&
>  	    p->nr_cpus_allowed > 1 &&
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 280a3c735935..337bf17b1a9d 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2300,6 +2300,8 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
>  #endif /* CONFIG_CPU_FREQ */
>  
>  #ifdef CONFIG_UCLAMP_TASK
> +void uclamp_rt_sync_default_util_min(struct task_struct *p);
> +
>  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
>  
>  static __always_inline
> @@ -2330,6 +2332,8 @@ static inline unsigned int uclamp_util(struct rq *rq, unsigned int util)
>  	return uclamp_util_with(rq, util, NULL);
>  }
>  #else /* CONFIG_UCLAMP_TASK */
> +void uclamp_rt_sync_default_util_min(struct task_struct *p) {}
> +
>  static inline unsigned int uclamp_util_with(struct rq *rq, unsigned int util,
>  					    struct task_struct *p)
>  {
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 70665934d53e..06183762daac 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -465,6 +465,13 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= sysctl_sched_uclamp_handler,
>  	},
> +	{
> +		.procname	= "sched_rt_util_clamp_min",
> +		.data		= &sysctl_sched_rt_uclamp_util_min,
> +		.maxlen		= sizeof(unsigned int),
> +		.mode		= 0644,
> +		.proc_handler	= sysctl_sched_uclamp_handler,
> +	},
>  #endif
>  #ifdef CONFIG_SCHED_AUTOGROUP
>  	{
> -- 
> 2.17.1
> 

Apart from the small things above, this seems like a sensible idea and
would indeed be useful, so thanks for the patch!

Quentin


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-07 13:42 ` Quentin Perret
@ 2020-01-07 19:30   ` Dietmar Eggemann
  2020-01-08  9:51     ` Quentin Perret
  2020-01-09 11:16     ` Qais Yousef
  2020-01-09 11:12   ` Qais Yousef
  1 sibling, 2 replies; 25+ messages in thread
From: Dietmar Eggemann @ 2020-01-07 19:30 UTC (permalink / raw)
  To: Quentin Perret, Qais Yousef
  Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, valentin.schneider, Patrick Bellasi, linux-kernel,
	linux-fsdevel, kernel-team

On 07/01/2020 14:42, Quentin Perret wrote:
> Hi Qais,
> 
> On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:

[...]

>> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>> index e591d40fd645..19572dfc175b 100644
>> --- a/kernel/sched/rt.c
>> +++ b/kernel/sched/rt.c
>> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
>>   */
>>  static void task_woken_rt(struct rq *rq, struct task_struct *p)
>>  {
>> +	/*
>> +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
>> +	 * we apply any new value on the next wakeup, which is here.
>> +	 */
>> +	uclamp_rt_sync_default_util_min(p);
> 
> The task has already been enqueued and sugov has been called by then I
> think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?

That's probably better.
Just to be sure... we want this feature (an existing rt task gets its
UCLAMP_MIN value set when the sysctl changes) because there could be rt
tasks running before the sysctl is set?

>> +
>>  	if (!task_running(rq, p) &&
>>  	    !test_tsk_need_resched(rq->curr) &&
>>  	    p->nr_cpus_allowed > 1 &&
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index 280a3c735935..337bf17b1a9d 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -2300,6 +2300,8 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
>>  #endif /* CONFIG_CPU_FREQ */
>>  
>>  #ifdef CONFIG_UCLAMP_TASK
>> +void uclamp_rt_sync_default_util_min(struct task_struct *p);
>> +
>>  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
>>  
>>  static __always_inline
>> @@ -2330,6 +2332,8 @@ static inline unsigned int uclamp_util(struct rq *rq, unsigned int util)
>>  	return uclamp_util_with(rq, util, NULL);
>>  }
>>  #else /* CONFIG_UCLAMP_TASK */
>> +void uclamp_rt_sync_default_util_min(struct task_struct *p) {}

-void uclamp_rt_sync_default_util_min(struct task_struct *p) {}
+static inline void uclamp_rt_sync_default_util_min(struct task_struct *p) {}

[...]


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-07 19:30   ` Dietmar Eggemann
@ 2020-01-08  9:51     ` Quentin Perret
  2020-01-08 19:16       ` Patrick Bellasi
  2020-01-09 11:36       ` Qais Yousef
  2020-01-09 11:16     ` Qais Yousef
  1 sibling, 2 replies; 25+ messages in thread
From: Quentin Perret @ 2020-01-08  9:51 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Qais Yousef, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, Juri Lelli,
	Vincent Guittot, Ben Segall, Mel Gorman, valentin.schneider,
	Patrick Bellasi, linux-kernel, linux-fsdevel, kernel-team

On Tuesday 07 Jan 2020 at 20:30:36 (+0100), Dietmar Eggemann wrote:
> On 07/01/2020 14:42, Quentin Perret wrote:
> > Hi Qais,
> > 
> > On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:
> 
> [...]
> 
> >> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> >> index e591d40fd645..19572dfc175b 100644
> >> --- a/kernel/sched/rt.c
> >> +++ b/kernel/sched/rt.c
> >> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
> >>   */
> >>  static void task_woken_rt(struct rq *rq, struct task_struct *p)
> >>  {
> >> +	/*
> >> +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> >> +	 * we apply any new value on the next wakeup, which is here.
> >> +	 */
> >> +	uclamp_rt_sync_default_util_min(p);
> > 
> > The task has already been enqueued and sugov has been called by then I
> > think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?
> 
> That's probably better.
> Just to be sure ...we want this feature (an existing rt task gets its
> UCLAMP_MIN value set when the sysctl changes) because there could be rt
> tasks running before the sysctl is set?

Yeah, I was wondering the same thing, but I'd expect sysadmins to want
this. We could change the min clamp of existing RT tasks from userspace
instead, but given how simple Qais' lazy update code is, the in-kernel
approach looks reasonable to me. No strong opinion, though.

Thanks,
Quentin


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2019-12-20 16:48 [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min Qais Yousef
  2020-01-07 13:42 ` Quentin Perret
@ 2020-01-08 13:44 ` Peter Zijlstra
  2020-01-08 19:08   ` Patrick Bellasi
  2020-01-09 13:00   ` Qais Yousef
  2020-01-08 18:56 ` Patrick Bellasi
  2 siblings, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2020-01-08 13:44 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Dietmar Eggemann, Steven Rostedt, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, valentin.schneider, qperret, Patrick Bellasi,
	linux-kernel, linux-fsdevel

On Fri, Dec 20, 2019 at 04:48:38PM +0000, Qais Yousef wrote:
> RT tasks by default try to run at the highest capacity/performance
> level. When uclamp is selected this default behavior is retained by
> enforcing the uclamp_util_min of the RT tasks to be
> uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> value.
> 
> See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> 
> On battery powered devices, this default behavior could consume more
> power, and it is desirable to be able to tune it down. While uclamp
> allows tuning this by changing the uclamp_util_min of individual tasks,
> this is cumbersome and error prone.
> 
> To control the default behavior globally by system admins and device
> integrators, introduce the new sysctl_sched_rt_uclamp_util_min to
> change the default uclamp_util_min value of the RT tasks.
> 
> Whenever the new default changes, it'd be applied on the next wakeup of
> the RT task, assuming that it still uses the system default value and
> not a user applied one.

This is because these RT tasks are not in a cgroup or not affected by
cgroup settings? I feel the justification is a little thin here.

> If the uclamp_util_min of an RT task is 0, then the RT utilization of
> the rq is used to drive the frequency selection in schedutil for RT
> tasks.

Did cpu_uclamp_write() forget to check for input<0 ?


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2019-12-20 16:48 [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min Qais Yousef
  2020-01-07 13:42 ` Quentin Perret
  2020-01-08 13:44 ` Peter Zijlstra
@ 2020-01-08 18:56 ` Patrick Bellasi
  2020-01-09  1:35   ` Valentin Schneider
  2 siblings, 1 reply; 25+ messages in thread
From: Patrick Bellasi @ 2020-01-08 18:56 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Peter Zijlstra, Dietmar Eggemann, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, Juri Lelli,
	Vincent Guittot, Ben Segall, Mel Gorman, valentin.schneider,
	qperret, linux-kernel, linux-fsdevel

Hi Qais,

On 20-Dec 16:48, Qais Yousef wrote:
> RT tasks by default try to run at the highest capacity/performance
> level. When uclamp is selected this default behavior is retained by
> enforcing the uclamp_util_min of the RT tasks to be
> uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> value.
> 
> See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> 
> On battery powered devices, this default behavior could consume more
> power, and it is desirable to be able to tune it down. While uclamp
> allows tuning this by changing the uclamp_util_min of individual tasks,
> this is cumbersome and error prone.
> 
> To control the default behavior globally by system admins and device
> integrators, introduce the new sysctl_sched_rt_uclamp_util_min to
> change the default uclamp_util_min value of the RT tasks.
> 
> Whenever the new default changes, it'd be applied on the next wakeup of
> the RT task, assuming that it still uses the system default value and
> not a user applied one.
> 
> If the uclamp_util_min of an RT task is 0, then the RT utilization of
> the rq is used to drive the frequency selection in schedutil for RT
> tasks.
> 
> Tested on Juno-r2 in combination of the RT capacity awareness patches.
> By default an RT task will go to the highest capacity CPU and run at the
> maximum frequency. With this patch the RT task can run anywhere and
> doesn't cause the frequency to be maximum all the time.
> 
> Signed-off-by: Qais Yousef <qais.yousef@arm.com>
> ---
>  include/linux/sched/sysctl.h |  1 +
>  kernel/sched/core.c          | 54 ++++++++++++++++++++++++++++++++----
>  kernel/sched/rt.c            |  6 ++++
>  kernel/sched/sched.h         |  4 +++
>  kernel/sysctl.c              |  7 +++++
>  5 files changed, 67 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
> index d4f6215ee03f..ec73d8db2092 100644
> --- a/include/linux/sched/sysctl.h
> +++ b/include/linux/sched/sysctl.h
> @@ -59,6 +59,7 @@ extern int sysctl_sched_rt_runtime;
>  #ifdef CONFIG_UCLAMP_TASK
>  extern unsigned int sysctl_sched_uclamp_util_min;
>  extern unsigned int sysctl_sched_uclamp_util_max;
> +extern unsigned int sysctl_sched_rt_uclamp_util_min;
>  #endif
>  
>  #ifdef CONFIG_CFS_BANDWIDTH
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 90e4b00ace89..a8ab0bb7a967 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -792,6 +792,23 @@ unsigned int sysctl_sched_uclamp_util_min = SCHED_CAPACITY_SCALE;
>  /* Max allowed maximum utilization */
>  unsigned int sysctl_sched_uclamp_util_max = SCHED_CAPACITY_SCALE;
>  
> +/*
> + * By default RT tasks run at the maximum performance point/capacity of the
> + * system. Uclamp enforces this by always setting UCLAMP_MIN of RT tasks to
> + * SCHED_CAPACITY_SCALE.
> + *
> + * This knob allows admins to change the default behavior when uclamp is being
> + * used. On battery powered devices in particular, running at the maximum
> + * capacity will increase energy consumption and shorten the battery life.
> + *
> + * This knob only affects the default value RT uses when a new RT task is
> + * forked or has just changed policy to RT and no uclamp user settings were
> + * applied (i.e. the task didn't modify the default value).
> + *
> + * This knob will not override the system default values defined above.
> + */
> +unsigned int sysctl_sched_rt_uclamp_util_min = SCHED_CAPACITY_SCALE;
> +
>  /* All clamps are required to be less or equal than these values */
>  static struct uclamp_se uclamp_default[UCLAMP_CNT];
>  
> @@ -919,6 +936,14 @@ uclamp_eff_get(struct task_struct *p, enum uclamp_id clamp_id)
>  	return uc_req;
>  }
>  
> +void uclamp_rt_sync_default_util_min(struct task_struct *p)
> +{
> +	struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN];
> +
> +	if (!uc_se->user_defined)
> +		uclamp_se_set(uc_se, sysctl_sched_rt_uclamp_util_min, false);
> +}
> +

Here you are force setting the task-specific _requests_ to match the
system-wide _constraints_. This is not required and it's also
conceptually wrong, since you mix two concepts: requests and
constraints.

System-default values must never be synchronized with task-specific
values. This allows to always satisfy task _requests_ when not
conflicting with system-wide (or task-group) _constraints_.

For example, assuming we have a task with util_min=500 and we keep
changing the system-wide constraints, we would like the following
effective clamps to be enforced:

   time | system-wide | task-specific | effective clamp
   -----+-------------+---------------+-----------------
     t0 |        1024 |           500 |             500
     t1 |           0 |           500 |               0
     t2 |         200 |           500 |             200
     t3 |         600 |           500 |             500

If the task should then change its requested util_min:

   time | system-wide | task-specific | effective clamp
   -----+-------------+---------------+----------------
     t4 |         600 |          800  |             600
     t6 |        1024 |          800  |             800

If you force set the task-specific requests to match the system-wide
constraints, you cannot get the above described behaviors since you
keep overwriting the task _requests_ with system-wide _constraints_.

Thus, requests and constraints must always float independently and be
used to compute the effective clamp at task wakeup time via:

   enqueue_task(rq, p, flags)
     uclamp_rq_inc(rq, p)
       uclamp_rq_inc_id(rq, p, clamp_id)
         uclamp_eff_get(p, clamp_id)
           uclamp_tg_restrict(p, clamp_id)
     p->sched_class->enqueue_task(rq, p, flags)

where the task-specific request is restricted considering its task group
effective value (the constraint).

Do note that the root task group effective value (for cfs tasks) is kept
in sync with the system default value and propagated down to the
effective value of all subgroups.

Do note also that the effective value is computed before calling into
the scheduling class's enqueue_task(). Which means that we have the
right value in place before we poke sugov.

Thus, a proper implementation of what you need should just
replicate/generalize what we already do for cfs tasks.

>  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id)
>  {
>  	struct uclamp_se uc_eff;
> @@ -1116,12 +1141,13 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  				loff_t *ppos)
>  {
>  	bool update_root_tg = false;
> -	int old_min, old_max;
> +	int old_min, old_max, old_rt_min;
>  	int result;
>  
>  	mutex_lock(&uclamp_mutex);
>  	old_min = sysctl_sched_uclamp_util_min;
>  	old_max = sysctl_sched_uclamp_util_max;
> +	old_rt_min = sysctl_sched_rt_uclamp_util_min;
>  
>  	result = proc_dointvec(table, write, buffer, lenp, ppos);
>  	if (result)
> @@ -1129,12 +1155,23 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  	if (!write)
>  		goto done;
>  
> +	/*
> +	 * The new value will be applied to all RT tasks the next time they
> +	 * wakeup, assuming the task is using the system default and not a user
> +	 * specified value. In the latter we shall leave the value as the user
> +	 * requested.
> +	 */
>  	if (sysctl_sched_uclamp_util_min > sysctl_sched_uclamp_util_max ||
>  	    sysctl_sched_uclamp_util_max > SCHED_CAPACITY_SCALE) {
>  		result = -EINVAL;
>  		goto undo;
>  	}
>  
> +	if (sysctl_sched_rt_uclamp_util_min > SCHED_CAPACITY_SCALE) {
> +		result = -EINVAL;
> +		goto undo;
> +	}
> +
>  	if (old_min != sysctl_sched_uclamp_util_min) {
>  		uclamp_se_set(&uclamp_default[UCLAMP_MIN],
>  			      sysctl_sched_uclamp_util_min, false);
> @@ -1160,6 +1197,7 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  undo:
>  	sysctl_sched_uclamp_util_min = old_min;
>  	sysctl_sched_uclamp_util_max = old_max;
> +	sysctl_sched_rt_uclamp_util_min = old_rt_min;
>  done:
>  	mutex_unlock(&uclamp_mutex);
>  
> @@ -1202,9 +1240,12 @@ static void __setscheduler_uclamp(struct task_struct *p,
>  		if (uc_se->user_defined)
>  			continue;
>  
> -		/* By default, RT tasks always get 100% boost */
> +		/*
> +		 * By default, RT tasks always get 100% boost, which the admins
> +		 * are allowed change via sysctl_sched_rt_uclamp_util_min knob.
> +		 */
>  		if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> -			clamp_value = uclamp_none(UCLAMP_MAX);
> +			clamp_value = sysctl_sched_rt_uclamp_util_min;
>  
>  		uclamp_se_set(uc_se, clamp_value, false);
>  	}
> @@ -1236,9 +1277,12 @@ static void uclamp_fork(struct task_struct *p)
>  	for_each_clamp_id(clamp_id) {
>  		unsigned int clamp_value = uclamp_none(clamp_id);
>  
> -		/* By default, RT tasks always get 100% boost */
> +		/*
> +		 * By default, RT tasks always get 100% boost, which the admins
> +		 * are allowed change via sysctl_sched_rt_uclamp_util_min knob.
> +		 */
>  		if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> -			clamp_value = uclamp_none(UCLAMP_MAX);
> +			clamp_value = sysctl_sched_rt_uclamp_util_min;
>  
>  		uclamp_se_set(&p->uclamp_req[clamp_id], clamp_value, false);
>  	}
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index e591d40fd645..19572dfc175b 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
>   */
>  static void task_woken_rt(struct rq *rq, struct task_struct *p)
>  {
> +	/*
> +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> +	 * we apply any new value on the next wakeup, which is here.
> +	 */
> +	uclamp_rt_sync_default_util_min(p);
> +
>  	if (!task_running(rq, p) &&
>  	    !test_tsk_need_resched(rq->curr) &&
>  	    p->nr_cpus_allowed > 1 &&
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 280a3c735935..337bf17b1a9d 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2300,6 +2300,8 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
>  #endif /* CONFIG_CPU_FREQ */
>  
>  #ifdef CONFIG_UCLAMP_TASK
> +void uclamp_rt_sync_default_util_min(struct task_struct *p);
> +
>  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
>  
>  static __always_inline
> @@ -2330,6 +2332,8 @@ static inline unsigned int uclamp_util(struct rq *rq, unsigned int util)
>  	return uclamp_util_with(rq, util, NULL);
>  }
>  #else /* CONFIG_UCLAMP_TASK */
> +void uclamp_rt_sync_default_util_min(struct task_struct *p) {}
> +
>  static inline unsigned int uclamp_util_with(struct rq *rq, unsigned int util,
>  					    struct task_struct *p)
>  {
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 70665934d53e..06183762daac 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -465,6 +465,13 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= sysctl_sched_uclamp_handler,
>  	},
> +	{
> +		.procname	= "sched_rt_util_clamp_min",
> +		.data		= &sysctl_sched_rt_uclamp_util_min,
> +		.maxlen		= sizeof(unsigned int),
> +		.mode		= 0644,
> +		.proc_handler	= sysctl_sched_uclamp_handler,
> +	},
>  #endif
>  #ifdef CONFIG_SCHED_AUTOGROUP
>  	{
> -- 
> 2.17.1
> 

-- 
#include <best/regards.h>

Patrick Bellasi

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-08 13:44 ` Peter Zijlstra
@ 2020-01-08 19:08   ` Patrick Bellasi
  2020-01-09 13:00   ` Qais Yousef
  1 sibling, 0 replies; 25+ messages in thread
From: Patrick Bellasi @ 2020-01-08 19:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Qais Yousef, Ingo Molnar, Dietmar Eggemann, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, Juri Lelli,
	Vincent Guittot, Ben Segall, Mel Gorman, valentin.schneider,
	qperret, linux-kernel, linux-fsdevel

On 08-Jan 14:44, Peter Zijlstra wrote:
> On Fri, Dec 20, 2019 at 04:48:38PM +0000, Qais Yousef wrote:
> > RT tasks by default try to run at the highest capacity/performance
> > level. When uclamp is selected this default behavior is retained by
> > enforcing the uclamp_util_min of the RT tasks to be
> > uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> > value.
> > 
> > See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> > 
> > On battery powered devices, this default behavior could consume more
> > power, and it is desired to be able to tune it down. While uclamp allows
> > tuning this by changing the uclamp_util_min of the individual tasks, but
> > this is cumbersome and error prone.
> > 
> > To control the default behavior globally by system admins and device
> > integrators, introduce the new sysctl_sched_rt_uclamp_util_min to
> > change the default uclamp_util_min value of the RT tasks.
> > 
> > Whenever the new default changes, it'd be applied on the next wakeup of
> > the RT task, assuming that it still uses the system default value and
> > not a user applied one.
> 
> This is because these RT tasks are not in a cgroup or not affected by
> cgroup settings? I feel the justification is a little thin here.

RT tasks are kind of special right now. To keep the initial
implementation simple we hardcoded the behavior: always run at max OPP
unless explicitly overridden by a task-specific value.

To add a system wide setting specifically for RT tasks, we need to
generalize what we already do for CFS tasks and keep the behavior of
the two classes aligned (apart for the default value).
IOW, no rt.c specific code should be required.

> > If the uclamp_util_min of an RT task is 0, then the RT utilization of
> > the rq is used to drive the frequency selection in schedutil for RT
> > tasks.
> 
> Did cpu_uclamp_write() forget to check for input<0 ?

The cgroup API uses percentages, and only sanitized values in the
[0..100].00 range are accepted.

Moreover, capacity_from_percent() returns a uclamp_request.util which
is a u64. Thus, there should not be issues related to negative values.
Writing such a value should just fail the write syscall.


-- 
#include <best/regards.h>

Patrick Bellasi

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-08  9:51     ` Quentin Perret
@ 2020-01-08 19:16       ` Patrick Bellasi
  2020-01-09 11:36       ` Qais Yousef
  1 sibling, 0 replies; 25+ messages in thread
From: Patrick Bellasi @ 2020-01-08 19:16 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Dietmar Eggemann, Qais Yousef, Ingo Molnar, Peter Zijlstra,
	Steven Rostedt, Luis Chamberlain, Kees Cook, Iurii Zaikin,
	Juri Lelli, Vincent Guittot, Ben Segall, Mel Gorman,
	valentin.schneider, linux-kernel, linux-fsdevel, kernel-team

On 08-Jan 09:51, Quentin Perret wrote:
> On Tuesday 07 Jan 2020 at 20:30:36 (+0100), Dietmar Eggemann wrote:
> > On 07/01/2020 14:42, Quentin Perret wrote:
> > > Hi Qais,
> > > 
> > > On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:
> > 
> > [...]
> > 
> > >> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > >> index e591d40fd645..19572dfc175b 100644
> > >> --- a/kernel/sched/rt.c
> > >> +++ b/kernel/sched/rt.c
> > >> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
> > >>   */
> > >>  static void task_woken_rt(struct rq *rq, struct task_struct *p)
> > >>  {
> > >> +	/*
> > >> +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> > >> +	 * we apply any new value on the next wakeup, which is here.
> > >> +	 */
> > >> +	uclamp_rt_sync_default_util_min(p);
> > > 
> > > The task has already been enqueued and sugov has been called by then I
> > > think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?

No, the above sync should never be done.

For CFS tasks this currently works by computing the effective clamp
from core.c::enqueue_task() before calling into the scheduling class
callback.

We should use a similar approach for RT tasks thus avoiding rt.c
specific code and ensuring correct effective values are _aggregated_
out of task's _requests_ and system-wide's _constraints_ before
calling sugov.

> > That's probably better.
> > Just to be sure ...we want this feature (an existing rt task gets its
> > UCLAMP_MIN value set when the sysctl changes)

No, that's not a feature. That's a hack I would avoid.

> > because there could be rt tasks running before the sysctl is set?
> 
> Yeah, I was wondering the same thing, but I'd expect sysadmin to want
> this. We could change the min clamp of existing RT tasks in userspace
> instead, but given how simple Qais' lazy update code is, the in-kernel
> looks reasonable to me. No strong opinion, though.
> 
> Thanks,
> Quentin

-- 
#include <best/regards.h>

Patrick Bellasi

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-08 18:56 ` Patrick Bellasi
@ 2020-01-09  1:35   ` Valentin Schneider
  2020-01-09  9:21     ` Patrick Bellasi
  2020-01-09 13:15     ` Qais Yousef
  0 siblings, 2 replies; 25+ messages in thread
From: Valentin Schneider @ 2020-01-09  1:35 UTC (permalink / raw)
  To: Patrick Bellasi, Qais Yousef
  Cc: Ingo Molnar, Peter Zijlstra, Dietmar Eggemann, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, Juri Lelli,
	Vincent Guittot, Ben Segall, Mel Gorman, qperret, linux-kernel,
	linux-fsdevel

On 08/01/2020 18:56, Patrick Bellasi wrote:
> Here you are force setting the task-specific _requests_ to match the
> system-wide _constraints_. This is not required and it's also
> conceptually wrong, since you mix two concepts: requests and
> constraints.
> 
> System-default values must never be synchronized with task-specific
> values. This allows to always satisfy task _requests_ when not
> conflicting with system-wide (or task-group) _constraints_.
> 
> For example, assuming we have a task with util_min=500 and we keep
> changing the system-wide constraints, we would like the following
> effective clamps to be enforced:
> 
>    time | system-wide | task-specific | effective clamp
>    -----+-------------+---------------+-----------------
>      t0 |        1024 |           500 |             500
>      t1 |           0 |           500 |               0
>      t2 |         200 |           500 |             200
>      t3 |         600 |           500 |             500
> 
> If the taks should then change it's requested util_min:
> 
>    time | system-wide | task-specific | effective clamp
>    -----+-------------+---------------+----------------
>      t4 |         600 |          800  |             600
>      t6 |        1024 |          800  |             800
> 
> If you force set the task-specific requests to match the system-wide
> constraints, you cannot get the above described behaviors since you
> keep overwriting the task _requests_ with system-wide _constraints_.
> 

But is what Qais' proposing really a system-wide *constraint*? What we want
to do here is have a knob for the RT uclamp.min values, because gotomax isn't
viable (for mobile, you know the story!). This leaves user_defined values
alone, so you should be able to reproduce exactly what you described above.
If I take your t3 and t4 examples:

| time | system-wide | rt default | task-specific | user_defined | effective |                       
|------+-------------+------------+---------------+--------------+-----------|                       
| t3   |         600 |       1024 |           500 | Y            |       500 |                       
| t4   |         600 |       1024 |           800 | Y            |       600 |

If the values were *not* user-defined, then it would depend on the default
knob Qais is introducing:

| time | system-wide | rt default | task-specific | user_defined | effective |                       
|------+-------------+------------+---------------+--------------+-----------|                       
| t3   |         600 |       1024 |          1024 | N            |       600 |                       
| t4   |         600 |          0 |             0 | N            |         0 | 

It's not forcing the task-specific value to the system-wide RT value, it's
just using it as tweakable default. At least that's how I understand it,
did I miss something?

> Thus, requests and contraints must always float independently and
> used to compute the effective clamp at task wakeup time via:
> 
>    enqueue_task(rq, p, flags)
>      uclamp_rq_inc(rq, p)
>        uclamp_rq_inc_id(rq, p, clamp_id)
>          uclamp_eff_get(p, clamp_id)
>            uclamp_tg_restrict(p, clamp_id)
>      p->sched_class->enqueue_task(rq, p, flags)
> 
> where the task-specific request is restricted considering its task group
> effective value (the constraint).
> 
> Do note that the root task group effective value (for cfs) tasks is kept
> in sync with the system default value and propagated down to the
> effective value of all subgroups.
> 
> Do note also that the effective value is computed before calling into
> the scheduling class's enqueue_task(). Which means that we have the
> right value in place before we poke sugov.
> 
> Thus, a proper implementation of what you need should just
> replicate/generalize what we already do for cfs tasks.
> 

Reading

  7274a5c1bbec ("sched/uclamp: Propagate system defaults to the root group")

I see "The clamp values are not tunable at the level of the root task group".
This means that, for mobile systems where we want a default uclamp.min of 0
for RT tasks, we would need to create a cgroup for all RT tasks (and tweak
its uclamp.min, but from playing around a bit I see that defaults to 0).

(Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
when turning it on, but I think we don't have to if we just want things like
uclamp value propagation?)

It's quite more work than the simple thing Qais is introducing (and on both
user and kernel side).

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-09  1:35   ` Valentin Schneider
@ 2020-01-09  9:21     ` Patrick Bellasi
  2020-01-09 13:38       ` Qais Yousef
  2020-01-14 21:34       ` Qais Yousef
  2020-01-09 13:15     ` Qais Yousef
  1 sibling, 2 replies; 25+ messages in thread
From: Patrick Bellasi @ 2020-01-09  9:21 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Qais Yousef, Ingo Molnar, Peter Zijlstra, Dietmar Eggemann,
	Steven Rostedt, Luis Chamberlain, Kees Cook, Iurii Zaikin,
	Juri Lelli, Vincent Guittot, Ben Segall, Mel Gorman, qperret,
	linux-kernel, linux-fsdevel

On 09-Jan 01:35, Valentin Schneider wrote:
> On 08/01/2020 18:56, Patrick Bellasi wrote:
> > Here you are force setting the task-specific _requests_ to match the
> > system-wide _constraints_. This is not required and it's also
> > conceptually wrong, since you mix two concepts: requests and
> > constraints.
> > 
> > System-default values must never be synchronized with task-specific
> > values. This allows to always satisfy task _requests_ when not
> > conflicting with system-wide (or task-group) _constraints_.
> > 
> > For example, assuming we have a task with util_min=500 and we keep
> > changing the system-wide constraints, we would like the following
> > effective clamps to be enforced:
> > 
> >    time | system-wide | task-specific | effective clamp
> >    -----+-------------+---------------+-----------------
> >      t0 |        1024 |           500 |             500
> >      t1 |           0 |           500 |               0
> >      t2 |         200 |           500 |             200
> >      t3 |         600 |           500 |             500
> > 
> > If the taks should then change it's requested util_min:
> > 
> >    time | system-wide | task-specific | effective clamp
> >    -----+-------------+---------------+----------------
> >      t4 |         600 |          800  |             600
> >      t6 |        1024 |          800  |             800
> > 
> > If you force set the task-specific requests to match the system-wide
> > constraints, you cannot get the above described behaviors since you
> > keep overwriting the task _requests_ with system-wide _constraints_.
> > 
> 
> But is what Qais' proposing really a system-wide *constraint*? What we want
> to do here is have a knob for the RT uclamp.min values, because gotomax isn't
> viable (for mobile, you know the story!).

Yes I know, and I also see that what Qais is proposing can work.
I'm just saying that IMO it's not the best way to add that feature.

> This leaves user_defined values alone, so you should be able to
> reproduce exactly what you described above.

Yes, Qais is actually affecting the effective value, but he does it
from the RT sched class, which is not effective for sugov, and most
importantly with a solution which is not the same as the one we
already use for CFS tasks.

> If I take your t3 and t4 examples:
> 
> | time | system-wide | rt default | task-specific | user_defined | effective |                       
> |------+-------------+------------+---------------+--------------+-----------|                       
> | t3   |         600 |       1024 |           500 | Y            |       500 |                       
> | t4   |         600 |       1024 |           800 | Y            |       600 |
> 
> If the values were *not* user-defined, then it would depend on the default
> knob Qais is introducing:
> 
> | time | system-wide | rt default | task-specific | user_defined | effective |                       
> |------+-------------+------------+---------------+--------------+-----------|                       
> | t3   |         600 |       1024 |          1024 | N            |       600 |                       
> | t4   |         600 |          0 |             0 | N            |         0 | 
> 
> It's not forcing the task-specific value to the system-wide RT value, it's
> just using it as tweakable default. At least that's how I understand it,
> did I miss something?

You're right, but the "tweakable" default should be implemented the same
way we do for CFS.

> > Thus, requests and contraints must always float independently and
> > used to compute the effective clamp at task wakeup time via:
> > 
> >    enqueue_task(rq, p, flags)
> >      uclamp_rq_inc(rq, p)
> >        uclamp_rq_inc_id(rq, p, clamp_id)
> >          uclamp_eff_get(p, clamp_id)
> >            uclamp_tg_restrict(p, clamp_id)
> >      p->sched_class->enqueue_task(rq, p, flags)
> > 
> > where the task-specific request is restricted considering its task group
> > effective value (the constraint).
> > 
> > Do note that the root task group effective value (for cfs) tasks is kept
> > in sync with the system default value and propagated down to the
> > effective value of all subgroups.
> > 
> > Do note also that the effective value is computed before calling into
> > the scheduling class's enqueue_task(). Which means that we have the
> > right value in place before we poke sugov.
> > 
> > Thus, a proper implementation of what you need should just
> > replicate/generalize what we already do for cfs tasks.
> > 
> 
> Reading
> 
>   7274a5c1bbec ("sched/uclamp: Propagate system defaults to the root group")
> 
> I see "The clamp values are not tunable at the level of the root task group".
> This means that, for mobile systems where we want a default uclamp.min of 0
> for RT tasks, we would need to create a cgroup for all RT tasks (and tweak
> its uclamp.min, but from playing around a bit I see that defaults to 0).

That's not entirely true. In that patch we introduce cgroup support
but, if you look at the code before that patch, for CFS tasks there is
only:
 - CFS task-specific values (min,max)=(0,1024) by default
 - CFS system-wide tunables (min,max)=(1024,1024) by default
and a change on the system-wide tunable allows for example to enforce
a uclamp_max=200 on all tasks.

A similar solution can be implemented for RT tasks, where we have:
 - RT task-specific values (min,max)=(1024,1024) by default
 - RT system-wide tunables (min,max)=(1024,1024) by default
 and a change on the system-wide tunable allows for example to enforce
 a uclamp_min=200 on all tasks.
 
> (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> when turning it on, but I think we don't have to if we just want things like
> uclamp value propagation?)

No, the current design for CFS tasks works also on !CONFIG_UCLAMP_TASK_GROUP.
That's because in this case:
  - uclamp_tg_restrict() returns just the task requested value
  - uclamp_eff_get() _always_ restricts the requested value considering
    the system defaults
 
> It's quite more work than the simple thing Qais is introducing (and on both
> user and kernel side).

But if in the future we want to extend CGroup support to RT, then we
will feel the pain because we do the effective computation in two
different places.

Do note that a proper CGroup support requires that the system default
values define the values for the root group and are consistently
propagated down the hierarchy. Thus we need to add a dedicated pair of
cgroup attributes, e.g. cpu.util.rt.{min,max}.

To recap, we don't need CGROUP support right now but just to add a new
default tracking similar to what we do for CFS.

We already proposed such support in one of the initial versions of
the uclamp series:
   Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
   https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/
where a:
   static struct uclamp_se uclamp_default_perf[UCLAMP_CNT];
was added which was kind of similar/dual to what we do for CFS tasks
by using:
   static struct uclamp_se uclamp_default[UCLAMP_CNT];

That patch will not apply anymore of course, but the concepts still
hold:
  1. a new pair of sysctl_sched_uclamp_util_{min,max}_rt are available
     in userspace
  2. in sysctl_sched_uclamp_handler() we update uclamp_default_perf
     (maybe better calling it uclamp_default_rt?) according to the
     sysctl tunables
  3. the uclamp_default_rt values are used in uclamp_eff_get() to
     _always_ restrict the task specific value returned by
     uclamp_tg_restrict() (on !CONFIG_UCLAMP_TASK_GROUP builds)

We will have to add some RT specific logic in uclamp_eff_get(), but
likely something as simple as the following should be enough?

---8<---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 44123b4d14e8..f4a6b120da91 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -897,6 +897,15 @@ uclamp_tg_restrict(struct task_struct *p, enum uclamp_id clamp_id)
        return uc_req;
 }
 
+static inline struct uclamp_se
+uclamp_default_get(struct task_struct *p, enum uclamp_id clamp_id)
+{
+	if (rt_task(p))
+		return uclamp_default_rt[clamp_id];
+
+	return uclamp_default[clamp_id];
+}
+
 /*
  * The effective clamp bucket index of a task depends on, by increasing
  * priority:
@@ -909,7 +918,7 @@ static inline struct uclamp_se
 uclamp_eff_get(struct task_struct *p, enum uclamp_id clamp_id)
 {
        struct uclamp_se uc_req = uclamp_tg_restrict(p, clamp_id);
-       struct uclamp_se uc_max = uclamp_default[clamp_id];
+       struct uclamp_se uc_max = uclamp_default_get(p, clamp_id);
 
        /* System default restrictions always apply */
        if (unlikely(uc_req.value > uc_max.value))

---8<---

Adding a proper cgroup support would also be better. Otherwise, maybe
for the time being we can live with just system-defaults to constrain
the task-specific values with something like:

---8<---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 44123b4d14e8..73638c0cc65a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -877,6 +877,10 @@ static inline struct uclamp_se
 uclamp_tg_restrict(struct task_struct *p, enum uclamp_id clamp_id)
 {
        struct uclamp_se uc_req = p->uclamp_req[clamp_id];
+
+	if (rt_task(p))
+		return uc_req;
+
 #ifdef CONFIG_UCLAMP_TASK_GROUP
        struct uclamp_se uc_max;

---8<---

Or accept (without the above) that we can constrain RT tasks using
CFS clamps (i.e. cpu.util.{min,max}) when running in CGroups, which
is ugly but can work for many use-cases if runtimes like Android take
care to create cgroups specifically dedicated to RT tasks. :/

-- 
#include <best/regards.h>

Patrick Bellasi

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-07 13:42 ` Quentin Perret
  2020-01-07 19:30   ` Dietmar Eggemann
@ 2020-01-09 11:12   ` Qais Yousef
  1 sibling, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-09 11:12 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Ingo Molnar, Peter Zijlstra, Dietmar Eggemann, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, Juri Lelli,
	Vincent Guittot, Ben Segall, Mel Gorman, valentin.schneider,
	Patrick Bellasi, linux-kernel, linux-fsdevel, kernel-team

Hi Quentin

On 01/07/20 13:42, Quentin Perret wrote:
> Hi Qais,
> 
> On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:
> > +/*
> > + * By default RT tasks run at the maximum performance point/capacity of the
> > + * system. Uclamp enforces this by always setting UCLAMP_MIN of RT tasks to
> > + * SCHED_CAPACITY_SCALE.
> > + *
> > + * This knob allows admins to change the default behavior when uclamp is being
> > + * used. In battery powered devices particularly running at the maximum
> > + * capacity will increase energy consumption and shorten the battery life.
> > + *
> > + * This knob only affects the default value RT uses when a new RT task is
> > + * forked or has just changed policy to RT and no uclamp user settings were
> > + * applied (ie: the task didn't modify the default value to a new value.
> > + *
> > + * This knob will not override the system default values defined above.
> > + */
> 
> I suppose this comment could go in the sysctl doc file instead ?

Sure. Does it hurt to keep the comment here too though?

> 
> > +unsigned int sysctl_sched_rt_uclamp_util_min = SCHED_CAPACITY_SCALE;
> 
> I would suggest renaming the knob as 'sysctl_sched_rt_default_uclamp_min'
> or something along those lines to make it clear it's a default value.

+1

> 
> And for consistency with the existing code, perhaps set the default to
> uclamp_none(UCLAMP_MAX) instead of an explicit SCHED_CAPACITY_SCALE?

Can do. But this means you need to move the initialization to sched_init() to
do a function call. Using SCHED_CAPACITY_SCALE is consistent with the other
uclamp sysctl definitions - there's no big advantage to this code shuffling to
achieve the same thing?

[...]

> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index e591d40fd645..19572dfc175b 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
> >   */
> >  static void task_woken_rt(struct rq *rq, struct task_struct *p)
> >  {
> > +	/*
> > +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> > +	 * we apply any new value on the next wakeup, which is here.
> > +	 */
> > +	uclamp_rt_sync_default_util_min(p);
> 
> The task has already been enqueued and sugov has been called by then I
> think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?

Yeah that might be a better place - I'm not sure why I didn't consider it. I'll
try it out.

> 
> > +
> >  	if (!task_running(rq, p) &&
> >  	    !test_tsk_need_resched(rq->curr) &&
> >  	    p->nr_cpus_allowed > 1 &&
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index 280a3c735935..337bf17b1a9d 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -2300,6 +2300,8 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
> >  #endif /* CONFIG_CPU_FREQ */
> >  
> >  #ifdef CONFIG_UCLAMP_TASK
> > +void uclamp_rt_sync_default_util_min(struct task_struct *p);
> > +
> >  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
> >  
> >  static __always_inline
> > @@ -2330,6 +2332,8 @@ static inline unsigned int uclamp_util(struct rq *rq, unsigned int util)
> >  	return uclamp_util_with(rq, util, NULL);
> >  }
> >  #else /* CONFIG_UCLAMP_TASK */
> > +void uclamp_rt_sync_default_util_min(struct task_struct *p) {}
> > +
> >  static inline unsigned int uclamp_util_with(struct rq *rq, unsigned int util,
> >  					    struct task_struct *p)
> >  {
> > diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> > index 70665934d53e..06183762daac 100644
> > --- a/kernel/sysctl.c
> > +++ b/kernel/sysctl.c
> > @@ -465,6 +465,13 @@ static struct ctl_table kern_table[] = {
> >  		.mode		= 0644,
> >  		.proc_handler	= sysctl_sched_uclamp_handler,
> >  	},
> > +	{
> > +		.procname	= "sched_rt_util_clamp_min",
> > +		.data		= &sysctl_sched_rt_uclamp_util_min,
> > +		.maxlen		= sizeof(unsigned int),
> > +		.mode		= 0644,
> > +		.proc_handler	= sysctl_sched_uclamp_handler,
> > +	},
> >  #endif
> >  #ifdef CONFIG_SCHED_AUTOGROUP
> >  	{
> > -- 
> > 2.17.1
> > 
> 
> Apart from the small things above, this seems like a sensible idea and
> would indeed be useful, so thanks for the patch!

Thanks for having a look!

--
Qais Yousef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-07 19:30   ` Dietmar Eggemann
  2020-01-08  9:51     ` Quentin Perret
@ 2020-01-09 11:16     ` Qais Yousef
  1 sibling, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-09 11:16 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: Quentin Perret, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, Juri Lelli,
	Vincent Guittot, Ben Segall, Mel Gorman, valentin.schneider,
	Patrick Bellasi, linux-kernel, linux-fsdevel, kernel-team

On 01/07/20 20:30, Dietmar Eggemann wrote:
> On 07/01/2020 14:42, Quentin Perret wrote:
> > Hi Qais,
> > 
> > On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:
> 
> [...]
> 
> >> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> >> index e591d40fd645..19572dfc175b 100644
> >> --- a/kernel/sched/rt.c
> >> +++ b/kernel/sched/rt.c
> >> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
> >>   */
> >>  static void task_woken_rt(struct rq *rq, struct task_struct *p)
> >>  {
> >> +	/*
> >> +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> >> +	 * we apply any new value on the next wakeup, which is here.
> >> +	 */
> >> +	uclamp_rt_sync_default_util_min(p);
> > 
> > The task has already been enqueued and sugov has been called by then I
> > think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?
> 
> That's probably better.
> Just to be sure ...we want this feature (an existing rt task gets its
> UCLAMP_MIN value set when the sysctl changes) because there could be rt
> tasks running before the sysctl is set?

Yes. If the default value changes, this sync will propagate it to all current
RT tasks to reflect the new value. It is done in a lazy way though, so there
will be a window where the RT task runs with the old value.

> 
> >> +
> >>  	if (!task_running(rq, p) &&
> >>  	    !test_tsk_need_resched(rq->curr) &&
> >>  	    p->nr_cpus_allowed > 1 &&
> >> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> >> index 280a3c735935..337bf17b1a9d 100644
> >> --- a/kernel/sched/sched.h
> >> +++ b/kernel/sched/sched.h
> >> @@ -2300,6 +2300,8 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
> >>  #endif /* CONFIG_CPU_FREQ */
> >>  
> >>  #ifdef CONFIG_UCLAMP_TASK
> >> +void uclamp_rt_sync_default_util_min(struct task_struct *p);
> >> +
> >>  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
> >>  
> >>  static __always_inline
> >> @@ -2330,6 +2332,8 @@ static inline unsigned int uclamp_util(struct rq *rq, unsigned int util)
> >>  	return uclamp_util_with(rq, util, NULL);
> >>  }
> >>  #else /* CONFIG_UCLAMP_TASK */
> >> +void uclamp_rt_sync_default_util_min(struct task_struct *p) {}
> 
> -void uclamp_rt_sync_default_util_min(struct task_struct *p) {}
> +static inline void uclamp_rt_sync_default_util_min(struct task_struct
> *p) {}

+1

Thanks!

--
Qais Yousef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-08  9:51     ` Quentin Perret
  2020-01-08 19:16       ` Patrick Bellasi
@ 2020-01-09 11:36       ` Qais Yousef
  1 sibling, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-09 11:36 UTC (permalink / raw)
  To: Quentin Perret
  Cc: Dietmar Eggemann, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
	Luis Chamberlain, Kees Cook, Iurii Zaikin, Juri Lelli,
	Vincent Guittot, Ben Segall, Mel Gorman, valentin.schneider,
	Patrick Bellasi, linux-kernel, linux-fsdevel, kernel-team

On 01/08/20 09:51, Quentin Perret wrote:
> On Tuesday 07 Jan 2020 at 20:30:36 (+0100), Dietmar Eggemann wrote:
> > On 07/01/2020 14:42, Quentin Perret wrote:
> > > Hi Qais,
> > > 
> > > On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:
> > 
> > [...]
> > 
> > >> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > >> index e591d40fd645..19572dfc175b 100644
> > >> --- a/kernel/sched/rt.c
> > >> +++ b/kernel/sched/rt.c
> > >> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
> > >>   */
> > >>  static void task_woken_rt(struct rq *rq, struct task_struct *p)
> > >>  {
> > >> +	/*
> > >> +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> > >> +	 * we apply any new value on the next wakeup, which is here.
> > >> +	 */
> > >> +	uclamp_rt_sync_default_util_min(p);
> > > 
> > > The task has already been enqueued and sugov has been called by then I
> > > think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?
> > 
> > That's probably better.
> > Just to be sure ...we want this feature (an existing rt task gets its
> > UCLAMP_MIN value set when the sysctl changes) because there could be rt
> > tasks running before the sysctl is set?
> 
> Yeah, I was wondering the same thing, but I'd expect sysadmins to want
> this. We could change the min clamp of existing RT tasks in userspace
> instead, but given how simple Qais' lazy update code is, the in-kernel
> approach looks reasonable to me. No strong opinion, though.

The way I see this being used is set in init.rc. If any RT tasks were created
(most likely kthreads) before that, they'll just be updated on the next
wakeup.

Of course this approach allows the value to change at any point in time while
the system is running, without a reboot/recompile and without kicking a special
script/app to modify all existing RT tasks and continuously monitor new ones.

Another advantage is that apps that have special requirement (like professional
audio) can use the per-task uclamp API to bump their uclamp_min without
conflicting with the desired generic value for all other RT tasks.

IOW, we can easily control the baseline performance for RT tasks at run time
with a single knob, without interfering with RT tasks that opt in to modify
their own uclamp values.

--
Qais Yousef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-08 13:44 ` Peter Zijlstra
  2020-01-08 19:08   ` Patrick Bellasi
@ 2020-01-09 13:00   ` Qais Yousef
  2020-01-10 13:39     ` Peter Zijlstra
  2020-01-10 13:42     ` Peter Zijlstra
  1 sibling, 2 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-09 13:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Dietmar Eggemann, Steven Rostedt, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, valentin.schneider, qperret, Patrick Bellasi,
	linux-kernel, linux-fsdevel

On 01/08/20 14:44, Peter Zijlstra wrote:
> On Fri, Dec 20, 2019 at 04:48:38PM +0000, Qais Yousef wrote:
> > RT tasks by default try to run at the highest capacity/performance
> > level. When uclamp is selected this default behavior is retained by
> > enforcing the uclamp_util_min of the RT tasks to be
> > uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> > value.
> > 
> > See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> > 
> > On battery powered devices, this default behavior could consume more
> > power, and it is desired to be able to tune it down. While uclamp allows
> > tuning this by changing the uclamp_util_min of the individual tasks,
> > this is cumbersome and error prone.
> > 
> > To control the default behavior globally by system admins and device
> > integrators, introduce the new sysctl_sched_rt_uclamp_util_min to
> > change the default uclamp_util_min value of the RT tasks.
> > 
> > Whenever the new default changes, it'd be applied on the next wakeup of
> > the RT task, assuming that it still uses the system default value and
> > not a user applied one.
> 
> This is because these RT tasks are not in a cgroup or not affected by
> cgroup settings? I feel the justification is a little thin here.

The uclamp_min for RT tasks is always hardcoded to 1024 at the moment. So even
if they belong to a cgroup->uclamp_min = 0, they'll still run at max frequency,
no?

To control this behavior with cgroups one must have a daemon that:

	while true:
		for_each_rt_task:
			set task->uclamp_min = 0
			add to rt_cgroup
			rt_cgroup.util_min = $DESIRED_DEFAULT_RT_UCLAMP_MIN

		sleep $M_SECONDS

The above will overwrite the task util_min in case it was modified by the app
or sysadmin.


OR we can do

The counter intuitive usage of uclamp_max to throttle the default boost

	while true:
		for_each_rt_task():
			add to rt_cgroup
			rt_cgroup.util_max = $DESIRED_DEFAULT_RT_UCLAMP_MIN

		sleep $M_SECONDS

Or did I miss something?

Apologies if the justification was thin. The problem seemed obvious to me,
and maybe I missed that it might not be to others.

What I am trying to do is make this hardcoded value a configurable parameter
so it can be set to anything at runtime. This gives RT tasks the minimum boost
they need without pushing the system to the highest performance level, which
would consume a lot of energy.

I anticipate this to be set once in init scripts. But I can see how it could
be modified later; for instance, when the device is charging and power isn't
an issue, the maximum performance can be restored.

Also, different power-save modes can modify this value at runtime.

I think this would benefit mobile and laptop equally.

Keep in mind that kthreads and irq_threads are RT tasks too. So not all
RT tasks in the system are user triggered.

> 
> > If the uclamp_util_min of an RT task is 0, then the RT utilization of
> > the rq is used to drive the frequency selection in schedutil for RT
> > tasks.
> 
> Did cpu_uclamp_write() forget to check for input<0 ?

Hmm just tried that and it seems so

# echo -1 > cpu.uclamp.min
# cat cpu.uclamp.min
42949671.96

In capacity_from_percent() we check for:

                if (req.percent > UCLAMP_PERCENT_SCALE) {
                        req.ret = -ERANGE;
                        return req;
                }

But req.percent is s64, maybe it should be u64?

Thanks

--
Qais Yousef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-09  1:35   ` Valentin Schneider
  2020-01-09  9:21     ` Patrick Bellasi
@ 2020-01-09 13:15     ` Qais Yousef
  1 sibling, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-09 13:15 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Patrick Bellasi, Ingo Molnar, Peter Zijlstra, Dietmar Eggemann,
	Steven Rostedt, Luis Chamberlain, Kees Cook, Iurii Zaikin,
	Juri Lelli, Vincent Guittot, Ben Segall, Mel Gorman, qperret,
	linux-kernel, linux-fsdevel

On 01/09/20 01:35, Valentin Schneider wrote:
> On 08/01/2020 18:56, Patrick Bellasi wrote:
> > Here you are force setting the task-specific _requests_ to match the
> > system-wide _constraints_. This is not required and it's also
> > conceptually wrong, since you mix two concepts: requests and
> > constraints.
> > 
> > System-default values must never be synchronized with task-specific
> > values. This allows to always satisfy task _requests_ when not
> > conflicting with system-wide (or task-group) _constraints_.
> > 
> > For example, assuming we have a task with util_min=500 and we keep
> > changing the system-wide constraints, we would like the following
> > effective clamps to be enforced:
> > 
> >    time | system-wide | task-specific | effective clamp
> >    -----+-------------+---------------+-----------------
> >      t0 |        1024 |           500 |             500
> >      t1 |           0 |           500 |               0
> >      t2 |         200 |           500 |             200
> >      t3 |         600 |           500 |             500
> > 
> > If the task should then change its requested util_min:
> > 
> >    time | system-wide | task-specific | effective clamp
> >    -----+-------------+---------------+----------------
> >      t4 |         600 |          800  |             600
> >      t6 |        1024 |          800  |             800
> > 
> > If you force set the task-specific requests to match the system-wide
> > constraints, you cannot get the above described behaviors since you
> > keep overwriting the task _requests_ with system-wide _constraints_.
> > 
> 
> But is what Qais is proposing really a system-wide *constraint*? What we want
> to do here is have a knob for the RT uclamp.min values, because gotomax isn't
> viable (for mobile, you know the story!). This leaves user_defined values
> alone, so you should be able to reproduce exactly what you described above.
> If I take your t3 and t4 examples:
> 
> | time | system-wide | rt default | task-specific | user_defined | effective |                       
> |------+-------------+------------+---------------+--------------+-----------|                       
> | t3   |         600 |       1024 |           500 | Y            |       500 |                       
> | t4   |         600 |       1024 |           800 | Y            |       600 |
> 
> If the values were *not* user-defined, then it would depend on the default
> knob Qais is introducing:
> 
> | time | system-wide | rt default | task-specific | user_defined | effective |                       
> |------+-------------+------------+---------------+--------------+-----------|                       
> | t3   |         600 |       1024 |          1024 | N            |       600 |                       
> | t4   |         600 |          0 |             0 | N            |         0 | 
> 
> It's not forcing the task-specific value to the system-wide RT value, it's
> just using it as tweakable default. At least that's how I understand it,
> did I miss something?

Yes, that's exactly what it should be. I am making the existing hardcoded
value a configurable parameter, plus some logic to make sure the new value
propagates correctly when it changes, since the hardcoded value is set only
once, when a task is created.

> 
> > Thus, requests and constraints must always float independently and be
> > used to compute the effective clamp at task wakeup time via:
> > 
> >    enqueue_task(rq, p, flags)
> >      uclamp_rq_inc(rq, p)
> >        uclamp_rq_inc_id(rq, p, clamp_id)
> >          uclamp_eff_get(p, clamp_id)
> >            uclamp_tg_restrict(p, clamp_id)
> >      p->sched_class->enqueue_task(rq, p, flags)
> > 
> > where the task-specific request is restricted considering its task group
> > effective value (the constraint).
> > 
> > Do note that the root task group effective value (for cfs) tasks is kept
> > in sync with the system default value and propagated down to the
> > effective value of all subgroups.
> > 
> > Do note also that the effective value is computed before calling into
> > the scheduling class's enqueue_task(). Which means that we have the
> > right value in place before we poke sugov.
> > 
> > Thus, a proper implementation of what you need should just
> > replicate/generalize what we already do for cfs tasks.
> > 
> 
> Reading
> 
>   7274a5c1bbec ("sched/uclamp: Propagate system defaults to the root group")
> 
> I see "The clamp values are not tunable at the level of the root task group".
> This means that, for mobile systems where we want a default uclamp.min of 0
> for RT tasks, we would need to create a cgroup for all RT tasks (and tweak
> its uclamp.min, but from playing around a bit I see that defaults to 0).
> 
> (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> when turning it on, but I think we don't have to if we just want things like
> uclamp value propagation?)
> 
> It's quite more work than the simple thing Qais is introducing (and on both
> user and kernel side).

I don't think the daemon solution is particularly pretty or intuitive for
admins who want to control the default boost value of RT tasks.

Thanks

--
Qais Yousef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-09  9:21     ` Patrick Bellasi
@ 2020-01-09 13:38       ` Qais Yousef
  2020-01-14 21:34       ` Qais Yousef
  1 sibling, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-09 13:38 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: Valentin Schneider, Ingo Molnar, Peter Zijlstra,
	Dietmar Eggemann, Steven Rostedt, Luis Chamberlain, Kees Cook,
	Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, qperret, linux-kernel, linux-fsdevel

On 01/09/20 10:21, Patrick Bellasi wrote:
> On 09-Jan 01:35, Valentin Schneider wrote:
> > On 08/01/2020 18:56, Patrick Bellasi wrote:
> > > Here you are force setting the task-specific _requests_ to match the
> > > system-wide _constraints_. This is not required and it's also
> > > conceptually wrong, since you mix two concepts: requests and
> > > constraints.
> > > 
> > > System-default values must never be synchronized with task-specific
> > > values. This allows to always satisfy task _requests_ when not
> > > conflicting with system-wide (or task-group) _constraints_.
> > > 
> > > For example, assuming we have a task with util_min=500 and we keep
> > > changing the system-wide constraints, we would like the following
> > > effective clamps to be enforced:
> > > 
> > >    time | system-wide | task-specific | effective clamp
> > >    -----+-------------+---------------+-----------------
> > >      t0 |        1024 |           500 |             500
> > >      t1 |           0 |           500 |               0
> > >      t2 |         200 |           500 |             200
> > >      t3 |         600 |           500 |             500
> > > 
> > > If the task should then change its requested util_min:
> > > 
> > >    time | system-wide | task-specific | effective clamp
> > >    -----+-------------+---------------+----------------
> > >      t4 |         600 |          800  |             600
> > >      t6 |        1024 |          800  |             800
> > > 
> > > If you force set the task-specific requests to match the system-wide
> > > constraints, you cannot get the above described behaviors since you
> > > keep overwriting the task _requests_ with system-wide _constraints_.
> > > 
> > 
> > But is what Qais is proposing really a system-wide *constraint*? What we want
> > to do here is have a knob for the RT uclamp.min values, because gotomax isn't
> > viable (for mobile, you know the story!).
> 
> Yes I know, and I also see that what Qais is proposing can work.
> I'm just saying that IMO it's not the best way to add that feature.

What's the drawback? My proposed solution is a simple extension of what you
did to make sure RT tasks go to max OPP by default. So if this is wrong,
doesn't that naturally mean the original solution was wrong too?

> 
> > This leaves user_defined values alone, so you should be able to
> > reproduce exactly what you described above.
> 
> Yes, Qais is actually affecting the effective value, but he does it
> from the RT sched class, which is not effective for sugov, and most
> importantly with a solution which is not the same as the one we already
> use for CFS tasks.

Quentin suggested using uclamp_rq_inc() to update the default uclamp.min of the
RT tasks. This removes my call in task_woken_rt() - although I still have to
test it. Does this address your concern?

> 
> > If I take your t3 and t4 examples:
> > 
> > | time | system-wide | rt default | task-specific | user_defined | effective |                       
> > |------+-------------+------------+---------------+--------------+-----------|                       
> > | t3   |         600 |       1024 |           500 | Y            |       500 |                       
> > | t4   |         600 |       1024 |           800 | Y            |       600 |
> > 
> > If the values were *not* user-defined, then it would depend on the default
> > knob Qais is introducing:
> > 
> > | time | system-wide | rt default | task-specific | user_defined | effective |                       
> > |------+-------------+------------+---------------+--------------+-----------|                       
> > | t3   |         600 |       1024 |          1024 | N            |       600 |                       
> > | t4   |         600 |          0 |             0 | N            |         0 | 
> > 
> > It's not forcing the task-specific value to the system-wide RT value, it's
> > just using it as tweakable default. At least that's how I understand it,
> > did I miss something?
> 
> You right, but the "tweakable" default should be implemented the same
> way we do for CFS.

I am unable to see your point of view here, I am afraid :(

> 
> > > Thus, requests and constraints must always float independently and be
> > > used to compute the effective clamp at task wakeup time via:
> > > 
> > >    enqueue_task(rq, p, flags)
> > >      uclamp_rq_inc(rq, p)
> > >        uclamp_rq_inc_id(rq, p, clamp_id)
> > >          uclamp_eff_get(p, clamp_id)
> > >            uclamp_tg_restrict(p, clamp_id)
> > >      p->sched_class->enqueue_task(rq, p, flags)
> > > 
> > > where the task-specific request is restricted considering its task group
> > > effective value (the constraint).
> > > 
> > > Do note that the root task group effective value (for cfs) tasks is kept
> > > in sync with the system default value and propagated down to the
> > > effective value of all subgroups.
> > > 
> > > Do note also that the effective value is computed before calling into
> > > the scheduling class's enqueue_task(). Which means that we have the
> > > right value in place before we poke sugov.
> > > 
> > > Thus, a proper implementation of what you need should just
> > > replicate/generalize what we already do for cfs tasks.
> > > 
> > 
> > Reading
> > 
> >   7274a5c1bbec ("sched/uclamp: Propagate system defaults to the root group")
> > 
> > I see "The clamp values are not tunable at the level of the root task group".
> > This means that, for mobile systems where we want a default uclamp.min of 0
> > for RT tasks, we would need to create a cgroup for all RT tasks (and tweak
> > its uclamp.min, but from playing around a bit I see that defaults to 0).
> 
> That's not entirely true. In that patch we introduce cgroup support
> but, if you look at the code before that patch, for CFS tasks there is
> only:
>  - CFS task-specific values (min,max)=(0,1024) by default
>  - CFS system-wide tunables (min,max)=(1024,1024) by default
> and a change on the system-wide tunable allows for example to enforce
> a uclamp_max=200 on all tasks.
> 
> A similar solution can be implemented for RT tasks, where we have:
>  - RT task-specific values (min,max)=(1024,1024) by default
>  - RT system-wide tunables (min,max)=(1024,1024) by default
>  and a change on the system-wide tunable allows for example to enforce
>  a uclamp_min=200 on all tasks.
>  
> > (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> > when turning it on, but I think we don't have to if we just want things like
> > uclamp value propagation?)
> 
> No, the current design for CFS tasks works also on !CONFIG_CFS_GROUP_SCHED.
> That's because in this case:
>   - uclamp_tg_restrict() returns just the task requested value
>   - uclamp_eff_get() _always_ restricts the requested value considering
>     the system defaults
>  
> > It's quite more work than the simple thing Qais is introducing (and on both
> > user and kernel side).
> 
> But if in the future we will want to extend CGroups support to RT then
> we will feel the pains because we do the effective computation in two
> different places.
> 
> Do note that a proper CGroup support requires that the system default
> values defines the values for the root group and are consistently
> propagated down the hierarchy. Thus we need to add a dedicated pair of
> cgroup attributes, e.g. cpu.util.rt.{min.max}.
> 
> To recap, we don't need CGROUP support right now but just to add a new
> default tracking similar to what we do for CFS.
> 
> We already proposed such a support in one of the initial versions of
> the uclamp series:
>    Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
>    https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/
> where a:
>    static struct uclamp_se uclamp_default_perf[UCLAMP_CNT];
> was added which was kind of similar/dual to what we do for CFS tasks
> by using:
>    static struct uclamp_se uclamp_default[UCLAMP_CNT];
> 
> That patch will not apply anymore of course, but the concepts still
> hold:
>   1. a new pair of sysctl_sched_uclamp_util_{min,max}_rt are available
>      in userspace
>   2. in sysctl_sched_uclamp_handler() we updated uclamp_default_perf
>      (maybe better calling it uclamp_default_rt?) according to the
>      sysfs tunables
>   3. the uclamp_default_rt values are used in uclamp_eff_get() to
>      _always_ restrict the task specific value returned by
>      uclamp_tg_restrict() (on !CONFIG_UCLAMP_TASK_GROUP builds)
> 
> We will have to add some RT specific logic in uclamp_eff_get(), but
> likely something as simple as the following should be enough?
> 
> ---8<---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 44123b4d14e8..f4a6b120da91 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -897,6 +897,15 @@ uclamp_tg_restrict(struct task_struct *p, enum uclamp_id clamp_id)
>         return uc_req;
>  }
>  
> +static inline struct uclamp_se
> +uclamp_default_get(struct task_struct *p, enum uclamp_id clamp_id)
> +{
> +  if (rt_task(p))
> +    return  uclamp_default_rt[clamp_id];

Is this supposed to come from the new sysctl?

I guess your patch assumes we remove the hardcoded set of uclamp_min for RT
task on creation?

> +
> +  return  uclamp_default[clamp_id];
> +}
> +
>  /*
>   * The effective clamp bucket index of a task depends on, by increasing
>   * priority:
> @@ -909,7 +918,7 @@ static inline struct uclamp_se
>  uclamp_eff_get(struct task_struct *p, enum uclamp_id clamp_id)
>  {
>         struct uclamp_se uc_req = uclamp_tg_restrict(p, clamp_id);
> -       struct uclamp_se uc_max = uclamp_default[clamp_id];
> +       struct uclamp_se uc_max = uclamp_default_get(p, clamp_id);
>  
>         /* System default restrictions always apply */
>         if (unlikely(uc_req.value > uc_max.value))

If I set the RT default uclamp.min to 1024, but on task creation the RT task's
uclamp.min is 0, how will this enforce the 1024?

Maybe I need to look at this again on another day when I don't have a headache
and my head is clearer.

But if your objection is merely based on the way the lazy sync/update of the
default value is done, then this is workable.

As it stands, I don't see how what you propose is cleaner or better than what
I propose. I see my solution as more straightforward and obvious than this one.
But I'll ponder it more, as I could have easily missed some subtlety here
or there :-)

Thanks a lot!

--
Qais Yousef

> 
> ---8<---
> 
> Adding a proper cgroup support would also be better. Otherwise, maybe
> for the time being we can live with just system-defaults to constrain
> the task-specific values with something like:
> 
> ---8<---
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 44123b4d14e8..73638c0cc65a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -877,6 +877,10 @@ static inline struct uclamp_se
>  uclamp_tg_restrict(struct task_struct *p, enum uclamp_id clamp_id)
>  {
>         struct uclamp_se uc_req = p->uclamp_req[clamp_id];
> +
> +  if (rt_task(p))
> +    return uc_req;
> +
>  #ifdef CONFIG_UCLAMP_TASK_GROUP
>         struct uclamp_se uc_max;
> 
> ---8<---
> 
> Or accept (without the above) that we can constrain RT tasks using
> CFS clamps (i.e. cpu.util.{min,max}) when running in a CGroups.
> Which is ugly but can work for many use-cases if runtimes like Android
> take care to create a cgroup specifically dedicated to RT tasks. :/
> 
> -- 
> #include <best/regards.h>
> 
> Patrick Bellasi

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-09 13:00   ` Qais Yousef
@ 2020-01-10 13:39     ` Peter Zijlstra
  2020-01-12 23:35       ` Qais Yousef
  2020-01-10 13:42     ` Peter Zijlstra
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2020-01-10 13:39 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Dietmar Eggemann, Steven Rostedt, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, valentin.schneider, qperret, Patrick Bellasi,
	linux-kernel, linux-fsdevel

On Thu, Jan 09, 2020 at 01:00:58PM +0000, Qais Yousef wrote:
> On 01/08/20 14:44, Peter Zijlstra wrote:
> > On Fri, Dec 20, 2019 at 04:48:38PM +0000, Qais Yousef wrote:
> > > RT tasks by default try to run at the highest capacity/performance
> > > level. When uclamp is selected this default behavior is retained by
> > > enforcing the uclamp_util_min of the RT tasks to be
> > > uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> > > value.
> > > 
> > > See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> > > 
> > > On battery powered devices, this default behavior could consume more
> > > power, and it is desired to be able to tune it down. While uclamp allows
> > > tuning this by changing the uclamp_util_min of the individual tasks,
> > > this is cumbersome and error prone.
> > > 
> > > To control the default behavior globally by system admins and device
> > > integrators, introduce the new sysctl_sched_rt_uclamp_util_min to
> > > change the default uclamp_util_min value of the RT tasks.
> > > 
> > > Whenever the new default changes, it'd be applied on the next wakeup of
> > > the RT task, assuming that it still uses the system default value and
> > > not a user applied one.
> > 
> > This is because these RT tasks are not in a cgroup or not affected by
> > cgroup settings? I feel the justification is a little thin here.
> 
> The uclamp_min for RT tasks is always hardcoded to 1024 at the moment. So even
> if they belong to a cgroup->uclamp_min = 0, they'll still run at max frequency,
> no?

Argh, this is that counter intuitive max aggregate nonsense biting me.


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-09 13:00   ` Qais Yousef
  2020-01-10 13:39     ` Peter Zijlstra
@ 2020-01-10 13:42     ` Peter Zijlstra
  2020-01-12 23:31       ` Qais Yousef
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2020-01-10 13:42 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Ingo Molnar, Dietmar Eggemann, Steven Rostedt, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, valentin.schneider, qperret, Patrick Bellasi,
	linux-kernel, linux-fsdevel

On Thu, Jan 09, 2020 at 01:00:58PM +0000, Qais Yousef wrote:
> On 01/08/20 14:44, Peter Zijlstra wrote:

> > Did cpu_uclamp_write() forget to check for input<0 ?
> 
> Hmm just tried that and it seems so
> 
> # echo -1 > cpu.uclamp.min
> # cat cpu.uclamp.min
> 42949671.96
> 
> > In capacity_from_percent() we check for:
> > 
> >                 if (req.percent > UCLAMP_PERCENT_SCALE) {
> >                         req.ret = -ERANGE;
> >                         return req;
> >                 }
> 
> But req.percent is s64, maybe it should be u64?

		if ((u64)req.percent > UCLAMP_PERCENT_SCALE)

should do, I think.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-10 13:42     ` Peter Zijlstra
@ 2020-01-12 23:31       ` Qais Yousef
  0 siblings, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-12 23:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Dietmar Eggemann, Steven Rostedt, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, valentin.schneider, qperret, Patrick Bellasi,
	linux-kernel, linux-fsdevel

On 01/10/20 14:42, Peter Zijlstra wrote:
> On Thu, Jan 09, 2020 at 01:00:58PM +0000, Qais Yousef wrote:
> > On 01/08/20 14:44, Peter Zijlstra wrote:
> 
> > > Did cpu_uclamp_write() forget to check for input<0 ?
> > 
> > Hmm just tried that and it seems so
> > 
> > # echo -1 > cpu.uclamp.min
> > # cat cpu.uclamp.min
> > 42949671.96
> > 
> > > In capacity_from_percent() we check for:
> > > 
> > >                 if (req.percent > UCLAMP_PERCENT_SCALE) {
> > >                         req.ret = -ERANGE;
> > >                         return req;
> > >                 }
> > 
> > But req.percent is s64, maybe it should be u64?
> 
> 		if ((u64)req.percent > UCLAMP_PERCENT_SCALE)
> 
> should do, I think.

Okay I'll send a separate fix for that.

Cheers

--
Qais Yousef

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-10 13:39     ` Peter Zijlstra
@ 2020-01-12 23:35       ` Qais Yousef
  0 siblings, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-12 23:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Dietmar Eggemann, Steven Rostedt, Luis Chamberlain,
	Kees Cook, Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, valentin.schneider, qperret, Patrick Bellasi,
	linux-kernel, linux-fsdevel

On 01/10/20 14:39, Peter Zijlstra wrote:
> On Thu, Jan 09, 2020 at 01:00:58PM +0000, Qais Yousef wrote:
> > On 01/08/20 14:44, Peter Zijlstra wrote:
> > > On Fri, Dec 20, 2019 at 04:48:38PM +0000, Qais Yousef wrote:
> > > > RT tasks by default try to run at the highest capacity/performance
> > > > level. When uclamp is selected this default behavior is retained by
> > > > enforcing the uclamp_util_min of the RT tasks to be
> > > > uclamp_none(UCLAMP_MAX), which is SCHED_CAPACITY_SCALE; the maximum
> > > > value.
> > > > 
> > > > See commit 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks").
> > > > 
> > > > On battery powered devices, this default behavior could consume more
> > > > power, and it is desired to be able to tune it down. While uclamp allows
> > > > tuning this by changing the uclamp_util_min of the individual tasks,
> > > > this is cumbersome and error prone.
> > > > 
> > > > To control the default behavior globally by system admins and device
> > > > integrators, introduce the new sysctl_sched_rt_uclamp_util_min to
> > > > change the default uclamp_util_min value of the RT tasks.
> > > > 
> > > > Whenever the new default changes, it'd be applied on the next wakeup of
> > > > the RT task, assuming that it still uses the system default value and
> > > > not a user applied one.
> > > 
> > > This is because these RT tasks are not in a cgroup or not affected by
> > > cgroup settings? I feel the justification is a little thin here.
> > 
> > The uclamp_min for RT tasks is always hardcoded to 1024 at the moment. So even
> > if they belong to a cgroup->uclamp_min = 0, they'll still run at max frequency,
> > no?
> 
> Argh, this is that counter intuitive max aggregate nonsense biting me.

Yeah, I thought we already had a mechanism to control this, but once you try
to use it you find out we don't. Not conveniently, at least.

Are you okay with the sysctl to tune this behavior then?

Thanks

--
Qais Yousef


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-09  9:21     ` Patrick Bellasi
  2020-01-09 13:38       ` Qais Yousef
@ 2020-01-14 21:34       ` Qais Yousef
  2020-01-22 10:19         ` Patrick Bellasi
  1 sibling, 1 reply; 25+ messages in thread
From: Qais Yousef @ 2020-01-14 21:34 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: Valentin Schneider, Ingo Molnar, Peter Zijlstra,
	Dietmar Eggemann, Steven Rostedt, Luis Chamberlain, Kees Cook,
	Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, qperret, linux-kernel, linux-fsdevel

On 01/09/20 10:21, Patrick Bellasi wrote:
> That's not entirely true. In that patch we introduce cgroup support
> but, if you look at the code before that patch, for CFS tasks there is
> only:
>  - CFS task-specific values (min,max)=(0,1024) by default
>  - CFS system-wide tunables (min,max)=(1024,1024) by default
> and a change on the system-wide tunable allows for example to enforce
> a uclamp_max=200 on all tasks.
> 
> A similar solution can be implemented for RT tasks, where we have:
>  - RT task-specific values (min,max)=(1024,1024) by default
>  - RT system-wide tunables (min,max)=(1024,1024) by default
>  and a change on the system-wide tunable allows for example to enforce
>  a uclamp_min=200 on all tasks.

I feel I'm already getting lost in the complexity of the interaction here. Do
we really need to go that path?

So we will end up with a default system wide for all tasks + a CFS system wide
default + an RT system wide default?

As I understand it, we have a single system wide default now.

>  
> > (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> > when turning it on, but I think we don't have to if we just want things like
> > uclamp value propagation?)
> 
> No, the current design for CFS tasks works also on !CONFIG_CFS_GROUP_SCHED.
> That's because in this case:
>   - uclamp_tg_restrict() returns just the task requested value
>   - uclamp_eff_get() _always_ restricts the requested value considering
>     the system defaults
>  
> > It's quite more work than the simple thing Qais is introducing (and on both
> > user and kernel side).
> 
> But if in the future we will want to extend CGroups support to RT then
> we will feel the pains because we do the effective computation in two
> different places.

Hmm what you're suggesting here is that we want to have
cpu.rt.uclamp.{min,max}? I'm not sure I can agree this is a good idea.

It makes more sense to create a special group for all rt tasks rather than
treat rt tasks in a cgroup differently.

> 
> Do note that a proper CGroup support requires that the system default
> values defines the values for the root group and are consistently
> propagated down the hierarchy. Thus we need to add a dedicated pair of
> cgroup attributes, e.g. cpu.util.rt.{min,max}.
> 
> To recap, we don't need CGROUP support right now but just to add a new
> default tracking similar to what we do for CFS.
> 
> We already proposed such a support in one of the initial versions of
> the uclamp series:
>    Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
>    https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/

IIUC what you're suggesting is:

	1. Use the sysctl to specify the default_rt_uclamp_min
	2. Enforce this value in uclamp_eff_get() rather than my sync logic
	3. Remove the current hack to always set
	   rt_task->uclamp_min = uclamp_none(UCLAMP_MAX)

If I got it correctly I'd be happy to experiment with it if this is what
you're suggesting. Otherwise I'm afraid I'm failing to see the crux of the
problem you're trying to highlight.

Thanks

--
Qais Yousef


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-14 21:34       ` Qais Yousef
@ 2020-01-22 10:19         ` Patrick Bellasi
  2020-01-22 11:45           ` Qais Yousef
  0 siblings, 1 reply; 25+ messages in thread
From: Patrick Bellasi @ 2020-01-22 10:19 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Valentin Schneider, Ingo Molnar, Peter Zijlstra,
	Dietmar Eggemann, Steven Rostedt, Luis Chamberlain, Kees Cook,
	Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, qperret, linux-kernel, linux-fsdevel

On 14-Jan 21:34, Qais Yousef wrote:
> On 01/09/20 10:21, Patrick Bellasi wrote:
> > That's not entirely true. In that patch we introduce cgroup support
> > but, if you look at the code before that patch, for CFS tasks there is
> > only:
> >  - CFS task-specific values (min,max)=(0,1024) by default
> >  - CFS system-wide tunables (min,max)=(1024,1024) by default
> > and a change on the system-wide tunable allows for example to enforce
> > a uclamp_max=200 on all tasks.
> > 
> > A similar solution can be implemented for RT tasks, where we have:
> >  - RT task-specific values (min,max)=(1024,1024) by default
> >  - RT system-wide tunables (min,max)=(1024,1024) by default
> >  and a change on the system-wide tunable allows for example to enforce
> >  a uclamp_min=200 on all tasks.
> 
> I feel I'm already getting lost in the complexity of the interaction here. Do
> we really need to go that path?
> 
> So we will end up with a default system wide for all tasks + a CFS system wide
> default + an RT system wide default?
> 
> As I understand it, we have a single system wide default now.

Right now we have one system wide default and that's both for all
CFS/RT tasks, when cgroups are not in use, or for root group and
autogroup CFS/RT tasks, when cgroups are in use.

What I'm proposing is to limit the usage of the current system wide
default to CFS tasks only, while we add a similar new one specifically
for RT tasks.

At the end we will have two system wide defaults, not three.

> > > (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> > > when turning it on, but I think we don't have to if we just want things like
> > > uclamp value propagation?)
> > 
> > No, the current design for CFS tasks works also on !CONFIG_CFS_GROUP_SCHED.
> > That's because in this case:
> >   - uclamp_tg_restrict() returns just the task requested value
> >   - uclamp_eff_get() _always_ restricts the requested value considering
> >     the system defaults
> >  
> > > It's quite more work than the simple thing Qais is introducing (and on both
> > > user and kernel side).
> > 
> > But if in the future we will want to extend CGroups support to RT then
> > we will feel the pains because we do the effective computation in two
> > different places.
> 
> Hmm what you're suggesting here is that we want to have
> cpu.rt.uclamp.{min,max}? I'm not sure I can agree this is a good idea.

That's exactly what we already do for other controllers. For example,
if you look at the bandwidth controller, we have separate knobs for
CFS and RT tasks.

> It makes more sense to create a special group for all rt tasks rather than
> treat rt tasks in a cgroup differently.

Don't see why that should make more sense. This can work of course but
it would enforce a more strict configuration and usage of cgroups to
userspace.

I also have some doubts about this approach matching the delegation
model principles.

> > Do note that a proper CGroup support requires that the system default
> > values defines the values for the root group and are consistently
> > propagated down the hierarchy. Thus we need to add a dedicated pair of
> > cgroup attributes, e.g. cpu.util.rt.{min,max}.
> > 
> > To recap, we don't need CGROUP support right now but just to add a new
> > default tracking similar to what we do for CFS.
> >
> > We already proposed such a support in one of the initial versions of
> > the uclamp series:
> >    Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
> >    https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/
> 
> IIUC what you're suggesting is:
> 
> 	1. Use the sysctl to specify the default_rt_uclamp_min
> 	2. Enforce this value in uclamp_eff_get() rather than my sync logic
> 	3. Remove the current hack to always set
> 	   rt_task->uclamp_min = uclamp_none(UCLAMP_MAX)

Right, that's the essence...

> If I got it correctly I'd be happy to experiment with it if this is what
> you're suggesting. Otherwise I'm afraid I'm failing to see the crux of the
> problem you're trying to highlight.

... from what you write above I think you got it right.

In my previous posting:

   Message-ID: <20200109092137.GA2811@darkstar>
   https://lore.kernel.org/lkml/20200109092137.GA2811@darkstar/

there is also the code snippet which should be good enough to extend
uclamp_eff_get(). Apart from that, what remains is:
- to add the two new sysfs knobs for sysctl_sched_uclamp_util_{min,max}_rt
- make a call about how rt tasks in cgroups are clamped, a simple
  proposal is in the second snippet of my message above.

Best,
Patrick

-- 
#include <best/regards.h>

Patrick Bellasi


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-22 10:19         ` Patrick Bellasi
@ 2020-01-22 11:45           ` Qais Yousef
  2020-01-22 12:44             ` Patrick Bellasi
  0 siblings, 1 reply; 25+ messages in thread
From: Qais Yousef @ 2020-01-22 11:45 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: Valentin Schneider, Ingo Molnar, Peter Zijlstra,
	Dietmar Eggemann, Steven Rostedt, Luis Chamberlain, Kees Cook,
	Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, qperret, linux-kernel, linux-fsdevel

On 01/22/20 11:19, Patrick Bellasi wrote:
> On 14-Jan 21:34, Qais Yousef wrote:
> > On 01/09/20 10:21, Patrick Bellasi wrote:
> > > That's not entirely true. In that patch we introduce cgroup support
> > > but, if you look at the code before that patch, for CFS tasks there is
> > > only:
> > >  - CFS task-specific values (min,max)=(0,1024) by default
> > >  - CFS system-wide tunables (min,max)=(1024,1024) by default
> > > and a change on the system-wide tunable allows for example to enforce
> > > a uclamp_max=200 on all tasks.
> > > 
> > > A similar solution can be implemented for RT tasks, where we have:
> > >  - RT task-specific values (min,max)=(1024,1024) by default
> > >  - RT system-wide tunables (min,max)=(1024,1024) by default
> > >  and a change on the system-wide tunable allows for example to enforce
> > >  a uclamp_min=200 on all tasks.
> > 
> > I feel I'm already getting lost in the complexity of the interaction here. Do
> > we really need to go that path?
> > 
> > So we will end up with a default system wide for all tasks + a CFS system wide
> > default + an RT system wide default?
> > 
> > As I understand it, we have a single system wide default now.
> 
> Right now we have one system wide default and that's both for all
> CFS/RT tasks, when cgroups are not in use, or for root group and
> autogroup CFS/RT tasks, when cgroups are in use.
> 
> What I'm proposing is to limit the usage of the current system wide
> default to CFS tasks only, while we add a similar new one specifically
> for RT tasks.

This change of behavior will be user-visible. I doubt anyone depends on it yet,
so if we want to change it now is the time.

The main issue I see here is that the system wide limits are *always*
enforced. They guarantee a global restriction, which you're proposing to
split into CFS and RT system wide global restrictions.

But what I am proposing here is setting the *default* fork value of RT uclamp
min, which is currently hardcoded to uclamp_none(UCLAMP_MAX) = 1024.

I think the two concepts are different from a system wide RT control. Or I am
still confused about your vision of how this should really work. I don't see
how what you propose could solve my problem.

For example:

	sysctl_sched_uclamp_util_min = 1024

means that tasks can set their uclamp_min to [0:1024], but

	sysctl_sched_uclamp_util_min = 800

means that tasks can set their uclamp_min to [0:800]

Which, when splitting this into CFS and RT controls, would still mean
controlling the possible range the tasks can use.

But what I want here is that

	sysctl_sched_rt_default_uclamp_util_min = 800

Will set p->uclamp_min = 800 when a new RT task is forked and will stay like
this until the user modifies it. And update all current RT tasks that are still
using the default value to the new one whenever this changes.

If this value is outside the system wide allowed range, it'll be clamped
correctly by uclamp_eff_get().

Note that the default fork values for CFS are (0, 1024) and we don't have a way
to change this at the moment. Can be easily done if someone comes up with a use
case for it.

> 
> At the end we will have two system wide defaults, not three.
> 
> > > > (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> > > > when turning it on, but I think we don't have to if we just want things like
> > > > uclamp value propagation?)
> > > 
> > > No, the current design for CFS tasks works also on !CONFIG_CFS_GROUP_SCHED.
> > > That's because in this case:
> > >   - uclamp_tg_restrict() returns just the task requested value
> > >   - uclamp_eff_get() _always_ restricts the requested value considering
> > >     the system defaults
> > >  
> > > > It's quite more work than the simple thing Qais is introducing (and on both
> > > > user and kernel side).
> > > 
> > > But if in the future we will want to extend CGroups support to RT then
> > > we will feel the pains because we do the effective computation in two
> > > different places.
> > 
> > Hmm what you're suggesting here is that we want to have
> > cpu.rt.uclamp.{min,max}? I'm not sure I can agree this is a good idea.
> 
> That's exactly what we already do for other controllers. For example,
> if you look at the bandwidth controller, we have separate knobs for
> CFS and RT tasks.
> 
> > It makes more sense to create a special group for all rt tasks rather than
> > treat rt tasks in a cgroup differently.
> 
> Don't see why that should make more sense. This can work of course but
> it would enforce a more strict configuration and usage of cgroups to
> userspace.

It makes sense to me because cgroups offer a logical separation, and if tasks
within the same group need to be treated differently then they're logically
separate and consequently need to be grouped differently.

The uclamp values are not RT-only properties.

I can see the convenience of doing it the way you propose though. Not sure
I agree that the added complexity is worth it.

> 
> I also have some doubths about this approach matching the delegation
> model principles.

I don't know about that TBH.

> 
> > > Do note that a proper CGroup support requires that the system default
> > > values defines the values for the root group and are consistently
> > > propagated down the hierarchy. Thus we need to add a dedicated pair of
> > > cgroup attributes, e.g. cpu.util.rt.{min,max}.
> > > 
> > > To recap, we don't need CGROUP support right now but just to add a new
> > > default tracking similar to what we do for CFS.
> > >
> > > We already proposed such a support in one of the initial versions of
> > > the uclamp series:
> > >    Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
> > >    https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/
> > 
> > IIUC what you're suggesting is:
> > 
> > 	1. Use the sysctl to specify the default_rt_uclamp_min
> > 	2. Enforce this value in uclamp_eff_get() rather than my sync logic
> > 	3. Remove the current hack to always set
> > 	   rt_task->uclamp_min = uclamp_none(UCLAMP_MAX)
> 
> Right, that's the essence...
> 
> > If I got it correctly I'd be happy to experiment with it if this is what
> > you're suggesting. Otherwise I'm afraid I'm failing to see the crust of the
> > problem you're trying to highlight.
> 
> ... from what you write above I think you got it right.
> 
> In my previous posting:
> 
>    Message-ID: <20200109092137.GA2811@darkstar>
>    https://lore.kernel.org/lkml/20200109092137.GA2811@darkstar/
> 
> there is also the code snippet which should be good enough to extend
> uclamp_eff_get(). Apart from that, what remains is:

I still have my doubts this will achieve what I want, but hopefully I will have
time to experiment with this today. My worry is that this function enforces
restrictions, but what I want to do is set a default. I still can't see how
they are the same, which your answer hints they are.

> - to add the two new sysfs knobs for sysctl_sched_uclamp_util_{min,max}_rt

See my answer above.

> - make a call about how rt tasks in cgroups are clamped, a simple
>   proposal is in the second snippet of my message above.

I'm not keen on changing how this works and it is outside of the scope of this
patch anyway. Or at least I don't see how they are related yet.

Thanks!

--
Qais Yousef


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-22 11:45           ` Qais Yousef
@ 2020-01-22 12:44             ` Patrick Bellasi
  2020-01-22 14:57               ` Qais Yousef
  0 siblings, 1 reply; 25+ messages in thread
From: Patrick Bellasi @ 2020-01-22 12:44 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Valentin Schneider, Ingo Molnar, Peter Zijlstra,
	Dietmar Eggemann, Steven Rostedt, Luis Chamberlain, Kees Cook,
	Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, qperret, linux-kernel, linux-fsdevel

On 22-Jan 11:45, Qais Yousef wrote:
> On 01/22/20 11:19, Patrick Bellasi wrote:
> > On 14-Jan 21:34, Qais Yousef wrote:
> > > On 01/09/20 10:21, Patrick Bellasi wrote:
> > > > That's not entirely true. In that patch we introduce cgroup support
> > > > but, if you look at the code before that patch, for CFS tasks there is
> > > > only:
> > > >  - CFS task-specific values (min,max)=(0,1024) by default
> > > >  - CFS system-wide tunables (min,max)=(1024,1024) by default
> > > > and a change on the system-wide tunable allows for example to enforce
> > > > a uclamp_max=200 on all tasks.
> > > > 
> > > > A similar solution can be implemented for RT tasks, where we have:
> > > >  - RT task-specific values (min,max)=(1024,1024) by default
> > > >  - RT system-wide tunables (min,max)=(1024,1024) by default
> > > >  and a change on the system-wide tunable allows for example to enforce
> > > >  a uclamp_min=200 on all tasks.
> > > 
> > > I feel I'm already getting lost in the complexity of the interaction here. Do
> > > we really need to go that path?
> > > 
> > > So we will end up with a default system wide for all tasks + a CFS system wide
> > > default + an RT system wide default?
> > > 
> > > As I understand it, we have a single system wide default now.
> > 
> > Right now we have one system wide default and that's both for all
> > CFS/RT tasks, when cgroups are not in use, or for root group and
> > autogroup CFS/RT tasks, when cgroups are in use.
> > 
> > What I'm proposing is to limit the usage of the current system wide
> > default to CFS tasks only, while we add a similar new one specifically
> > for RT tasks.
> 
> This change of behavior will be user-visible. I doubt anyone depends on it yet,
> so if we want to change it now is the time.

As it will be forcing users to move tasks into a different group.

> The main issue I see here is that the system wide limits are *always*
> enforced. They guarantee a global restriction, which you're proposing to
> split into CFS and RT system wide global restrictions.

For RT tasks, since we don't have a utilization signal, we always
clamp the 1024 value. That's why forcing a default can be obtained by
just setting a constraint.

Do note that, right now, we don't have a concept of a default _requested_
clamp value. Perhaps the name uclamp_default[UCLAMP_CNT] has been
misleading, but what that value does is just set the default
_max constraint_ value. If you don't change them, we allow min,max
constraints to have any value.

> But what I am proposing here is setting the *default* fork value of RT uclamp
> min, which is currently hardcoded to uclamp_none(UCLAMP_MAX) = 1024.

Yes, I got what you propose. And that's what I don't like.

The default _requested_ clamp value for RT tasks is 1024 by definition,
since we decided that RT tasks should run at max OPP by default.

> I think the 2 concepts are different to a system wide RT control. Or I am still
> confused about your vision of how this should really work. I don't see how what
> you propose could solve my problem.
> 
> For example:
> 
> 	sysctl_sched_uclamp_util_min = 1024
> 
> means that tasks can set their uclamp_min to [0:1024], but
> 
> 	sysctl_sched_uclamp_util_min = 800
> 
> means that tasks can set their uclamp_min to [0:800]

You don't have to play with the min clamp.

To recap, for RT tasks:
 - the default _requested_ value is 1024
 - we always clamp the requested value considering the system wide
   clamps
 - the system wide clamps for RT tasks will be, by default, set to:
   (util_min,util_max) = (1024,1024)

According to the above we have the default behaviors, RT tasks running
at max OPP.

Now, let's say you want your RT tasks to run only up to 200.
Well, what you need to do is to clamp the default _requested_ value
(1024) with a clamp range like (*,200):

   rt_clamped_util = uclamp_util(1024, (*,200)) = 200

IOW, what you will use to reduce the OPP is something like:

     sysctl_sched_uclamp_util_rt_max = 200

You just need to reduce the MAX clamp constraint, which will enforce that
whatever the task requests and/or its min is not greater than 200.


> Which, when splitting this into CFS and RT controls, would still mean
> controlling the possible range the tasks can use.
> 
> But what I want here is that
> 
> 	sysctl_sched_rt_default_uclamp_util_min = 800
> 
> Will set p->uclamp_min = 800 when a new RT task is forked and will stay like
> this until the user modifies it. And update all current RT tasks that are still
> using the default value to the new one whenever this changes.

Right, that will work, but you are adding design concepts on top of a
design which already allows you to get what you want: just by properly
setting the constraints.

> If this value is outside the system wide allowed range, it'll be clamped
> correctly by uclamp_eff_get().
> 
> Note that the default fork values for CFS are (0, 1024) and we don't have a way
> to change this at the moment.

That's not true: the default value at fork is to copy the parent's
clamps, unless you ask for a reset on fork.


> > 
> > At the end we will have two system wide defaults, not three.
> > 
> > > > > (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> > > > > when turning it on, but I think we don't have to if we just want things like
> > > > > uclamp value propagation?)
> > > > 
> > > > No, the current design for CFS tasks works also on !CONFIG_CFS_GROUP_SCHED.
> > > > That's because in this case:
> > > >   - uclamp_tg_restrict() returns just the task requested value
> > > >   - uclamp_eff_get() _always_ restricts the requested value considering
> > > >     the system defaults
> > > >  
> > > > > It's quite more work than the simple thing Qais is introducing (and on both
> > > > > user and kernel side).
> > > > 
> > > > But if in the future we will want to extend CGroups support to RT then
> > > > we will feel the pains because we do the effective computation in two
> > > > different places.
> > > 
> > > Hmm what you're suggesting here is that we want to have
> > > cpu.rt.uclamp.{min,max}? I'm not sure I can agree this is a good idea.
> > 
> > That's exactly what we already do for other controllers. For example,
> > if you look at the bandwidth controller, we have separate knobs for
> > CFS and RT tasks.
> > 
> > > It makes more sense to create a special group for all rt tasks rather than
> > > treat rt tasks in a cgroup differently.
> > 
> > Don't see why that should make more sense. This can work of course but
> > it would enforce a more strict configuration and usage of cgroups to
> > userspace.
> 
> It makes sense to me because cgroups offer a logical separation, and if tasks
> within the same group need to be treated differently then they're logically
> separate and consequently need to be grouped differently.

And that's why we can have knobs for different scheduling classes in
the same task group.

> The uclamp values are not RT-only properties.

In the current implementation, but that was certainly not the original
intent. Ref to:

   Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
   https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/

I did remove the CFS/RT specialization just to keep uclamp simple for
the first merge. But now that we have it in and people understand the
RT use-case better, it's possible to complete the design with
specialized CFS and RT default constraints and knobs.

I also remember that Vincent pointed out, at a certain point, that RT
and CFS tasks should be distinguished internally when it comes to
aggregate the constraints. This will require more work, but what I'm
proposing since the beginning is the first step toward this solution.

> I can see the convenience of doing it the way you propose though. Not sure
> I agree that the added complexity is worth it.

I keep not seeing the added complexity. We just need to add two new system
wide knobs and a small filter in the uclamp_eff_get().

> > I also have some doubts about this approach matching the delegation
> > model principles.
> 
> I don't know about that TBH.

Then it's worth a read. I remember I discovered quite a lot of
caveats about how cgroups are supposed to be supported and used.

I suspect that reflecting your default concept into something
playing well with cgroups could be tricky. If nothing else, the
constraints propagation approach we use now is something Tejun already
gave us his blessing on.

> > > > Do note that a proper CGroup support requires that the system default
> > > > values defines the values for the root group and are consistently
> > > > propagated down the hierarchy. Thus we need to add a dedicated pair of
> > > > cgroup attributes, e.g. cpu.util.rt.{min,max}.
> > > > 
> > > > To recap, we don't need CGROUP support right now but just to add a new
> > > > default tracking similar to what we do for CFS.
> > > >
> > > > We already proposed such a support in one of the initial versions of
> > > > the uclamp series:
> > > >    Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
> > > >    https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/
> > > 
> > > IIUC what you're suggesting is:
> > > 
> > > 	1. Use the sysctl to specify the default_rt_uclamp_min
> > > 	2. Enforce this value in uclamp_eff_get() rather than my sync logic
> > > 	3. Remove the current hack to always set
> > > 	   rt_task->uclamp_min = uclamp_none(UCLAMP_MAX)
> > 
> > Right, that's the essence...
> > 
> > > If I got it correctly I'd be happy to experiment with it if this is what
> > > you're suggesting. Otherwise I'm afraid I'm failing to see the crux of the
> > > problem you're trying to highlight.
> > 
> > ... from what you write above I think you got it right.
> > 
> > In my previous posting:
> > 
> >    Message-ID: <20200109092137.GA2811@darkstar>
> >    https://lore.kernel.org/lkml/20200109092137.GA2811@darkstar/
> > 
> > there is also the code snippet which should be good enough to extend
> > uclamp_eff_get(). Apart from that, what remains is:
> 
> I still have my doubts this will achieve what I want, but hopefully will have
> time to experiment with this today.

Right, just remember you need to reduce the max clamp, not the min as
you were proposing in the example above.

> My worry is that this function enforces
> restrictions, but what I want to do is set a default. I still can't see how
> they are the same, which your answer hints they are.

It all boils down to see that:
   
   rt_clamped_util = uclamp_util(1024, (*,200)) = 200

> > - to add the two new sysfs knobs for sysctl_sched_uclamp_util_{min,max}_rt
> 
> See my answer above.
> 
> > - make a call about how rt tasks in cgroups are clamped, a simple
> >   proposal is in the second snippet of my message above.
> 
> I'm not keen on changing how this works and it is outside of the scope of this
> patch anyway. Or at least I don't see how they are related yet.

Sure you'll realize once (if) you add the small uclamp_eff_get()
change I'm proposing.

Best,
Patrick

-- 
#include <best/regards.h>

Patrick Bellasi


* Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
  2020-01-22 12:44             ` Patrick Bellasi
@ 2020-01-22 14:57               ` Qais Yousef
  0 siblings, 0 replies; 25+ messages in thread
From: Qais Yousef @ 2020-01-22 14:57 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: Valentin Schneider, Ingo Molnar, Peter Zijlstra,
	Dietmar Eggemann, Steven Rostedt, Luis Chamberlain, Kees Cook,
	Iurii Zaikin, Juri Lelli, Vincent Guittot, Ben Segall,
	Mel Gorman, qperret, linux-kernel, linux-fsdevel

On 01/22/20 13:44, Patrick Bellasi wrote:
> On 22-Jan 11:45, Qais Yousef wrote:
> > On 01/22/20 11:19, Patrick Bellasi wrote:
> > > On 14-Jan 21:34, Qais Yousef wrote:
> > > > On 01/09/20 10:21, Patrick Bellasi wrote:
> > > > > That's not entirely true. In that patch we introduce cgroup support
> > > > > but, if you look at the code before that patch, for CFS tasks there is
> > > > > only:
> > > > >  - CFS task-specific values (min,max)=(0,1024) by default
> > > > >  - CFS system-wide tunables (min,max)=(1024,1024) by default
> > > > > and a change on the system-wide tunable allows for example to enforce
> > > > > a uclamp_max=200 on all tasks.
> > > > > 
> > > > > A similar solution can be implemented for RT tasks, where we have:
> > > > >  - RT task-specific values (min,max)=(1024,1024) by default
> > > > >  - RT system-wide tunables (min,max)=(1024,1024) by default
> > > > >  and a change on the system-wide tunable allows for example to enforce
> > > > >  a uclamp_min=200 on all tasks.
> > > > 
> > > > I feel I'm already getting lost in the complexity of the interaction here. Do
> > > > we really need to go that path?
> > > > 
> > > > So we will end up with a default system wide for all tasks + a CFS system wide
> > > > default + an RT system wide default?
> > > > 
> > > > As I understand it, we have a single system wide default now.
> > > 
> > > Right now we have one system wide default and that's both for all
> > > CFS/RT tasks, when cgroups are not in use, or for root group and
> > > autogroup CFS/RT tasks, when cgroups are in use.
> > > 
> > > What I'm proposing is to limit the usage of the current system wide
> > > default to CFS tasks only, while we add a similar new one specifically
> > > for RT tasks.
> > 
> > This change of behavior will be user-visible. I doubt anyone depends on it yet,
> > so if we want to change it now is the time.
> 
> As it will be forcing users to move tasks into a different group.

I don't understand.

What I meant is that there will be possible ABI breakage since the procfs
entries will disappear if we change the behavior. I don't think that's
a problem since it's unlikely there are dependent users yet.

> 
> > The main issue I see here is that the system wide are *always* enforced. They
> > guarantee a global restriction. Which you're proposing to split into CFS and RT
> > system wide global restrictions.
> 
> For RT tasks, since we don't have a utilization signal, we always
> clamp the 1024 value. That's why forcing a default can be obtained by
> just setting a constraint.
> 
> Do note that, right now we don't have a concept of default _requested_
> clamp value. Perhaps having used uclamp_default[UCLAMP_CNT] has been
> misleading but what that value does is just setting the default
> _max constraint_ value. If you don't change them, we allow min,max
> constraints to have any value.
> 
> > But what I am proposing here is setting the *default* fork value of RT uclamp
> > min. Which is currently hardcoded to uclamp_none(UCLAMP_MAX) = 1024.
> 
> Yes, I got what you propose. And that's what I don't like.

Because... ?

> 
> The default _requested_ clamp value for an RT task is 1024 by definition,
> since we decided that RT tasks should run to max OPP by default.

Yes, and they will remain so with this patch. But what we need is a way
to control this default value.

> 
> > I think the 2 concepts are different to a system wide RT control. Or I am still
> > confused about your vision of how this should really work. I don't see how what
> > you propose could solve my problem.
> > 
> > For example:
> > 
> > 	sysctl_sched_uclamp_util_min = 1024
> > 
> > means that tasks can set their uclamp_min to [0:1024], but
> > 
> > 	sysctl_sched_uclamp_util_min = 800
> > 
> > means that tasks can set their uclamp_min to [0:800]
> 
> You don't have to play with the min clamp.
> 
> To recap, for RT tasks:
>  - the default _requested_ value is 1024
>  - we always clamp the requested value considering the system wide
>    clamps
>  - the system wide clamps for RT tasks will be, by default, set to:
>    (util_min,util_max) = (1024,1024)
> 
> According to the above we have the default behaviors, RT tasks running
> at max OPP.
> 
> Now, let's say you want your RT tasks to run only up to 200.
> Well, what you need to do is to clamp the default _requested_ value
> (1024) with a clamp range like (*,200):
> 
>    rt_clamped_util = uclamp_util(1024, (*,200)) = 200
> 
> IOW, what you will use to reduce the OPP is something like:
> 
>      sysctl_sched_uclamp_util_rt_max = 200
> 
> You just need to reduce the MAX clamp constraint, which will enforce
> whatever the task requests and/or its min to be not greater than 200.

This won't work. You don't just override the default value here; you disallow
*all* RT tasks from going above 200, which is not what we want to achieve here.

What I want to achieve here is that, when an RT task is forked, the system
administrator can override the default value to be anything other than 1024,
which is power hungry and not what most battery powered devices want.

But if there's an app that creates an RT task and has a special request to set
its uclamp to 1024, then I don't think we should disallow that, unless the sys
admin uses the system wide default values to restrict this (like an extreme
power saving mode).

Doing it the way you suggest can't differentiate between a sensible default RT
behavior (or rather a good compromise between RT performance vs power) that a
sys integrator can set up and the generic need for a task to change its uclamp
value to boost/cap itself.

> 
> 
> > Which when splitting this to CFS and RT controls, they'd still mean control
> > the possible range the tasks can use.
> > 
> > But what I want here is that
> > 
> > 	sysctl_sched_rt_default_uclamp_util_min = 800
> > 
> > Will set p->uclamp_min = 800 when a new RT task is forked and will stay like
> > this until the user modifies it. And update all current RT tasks that are still
> > using the default value to the new one when ever this changes.
> 
> Right, that will work, but you are adding design concepts on top of a
> design which allows already you to get what you want: just by properly
> setting the constraints.

That's where I am confused. I think this is rather an extension of what we
have: we currently hardcode a value, and this patch makes that value runtime
configurable. I don't see why it's a new design concept.

> 
> > If this value is outside the system wide allowed range; it'll be clamped
> > correctly by uclamp_eff_get().
> > 
> > Note that the default fork values for CFS are (0, 1024) and we don't have a way
> > to change this at the moment.
> 
> That's not true, the default value at FORK is to copy the parent
> clamps. Unless you ask for a reset on fork.

Which are inherited from init, whose clamps are set to (0, 1024). So unless
a task requests to change its value, everything will be (0, 1024) by default.

But yes, I should have mentioned this exception here.

> 
> 
> > > 
> > > At the end we will have two system wide defaults, not three.
> > > 
> > > > > > (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> > > > > > when turning it on, but I think we don't have to if we just want things like
> > > > > > uclamp value propagation?)
> > > > > 
> > > > > No, the current design for CFS tasks works also on !CONFIG_CFS_GROUP_SCHED.
> > > > > That's because in this case:
> > > > >   - uclamp_tg_restrict() returns just the task requested value
> > > > >   - uclamp_eff_get() _always_ restricts the requested value considering
> > > > >     the system defaults
> > > > >  
> > > > > > It's quite more work than the simple thing Qais is introducing (and on both
> > > > > > user and kernel side).
> > > > > 
> > > > > But if in the future we will want to extend CGroups support to RT then
> > > > > we will feel the pains because we do the effective computation in two
> > > > > different places.
> > > > 
> > > > Hmm what you're suggesting here is that we want to have
> > > > cpu.rt.uclamp.{min,max}? I'm not sure I can agree this is a good idea.
> > > 
> > > That's exactly what we already do for other controllers. For example,
> > > if you look at the bandwidth controller, we have separate knobs for
> > > CFS and RT tasks.
> > > 
> > > > It makes more sense to create a special group for all rt tasks rather than
> > > > treat rt tasks in a cgroup differently.
> > > 
> > > Don't see why that should make more sense. This can work of course but
> > > it would enforce a more strict configuration and usage of cgroups to
> > > userspace.
> > 
> > It makes sense to me because cgroups offer a logical separation, and if tasks
> > within the same group need to be treated differently then they're logically
> > separated and need to be grouped differently consequently.
> 
> And that's why we can have knobs for different scheduling classes in
> the same task group.
> 
> > The uclamp values are not RT only properties.
> 
> In the current implementation, but that was not certainly the original
> intent. Ref to:
> 
>    Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
>    https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/
> 
> I did remove the CFS/RT specialization just to keep uclamp simple for
> the first merge. But now that we have it in and people understand the
> RT use-case better, it's just possible to complete the design with
> specialized CFS and RT default constraints and knobs.

But it's already an ABI, no? We can't just change an ABI, although at this
moment in time maybe we can. But I need the maintainers to chip in and agree on
this as I wasn't part of the whole uclamp discussion in the past.

> 
> I also remember that Vincent pointed out, at a certain point, that RT
> and CFS tasks should be distinguished internally when it comes to
> aggregate the constraints. This will require more work, but what I'm
> proposing since the beginning is the first step toward this solution.

Like above. I was under the impression that what was merged is the final shape
of how uclamp needs to look and that no design pieces are missing. If pieces
are missing, then we must react quickly before dependent apps are created and
it becomes hard to modify without creating a regression.

> 
> > I can see the convenience of doing it the way you propose though. Not sure
> > I agree about the added complexity worth it.
> 
> I keep not seeing the added complexity. We just need to add two new system
> wide knobs and a small filter in the uclamp_eff_get().

I don't think the interaction between all these knobs, the per-task uclamp
settings, and the cgroup uclamp settings is trivial.

For the record, several of us were under the impression that we could control
the default RT task behavior using what we have, but we found out we can't (not
conveniently at least), hence the need for this patch.

> 
> > > I also have some doubths about this approach matching the delegation
> > > model principles.
> > 
> > I don't know about that TBH.
> 
> Then it's worth a read. I remember I discovered quite a lot of
> caveats about how cgroups are supposed to be supported and used.

I will if I find I need to play with cgroup settings. So far I don't see
that I need to.

> 
> I suspect that reflecting your default concept into something that
> plays well with cgroups could be tricky. If nothing else, the
> constraint propagation approach we use now is something Tejun already
> gave us his blessing on.

The way this patch currently works, there's no need for any cgroup changes.
Everything should continue to work as intended, or at least as currently
merged.

If I set sysctl_sched_rt_default_uclamp_util_min = 0, but a cgroup sets the
cpu.uclamp.min = 512, the cgroup will still be applied correctly.

Currently the p->uclamp_min of all RT tasks is 1024, so a cpu.uclamp.min = 512
will continue to allow them to run at 1024, unless cpu.uclamp.max is set to
anything < 1024, in which case they'd be clamped to that.

> 
> > > > > Do note that a proper CGroup support requires that the system default
> > > > > values defines the values for the root group and are consistently
> > > > > propagated down the hierarchy. Thus we need to add a dedicated pair of
> > > > > cgroup attributes, e.g. cpu.util.rt.{min.max}.
> > > > > 
> > > > > To recap, we don't need CGROUP support right now but just to add a new
> > > > > default tracking similar to what we do for CFS.
> > > > >
> > > > > We already proposed such a support in one of the initial versions of
> > > > > the uclamp series:
> > > > >    Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
> > > > >    https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/
> > > > 
> > > > IIUC what you're suggesting is:
> > > > 
> > > > 	1. Use the sysctl to specify the default_rt_uclamp_min
> > > > 	2. Enforce this value in uclamp_eff_get() rather than my sync logic
> > > > 	3. Remove the current hack to always set
> > > > 	   rt_task->uclamp_min = uclamp_none(UCLAMP_MAX)
> > > 
> > > Right, that's the essence...
> > > 
> > > > If I got it correctly I'd be happy to experiment with it if this is what
> > > > you're suggesting. Otherwise I'm afraid I'm failing to see the crux of the
> > > > problem you're trying to highlight.
> > > 
> > > ... from what your write above I think you got it right.
> > > 
> > > In my previous posting:
> > > 
> > >    Message-ID: <20200109092137.GA2811@darkstar>
> > >    https://lore.kernel.org/lkml/20200109092137.GA2811@darkstar/
> > > 
> > > there is also the code snippet which should be good enough to extend
> > > uclamp_eff_get(). Apart from that, what remains is:
> > 
> > I still have my doubts this will achieve what I want, but hopefully will have
> > time to experiment with this today.
> 
> Right, just remember you need to reduce the max clamp, not the min as
> you were proposing in the example above.

Hmm okay I'll keep this in mind when I get to writing the code. But as it
stands I see we still have a major disagreement on what this sysctl knob means,
so implementing that doesn't make sense yet.

> 
> > My worry is that this function enforces
> > restrictions, but what I want to do is set a default. I still can't see how
> > they are the same; which your answers hints they are.
> 
> It all boils down to see that:
>    
>    rt_clamped_util = uclamp_util(1024, (*,200)) = 200
> 
> > > - to add the two new sysctl knobs for sysctl_sched_uclamp_util_{min,max}_rt
> > 
> > See my answer above.
> > 
> > > - make a call about how rt tasks in cgroups are clamped, a simple
> > >   proposal is in the second snippet of my message above.
> > 
> > I'm not keen on changing how this works and it is outside of the scope of this
> > patch anyway. Or at least I don't see how they are related yet.
> 
> Sure you'll realize once (if) you add the small uclamp_eff_get()
> change I'm proposing.

I don't mind changing the implementation to use uclamp_eff_get() instead of the
lazy sync approach. But I think what we disagree on is what this sysctl knob
changes. So I'll wait for this discussion to settle first.

Thanks

--
Qais Yousef


end of thread, other threads:[~2020-01-22 14:57 UTC | newest]

Thread overview: 25+ messages
2019-12-20 16:48 [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min Qais Yousef
2020-01-07 13:42 ` Quentin Perret
2020-01-07 19:30   ` Dietmar Eggemann
2020-01-08  9:51     ` Quentin Perret
2020-01-08 19:16       ` Patrick Bellasi
2020-01-09 11:36       ` Qais Yousef
2020-01-09 11:16     ` Qais Yousef
2020-01-09 11:12   ` Qais Yousef
2020-01-08 13:44 ` Peter Zijlstra
2020-01-08 19:08   ` Patrick Bellasi
2020-01-09 13:00   ` Qais Yousef
2020-01-10 13:39     ` Peter Zijlstra
2020-01-12 23:35       ` Qais Yousef
2020-01-10 13:42     ` Peter Zijlstra
2020-01-12 23:31       ` Qais Yousef
2020-01-08 18:56 ` Patrick Bellasi
2020-01-09  1:35   ` Valentin Schneider
2020-01-09  9:21     ` Patrick Bellasi
2020-01-09 13:38       ` Qais Yousef
2020-01-14 21:34       ` Qais Yousef
2020-01-22 10:19         ` Patrick Bellasi
2020-01-22 11:45           ` Qais Yousef
2020-01-22 12:44             ` Patrick Bellasi
2020-01-22 14:57               ` Qais Yousef
2020-01-09 13:15     ` Qais Yousef
