All of lore.kernel.org
 help / color / mirror / Atom feed
From: Quentin Perret <qperret@google.com>
To: Qais Yousef <qais.yousef@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Iurii Zaikin <yzaikin@google.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	valentin.schneider@arm.com,
	Patrick Bellasi <patrick.bellasi@matbug.net>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	kernel-team@android.com
Subject: Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
Date: Tue, 7 Jan 2020 13:42:34 +0000	[thread overview]
Message-ID: <20200107134234.GA158998@google.com> (raw)
In-Reply-To: <20191220164838.31619-1-qais.yousef@arm.com>

Hi Qais,

On Friday 20 Dec 2019 at 16:48:38 (+0000), Qais Yousef wrote:
> +/*
> + * By default RT tasks run at the maximum performance point/capacity of the
> + * system. Uclamp enforces this by always setting UCLAMP_MIN of RT tasks to
> + * SCHED_CAPACITY_SCALE.
> + *
> + * This knob allows admins to change the default behavior when uclamp is being
> + * used. In battery powered devices particularly running at the maximum
> + * capacity will increase energy consumption and shorten the battery life.
> + *
> + * This knob only affects the default value RT uses when a new RT task is
> + * forked or has just changed policy to RT and no uclamp user settings were
> + * applied (ie: the task didn't modify the default value to a new value.
> + *
> + * This knob will not override the system default values defined above.
> + */

I suppose this comment could go in the sysctl doc file instead ?

> +unsigned int sysctl_sched_rt_uclamp_util_min = SCHED_CAPACITY_SCALE;

I would suggest renaming the knob as 'sysctl_sched_rt_default_uclamp_min'
or something along those lines to make it clear it's a default value.

And for consistency with the existing code, perhaps set the default to
uclamp_none(UCLAMP_MAX) instead of an explicit SCHED_CAPACITY_SCALE?

>  /* All clamps are required to be less or equal than these values */
>  static struct uclamp_se uclamp_default[UCLAMP_CNT];
>  
> @@ -919,6 +936,14 @@ uclamp_eff_get(struct task_struct *p, enum uclamp_id clamp_id)
>  	return uc_req;
>  }
>  
> +void uclamp_rt_sync_default_util_min(struct task_struct *p)
> +{
> +	struct uclamp_se *uc_se = &p->uclamp_req[UCLAMP_MIN];
> +
> +	if (!uc_se->user_defined)
> +		uclamp_se_set(uc_se, sysctl_sched_rt_uclamp_util_min, false);
> +}
> +
>  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id)
>  {
>  	struct uclamp_se uc_eff;
> @@ -1116,12 +1141,13 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  				loff_t *ppos)
>  {
>  	bool update_root_tg = false;
> -	int old_min, old_max;
> +	int old_min, old_max, old_rt_min;
>  	int result;
>  
>  	mutex_lock(&uclamp_mutex);
>  	old_min = sysctl_sched_uclamp_util_min;
>  	old_max = sysctl_sched_uclamp_util_max;
> +	old_rt_min = sysctl_sched_rt_uclamp_util_min;
>  
>  	result = proc_dointvec(table, write, buffer, lenp, ppos);
>  	if (result)
> @@ -1129,12 +1155,23 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  	if (!write)
>  		goto done;
>  
> +	/*
> +	 * The new value will be applied to all RT tasks the next time they
> +	 * wakeup, assuming the task is using the system default and not a user
> +	 * specified value. In the latter we shall leave the value as the user
> +	 * requested.
> +	 */
>  	if (sysctl_sched_uclamp_util_min > sysctl_sched_uclamp_util_max ||
>  	    sysctl_sched_uclamp_util_max > SCHED_CAPACITY_SCALE) {
>  		result = -EINVAL;
>  		goto undo;
>  	}
>  
> +	if (sysctl_sched_rt_uclamp_util_min > SCHED_CAPACITY_SCALE) {
> +		result = -EINVAL;
> +		goto undo;
> +	}
> +
>  	if (old_min != sysctl_sched_uclamp_util_min) {
>  		uclamp_se_set(&uclamp_default[UCLAMP_MIN],
>  			      sysctl_sched_uclamp_util_min, false);
> @@ -1160,6 +1197,7 @@ int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
>  undo:
>  	sysctl_sched_uclamp_util_min = old_min;
>  	sysctl_sched_uclamp_util_max = old_max;
> +	sysctl_sched_rt_uclamp_util_min = old_rt_min;
>  done:
>  	mutex_unlock(&uclamp_mutex);
>  
> @@ -1202,9 +1240,12 @@ static void __setscheduler_uclamp(struct task_struct *p,
>  		if (uc_se->user_defined)
>  			continue;
>  
> -		/* By default, RT tasks always get 100% boost */
> +		/*
> +		 * By default, RT tasks always get 100% boost, which the admins
> +		 * are allowed change via sysctl_sched_rt_uclamp_util_min knob.
> +		 */
>  		if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> -			clamp_value = uclamp_none(UCLAMP_MAX);
> +			clamp_value = sysctl_sched_rt_uclamp_util_min;
>  
>  		uclamp_se_set(uc_se, clamp_value, false);
>  	}
> @@ -1236,9 +1277,12 @@ static void uclamp_fork(struct task_struct *p)
>  	for_each_clamp_id(clamp_id) {
>  		unsigned int clamp_value = uclamp_none(clamp_id);
>  
> -		/* By default, RT tasks always get 100% boost */
> +		/*
> +		 * By default, RT tasks always get 100% boost, which the admins
> +		 * are allowed change via sysctl_sched_rt_uclamp_util_min knob.
> +		 */
>  		if (unlikely(rt_task(p) && clamp_id == UCLAMP_MIN))
> -			clamp_value = uclamp_none(UCLAMP_MAX);
> +			clamp_value = sysctl_sched_rt_uclamp_util_min;
>  
>  		uclamp_se_set(&p->uclamp_req[clamp_id], clamp_value, false);
>  	}
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index e591d40fd645..19572dfc175b 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2147,6 +2147,12 @@ static void pull_rt_task(struct rq *this_rq)
>   */
>  static void task_woken_rt(struct rq *rq, struct task_struct *p)
>  {
> +	/*
> +	 * When sysctl_sched_rt_uclamp_util_min value is changed by the user,
> +	 * we apply any new value on the next wakeup, which is here.
> +	 */
> +	uclamp_rt_sync_default_util_min(p);

The task has already been enqueued and sugov has been called by then I
think, so this is a bit late. You could do that in uclamp_rq_inc() maybe?

> +
>  	if (!task_running(rq, p) &&
>  	    !test_tsk_need_resched(rq->curr) &&
>  	    p->nr_cpus_allowed > 1 &&
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 280a3c735935..337bf17b1a9d 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2300,6 +2300,8 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
>  #endif /* CONFIG_CPU_FREQ */
>  
>  #ifdef CONFIG_UCLAMP_TASK
> +void uclamp_rt_sync_default_util_min(struct task_struct *p);
> +
>  unsigned int uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
>  
>  static __always_inline
> @@ -2330,6 +2332,8 @@ static inline unsigned int uclamp_util(struct rq *rq, unsigned int util)
>  	return uclamp_util_with(rq, util, NULL);
>  }
>  #else /* CONFIG_UCLAMP_TASK */
> +void uclamp_rt_sync_default_util_min(struct task_struct *p) {}
> +
>  static inline unsigned int uclamp_util_with(struct rq *rq, unsigned int util,
>  					    struct task_struct *p)
>  {
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 70665934d53e..06183762daac 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -465,6 +465,13 @@ static struct ctl_table kern_table[] = {
>  		.mode		= 0644,
>  		.proc_handler	= sysctl_sched_uclamp_handler,
>  	},
> +	{
> +		.procname	= "sched_rt_util_clamp_min",
> +		.data		= &sysctl_sched_rt_uclamp_util_min,
> +		.maxlen		= sizeof(unsigned int),
> +		.mode		= 0644,
> +		.proc_handler	= sysctl_sched_uclamp_handler,
> +	},
>  #endif
>  #ifdef CONFIG_SCHED_AUTOGROUP
>  	{
> -- 
> 2.17.1
> 

Apart from the small things above, this seems like a sensible idea and
would indeed be useful, so thanks for the patch!

Quentin

  reply	other threads:[~2020-01-07 13:42 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-20 16:48 [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min Qais Yousef
2020-01-07 13:42 ` Quentin Perret [this message]
2020-01-07 19:30   ` Dietmar Eggemann
2020-01-08  9:51     ` Quentin Perret
2020-01-08 19:16       ` Patrick Bellasi
2020-01-09 11:36       ` Qais Yousef
2020-01-09 11:16     ` Qais Yousef
2020-01-09 11:12   ` Qais Yousef
2020-01-08 13:44 ` Peter Zijlstra
2020-01-08 19:08   ` Patrick Bellasi
2020-01-09 13:00   ` Qais Yousef
2020-01-10 13:39     ` Peter Zijlstra
2020-01-12 23:35       ` Qais Yousef
2020-01-10 13:42     ` Peter Zijlstra
2020-01-12 23:31       ` Qais Yousef
2020-01-08 18:56 ` Patrick Bellasi
2020-01-09  1:35   ` Valentin Schneider
2020-01-09  9:21     ` Patrick Bellasi
2020-01-09 13:38       ` Qais Yousef
2020-01-14 21:34       ` Qais Yousef
2020-01-22 10:19         ` Patrick Bellasi
2020-01-22 11:45           ` Qais Yousef
2020-01-22 12:44             ` Patrick Bellasi
2020-01-22 14:57               ` Qais Yousef
2020-01-09 13:15     ` Qais Yousef

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200107134234.GA158998@google.com \
    --to=qperret@google.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=keescook@chromium.org \
    --cc=kernel-team@android.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=patrick.bellasi@matbug.net \
    --cc=peterz@infradead.org \
    --cc=qais.yousef@arm.com \
    --cc=rostedt@goodmis.org \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.