linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/2] A couple of uclamp fixes
@ 2021-07-19 16:16 Quentin Perret
  2021-07-19 16:16 ` [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
  2021-07-19 16:16 ` [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS Quentin Perret
  0 siblings, 2 replies; 12+ messages in thread
From: Quentin Perret @ 2021-07-19 16:16 UTC (permalink / raw)
  To: mingo, peterz, vincent.guittot, dietmar.eggemann, qais.yousef,
	rickyiu, wvw, patrick.bellasi, xuewen.yan94
  Cc: linux-kernel, kernel-team

Hi all,

This is v4 of small series previously posted here:

  https://lore.kernel.org/lkml/20210623123441.592348-1-qperret@google.com/

I dropped the last patch of the series which requires a bit more
discussion, and kept the first two here as they're standalone fixes.

Thanks,
Quentin

Quentin Perret (2):
  sched: Fix UCLAMP_FLAG_IDLE setting
  sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS

 kernel/sched/core.c | 44 ++++++++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 12 deletions(-)

-- 
2.32.0.402.g57bb445576-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting
  2021-07-19 16:16 [PATCH v4 0/2] A couple of uclamp fixes Quentin Perret
@ 2021-07-19 16:16 ` Quentin Perret
  2021-07-21 10:07   ` Dietmar Eggemann
  2021-07-27 14:32   ` Qais Yousef
  2021-07-19 16:16 ` [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS Quentin Perret
  1 sibling, 2 replies; 12+ messages in thread
From: Quentin Perret @ 2021-07-19 16:16 UTC (permalink / raw)
  To: mingo, peterz, vincent.guittot, dietmar.eggemann, qais.yousef,
	rickyiu, wvw, patrick.bellasi, xuewen.yan94
  Cc: linux-kernel, kernel-team

The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
active task to maintain the last uclamp.max and prevent blocked util
from suddenly becoming visible.

However, there is an asymmetry in how the flag is set and cleared which
can lead to having the flag set whilst there are active tasks on the rq.
Specifically, the flag is cleared in the uclamp_rq_inc() path, which is
called at enqueue time, but set in uclamp_rq_dec_id() which is called
both when dequeueing a task _and_ in the update_uclamp_active() path. As
a result, when both uclamp_rq_{dec,ind}_id() are called from
update_uclamp_active(), the flag ends up being set but not cleared,
hence leaving the runqueue in a broken state.

Fix this by clearing the flag in update_uclamp_active() as well.

Fixes: e496187da710 ("sched/uclamp: Enforce last task's UCLAMP_MAX")
Reported-by: Rick Yiu <rickyiu@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 kernel/sched/core.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cf16f8fda9a6..e801d2c3077b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1619,6 +1619,23 @@ static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p)
 		uclamp_rq_dec_id(rq, p, clamp_id);
 }
 
+static inline void uclamp_rq_reinc_id(struct rq *rq, struct task_struct *p,
+				      enum uclamp_id clamp_id)
+{
+	if (!p->uclamp[clamp_id].active)
+		return;
+
+	uclamp_rq_dec_id(rq, p, clamp_id);
+	uclamp_rq_inc_id(rq, p, clamp_id);
+
+	/*
+	 * Make sure to clear the idle flag if we've transiently reached 0
+	 * active tasks on rq.
+	 */
+	if (clamp_id == UCLAMP_MAX && (rq->uclamp_flags & UCLAMP_FLAG_IDLE))
+		rq->uclamp_flags &= ~UCLAMP_FLAG_IDLE;
+}
+
 static inline void
 uclamp_update_active(struct task_struct *p)
 {
@@ -1642,12 +1659,8 @@ uclamp_update_active(struct task_struct *p)
 	 * affecting a valid clamp bucket, the next time it's enqueued,
 	 * it will already see the updated clamp bucket value.
 	 */
-	for_each_clamp_id(clamp_id) {
-		if (p->uclamp[clamp_id].active) {
-			uclamp_rq_dec_id(rq, p, clamp_id);
-			uclamp_rq_inc_id(rq, p, clamp_id);
-		}
-	}
+	for_each_clamp_id(clamp_id)
+		uclamp_rq_reinc_id(rq, p, clamp_id);
 
 	task_rq_unlock(rq, p, &rf);
 }
-- 
2.32.0.402.g57bb445576-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
  2021-07-19 16:16 [PATCH v4 0/2] A couple of uclamp fixes Quentin Perret
  2021-07-19 16:16 ` [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
@ 2021-07-19 16:16 ` Quentin Perret
  2021-07-22  8:47   ` Dietmar Eggemann
  1 sibling, 1 reply; 12+ messages in thread
From: Quentin Perret @ 2021-07-19 16:16 UTC (permalink / raw)
  To: mingo, peterz, vincent.guittot, dietmar.eggemann, qais.yousef,
	rickyiu, wvw, patrick.bellasi, xuewen.yan94
  Cc: linux-kernel, kernel-team

SCHED_FLAG_KEEP_PARAMS can be passed to sched_setattr to specify that
the call must not touch scheduling parameters (nice or priority). This
is particularly handy for uclamp when used in conjunction with
SCHED_FLAG_KEEP_POLICY as that allows to issue a syscall that only
impacts uclamp values.

However, sched_setattr always checks whether the priorities and nice
values passed in sched_attr are valid first, even if those never get
used down the line. This is useless at best since userspace can
trivially bypass this check to set the uclamp values by specifying low
priorities. However, it is cumbersome to do so as there is no single
expression of this that skips both RT and CFS checks at once. As such,
userspace needs to query the task policy first with e.g. sched_getattr
and then set sched_attr.sched_priority accordingly. This is racy and
slower than a single call.

As the priority and nice checks are useless when SCHED_FLAG_KEEP_PARAMS
is specified, simply inherit them in this case to match the policy
inheritance of SCHED_FLAG_KEEP_POLICY.

Reported-by: Wei Wang <wvw@google.com>
Reviewed-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Quentin Perret <qperret@google.com>
---
 kernel/sched/core.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e801d2c3077b..914076eab242 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7332,6 +7332,16 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
 	return -E2BIG;
 }
 
+static void get_params(struct task_struct *p, struct sched_attr *attr)
+{
+	if (task_has_dl_policy(p))
+		__getparam_dl(p, attr);
+	else if (task_has_rt_policy(p))
+		attr->sched_priority = p->rt_priority;
+	else
+		attr->sched_nice = task_nice(p);
+}
+
 /**
  * sys_sched_setscheduler - set/change the scheduler policy and RT priority
  * @pid: the pid in question.
@@ -7393,6 +7403,8 @@ SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
 	rcu_read_unlock();
 
 	if (likely(p)) {
+		if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
+			get_params(p, &attr);
 		retval = sched_setattr(p, &attr);
 		put_task_struct(p);
 	}
@@ -7541,12 +7553,7 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 	kattr.sched_policy = p->policy;
 	if (p->sched_reset_on_fork)
 		kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
-	if (task_has_dl_policy(p))
-		__getparam_dl(p, &kattr);
-	else if (task_has_rt_policy(p))
-		kattr.sched_priority = p->rt_priority;
-	else
-		kattr.sched_nice = task_nice(p);
+	get_params(p, &kattr);
 
 #ifdef CONFIG_UCLAMP_TASK
 	/*
-- 
2.32.0.402.g57bb445576-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting
  2021-07-19 16:16 ` [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
@ 2021-07-21 10:07   ` Dietmar Eggemann
  2021-07-21 13:09     ` Quentin Perret
  2021-07-27 14:32   ` Qais Yousef
  1 sibling, 1 reply; 12+ messages in thread
From: Dietmar Eggemann @ 2021-07-21 10:07 UTC (permalink / raw)
  To: Quentin Perret, mingo, peterz, vincent.guittot, qais.yousef,
	rickyiu, wvw, patrick.bellasi, xuewen.yan94
  Cc: linux-kernel, kernel-team

On 19/07/2021 18:16, Quentin Perret wrote:
> The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
> active task to maintain the last uclamp.max and prevent blocked util

s/active/runnable ?

> from suddenly becoming visible.
> 

[...]

IMHO, the main argument in v3 to do the clearing outside
uclamp_rq_inc_id() was a possible order change in `for_each_clamp_id()`.
So setting/clearing `rq->uclamp_flags` (UCLAMP_FLAG_IDLE) on UCLAMP_MAX
(currently the highest Uclamp constraint (UCLAMP_CNT-1)) could be
incorrect when UCLAMP_MIN and UCLAMP_MAX change place because the
same `rq->uclamp_flags` value is needed for both Uclamp constraint
values.

What about decoupling rq->uclamp_flags` handling from UCLAMP_MAX and
doing this for 'UCLAMP_CNT - 1', i.e. always on the highest Uclamp
constraint?

#define for_each_clamp_id(clamp_id) \
    for ((clamp_id) = 0; (clamp_id) < UCLAMP_CNT; (clamp_id)++)

In this case the code change can be as easy as in your original v3.

Setting UCLAMP_FLAG_IDLE in uclamp_idle_value():

  uclamp_rq_dec_id() -> uclamp_rq_max_value() -> *uclamp_idle_value()*

Resetting UCLAMP_FLAG_IDLE in uclamp_idle_reset():

  uclamp_rq_inc_id()                          -> *uclamp_idle_reset()*  

This would be more symmetrical then uclamp_idle_value() and
uclamp_rq_inc()/uclamp_rq_reinc_id().

--8<--

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0c22cd026440..600f68f3378c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1300,15 +1300,16 @@ static inline unsigned int
 uclamp_idle_value(struct rq *rq, enum uclamp_id clamp_id,
 		  unsigned int clamp_value)
 {
+	if (clamp_id == UCLAMP_CNT - 1)
+		rq->uclamp_flags |= UCLAMP_FLAG_IDLE;
+
 	/*
 	 * Avoid blocked utilization pushing up the frequency when we go
 	 * idle (which drops the max-clamp) by retaining the last known
 	 * max-clamp.
 	 */
-	if (clamp_id == UCLAMP_MAX) {
-		rq->uclamp_flags |= UCLAMP_FLAG_IDLE;
+	if (clamp_id == UCLAMP_MAX)
 		return clamp_value;
-	}
 
 	return uclamp_none(UCLAMP_MIN);
 }
@@ -1320,6 +1321,9 @@ static inline void uclamp_idle_reset(struct rq *rq, enum uclamp_id clamp_id,
 	if (!(rq->uclamp_flags & UCLAMP_FLAG_IDLE))
 		return;
 
+	if ((clamp_id == UCLAMP_CNT - 1) && (rq->uclamp_flags & UCLAMP_FLAG_IDLE))
+		rq->uclamp_flags &= ~UCLAMP_FLAG_IDLE;
+
 	WRITE_ONCE(rq->uclamp[clamp_id].value, clamp_value);
 }
 
@@ -1595,10 +1599,6 @@ static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p)
 
 	for_each_clamp_id(clamp_id)
 		uclamp_rq_inc_id(rq, p, clamp_id);
-
-	/* Reset clamp idle holding when there is one RUNNABLE task */
-	if (rq->uclamp_flags & UCLAMP_FLAG_IDLE)
-		rq->uclamp_flags &= ~UCLAMP_FLAG_IDLE;
 }

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting
  2021-07-21 10:07   ` Dietmar Eggemann
@ 2021-07-21 13:09     ` Quentin Perret
  2021-07-22  8:47       ` Dietmar Eggemann
  0 siblings, 1 reply; 12+ messages in thread
From: Quentin Perret @ 2021-07-21 13:09 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: mingo, peterz, vincent.guittot, qais.yousef, rickyiu, wvw,
	patrick.bellasi, xuewen.yan94, linux-kernel, kernel-team

Hi Dietmar,

On Wednesday 21 Jul 2021 at 12:07:04 (+0200), Dietmar Eggemann wrote:
> On 19/07/2021 18:16, Quentin Perret wrote:
> > The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
> > active task to maintain the last uclamp.max and prevent blocked util
> 
> s/active/runnable ?

'active' should still be correct here no? We enter uclamp_rq_max_value()
-> uclamp_idle_value() when the last _active_ uclamp_se is decremented,
and when all the buckets are empty, so I think that works?

> > from suddenly becoming visible.
> > 
> 
> [...]
> 
> IMHO, the main argument in v3 to do the clearing outside
> uclamp_rq_inc_id() was a possible order change in `for_each_clamp_id()`.
> So setting/clearing `rq->uclamp_flags` (UCLAMP_FLAG_IDLE) on UCLAMP_MAX
> (currently the highest Uclamp constraint (UCLAMP_CNT-1)) could be
> incorrect when UCLAMP_MIN and UCLAMP_MAX change place because the
> same `rq->uclamp_flags` value is needed for both Uclamp constraint
> values.
> 
> What about decoupling rq->uclamp_flags` handling from UCLAMP_MAX and
> doing this for 'UCLAMP_CNT - 1', i.e. always on the highest Uclamp
> constraint?
> 
> #define for_each_clamp_id(clamp_id) \
>     for ((clamp_id) = 0; (clamp_id) < UCLAMP_CNT; (clamp_id)++)
> 
> In this case the code change can be as easy as in your original v3.
> 
> Setting UCLAMP_FLAG_IDLE in uclamp_idle_value():
> 
>   uclamp_rq_dec_id() -> uclamp_rq_max_value() -> *uclamp_idle_value()*
> 
> Resetting UCLAMP_FLAG_IDLE in uclamp_idle_reset():
> 
>   uclamp_rq_inc_id()                          -> *uclamp_idle_reset()*  
> 
> This would be more symmetrical then uclamp_idle_value() and
> uclamp_rq_inc()/uclamp_rq_reinc_id().

Right, thanks for the suggestion but to be fair I feel like this is a
matter of personal preference at this point. I personally like the way
it is in this patch -- I find it easier to reason about, but maybe
that's because I wrote it ...

Do you feel strongly about it? If not I'd prefer to not re-spin this
another time if possible. Let me know what you think.

Cheers,
Quentin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting
  2021-07-21 13:09     ` Quentin Perret
@ 2021-07-22  8:47       ` Dietmar Eggemann
  0 siblings, 0 replies; 12+ messages in thread
From: Dietmar Eggemann @ 2021-07-22  8:47 UTC (permalink / raw)
  To: Quentin Perret
  Cc: mingo, peterz, vincent.guittot, qais.yousef, rickyiu, wvw,
	patrick.bellasi, xuewen.yan94, linux-kernel, kernel-team

On 21/07/2021 15:09, Quentin Perret wrote:
> Hi Dietmar,
> 
> On Wednesday 21 Jul 2021 at 12:07:04 (+0200), Dietmar Eggemann wrote:
>> On 19/07/2021 18:16, Quentin Perret wrote:
>>> The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
>>> active task to maintain the last uclamp.max and prevent blocked util
>>
>> s/active/runnable ?
> 
> 'active' should still be correct here no? We enter uclamp_rq_max_value()
> -> uclamp_idle_value() when the last _active_ uclamp_se is decremented,
> and when all the buckets are empty, so I think that works?

Ah, it this uclamp ative `p->uclamp[clamp_id].active` which is set with
`bucket->tasks` in uclamp_rq_[inc/dec]_id.

Maybe add: last (uclamp) active task, i.e. (bucket.tasks == 0 for all
bucket_id's) ... ?

>>> from suddenly becoming visible.
>>>
>>
>> [...]
>>
>> IMHO, the main argument in v3 to do the clearing outside
>> uclamp_rq_inc_id() was a possible order change in `for_each_clamp_id()`.
>> So setting/clearing `rq->uclamp_flags` (UCLAMP_FLAG_IDLE) on UCLAMP_MAX
>> (currently the highest Uclamp constraint (UCLAMP_CNT-1)) could be
>> incorrect when UCLAMP_MIN and UCLAMP_MAX change place because the
>> same `rq->uclamp_flags` value is needed for both Uclamp constraint
>> values.
>>
>> What about decoupling rq->uclamp_flags` handling from UCLAMP_MAX and
>> doing this for 'UCLAMP_CNT - 1', i.e. always on the highest Uclamp
>> constraint?
>>
>> #define for_each_clamp_id(clamp_id) \
>>     for ((clamp_id) = 0; (clamp_id) < UCLAMP_CNT; (clamp_id)++)
>>
>> In this case the code change can be as easy as in your original v3.
>>
>> Setting UCLAMP_FLAG_IDLE in uclamp_idle_value():
>>
>>   uclamp_rq_dec_id() -> uclamp_rq_max_value() -> *uclamp_idle_value()*
>>
>> Resetting UCLAMP_FLAG_IDLE in uclamp_idle_reset():
>>
>>   uclamp_rq_inc_id()                          -> *uclamp_idle_reset()*  
>>
>> This would be more symmetrical then uclamp_idle_value() and
>> uclamp_rq_inc()/uclamp_rq_reinc_id().
> 
> Right, thanks for the suggestion but to be fair I feel like this is a
> matter of personal preference at this point. I personally like the way
> it is in this patch -- I find it easier to reason about, but maybe
> that's because I wrote it ...
> 
> Do you feel strongly about it? If not I'd prefer to not re-spin this
> another time if possible. Let me know what you think.

No, not at all ;-) Just like it better since it would mean less code
changes and only one place to reset UCLAMP_FLAG_IDLE.

You can add a:

Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

to your version in case you want to keep it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
  2021-07-19 16:16 ` [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS Quentin Perret
@ 2021-07-22  8:47   ` Dietmar Eggemann
  2021-07-26 13:56     ` Quentin Perret
  0 siblings, 1 reply; 12+ messages in thread
From: Dietmar Eggemann @ 2021-07-22  8:47 UTC (permalink / raw)
  To: Quentin Perret, mingo, peterz, vincent.guittot, qais.yousef,
	rickyiu, wvw, patrick.bellasi, xuewen.yan94
  Cc: linux-kernel, kernel-team

On 19/07/2021 18:16, Quentin Perret wrote:
> SCHED_FLAG_KEEP_PARAMS can be passed to sched_setattr to specify that
> the call must not touch scheduling parameters (nice or priority). This

What about DL params (runtime, deadline, period)?

Uclamp is not for DL but we could (*) still set uclamp values on a DL
task. Obviously they would only be used when the task switches policy.

On tip/sched/core:

root@juno:~# chrt -d -P 1000000000 -T 100000000 -p 0 1671

root@juno:~# uclampset -m200 -M400 -p 1671

root@juno:~# cat /proc/1671/sched | grep uclamp
uclamp.min                                   :                  200
uclamp.max                                   :                  400
effective uclamp.min                         :                  200
effective uclamp.max                         :                  400

root@juno:~# chrt -o -p 0
pid 1702's current scheduling policy: SCHED_OTHER
pid 1702's current scheduling priority: 0

root@juno:~# cat /proc/1671/sched | grep uclamp
uclamp.min                                   :                  200
uclamp.max                                   :                  400
effective uclamp.min                         :                  200
effective uclamp.max                         :                  400


> is particularly handy for uclamp when used in conjunction with
> SCHED_FLAG_KEEP_POLICY as that allows to issue a syscall that only
> impacts uclamp values.
> 
> However, sched_setattr always checks whether the priorities and nice
> values passed in sched_attr are valid first, even if those never get

+ DL params (__checkparam_dl())

> used down the line. This is useless at best since userspace can
> trivially bypass this check to set the uclamp values by specifying low
> priorities. However, it is cumbersome to do so as there is no single
> expression of this that skips both RT and CFS checks at once. As such,
> userspace needs to query the task policy first with e.g. sched_getattr
> and then set sched_attr.sched_priority accordingly. This is racy and
> slower than a single call.
> 
> As the priority and nice checks are useless when SCHED_FLAG_KEEP_PARAMS
> is specified, simply inherit them in this case to match the policy
> inheritance of SCHED_FLAG_KEEP_POLICY.
> 
> Reported-by: Wei Wang <wvw@google.com>
> Reviewed-by: Qais Yousef <qais.yousef@arm.com>
> Signed-off-by: Quentin Perret <qperret@google.com>
> ---
>  kernel/sched/core.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index e801d2c3077b..914076eab242 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7332,6 +7332,16 @@ static int sched_copy_attr(struct sched_attr __user *uattr, struct sched_attr *a
>  	return -E2BIG;
>  }
>  
> +static void get_params(struct task_struct *p, struct sched_attr *attr)
> +{
> +	if (task_has_dl_policy(p))
> +		__getparam_dl(p, attr);

(*) This changes the behaviour when setting uclamp values on a DL task.

Before uclamp values could be set but now, because of

  void __getparam_dl(struct task_struct *p, struct sched_attr *attr)
    ..
    attr->sched_flags = dl_se->flags

SCHED_FLAG_UTIL_CLAMP gets overwritten and  __sched_setscheduler() bails in:

    if (unlikely(policy == p->policy)) {
      ...
      retval = 0;
      goto unlock;
    }
  change:

I.e. the:

      if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)
        goto change;

can't trigger anymore.


> +	else if (task_has_rt_policy(p))
> +		attr->sched_priority = p->rt_priority;
> +	else
> +		attr->sched_nice = task_nice(p);
> +}
> +
>  /**
>   * sys_sched_setscheduler - set/change the scheduler policy and RT priority
>   * @pid: the pid in question.
> @@ -7393,6 +7403,8 @@ SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
>  	rcu_read_unlock();
>  
>  	if (likely(p)) {
> +		if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
> +			get_params(p, &attr);

SCHED_FLAG_KEEP_PARAMS is handled here but SCHED_FLAG_KEEP_POLICY
outside (before) the `if (likely(p))`?

>  		retval = sched_setattr(p, &attr);
>  		put_task_struct(p);
>  	}

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
  2021-07-22  8:47   ` Dietmar Eggemann
@ 2021-07-26 13:56     ` Quentin Perret
  2021-07-27 10:16       ` Quentin Perret
  2021-07-29 17:31       ` Dietmar Eggemann
  0 siblings, 2 replies; 12+ messages in thread
From: Quentin Perret @ 2021-07-26 13:56 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: mingo, peterz, vincent.guittot, qais.yousef, rickyiu, wvw,
	patrick.bellasi, xuewen.yan94, linux-kernel, kernel-team

On Thursday 22 Jul 2021 at 10:47:33 (+0200), Dietmar Eggemann wrote:
> (*) This changes the behaviour when setting uclamp values on a DL task.
> 
> Before uclamp values could be set but now, because of
> 
>   void __getparam_dl(struct task_struct *p, struct sched_attr *attr)
>     ..
>     attr->sched_flags = dl_se->flags
> 
> SCHED_FLAG_UTIL_CLAMP gets overwritten and  __sched_setscheduler() bails in:
> 
>     if (unlikely(policy == p->policy)) {
>       ...
>       retval = 0;
>       goto unlock;
>     }
>   change:
> 
> I.e. the:
> 
>       if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)
>         goto change;
> 
> can't trigger anymore.

Bah, as you said it doesn't seem to be a big deal, but clearly that was
unintentional. Let me try and fix this.

> > +	else if (task_has_rt_policy(p))
> > +		attr->sched_priority = p->rt_priority;
> > +	else
> > +		attr->sched_nice = task_nice(p);
> > +}
> > +
> >  /**
> >   * sys_sched_setscheduler - set/change the scheduler policy and RT priority
> >   * @pid: the pid in question.
> > @@ -7393,6 +7403,8 @@ SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
> >  	rcu_read_unlock();
> >  
> >  	if (likely(p)) {
> > +		if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
> > +			get_params(p, &attr);
> 
> SCHED_FLAG_KEEP_PARAMS is handled here but SCHED_FLAG_KEEP_POLICY
> outside (before) the `if (likely(p))`?

Because I need to dereference p while SCHED_FLAG_KEEP_POLICY doesn't :)

> >  		retval = sched_setattr(p, &attr);
> >  		put_task_struct(p);
> >  	}
> 
> [...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
  2021-07-26 13:56     ` Quentin Perret
@ 2021-07-27 10:16       ` Quentin Perret
  2021-07-29 17:34         ` Dietmar Eggemann
  2021-07-29 17:31       ` Dietmar Eggemann
  1 sibling, 1 reply; 12+ messages in thread
From: Quentin Perret @ 2021-07-27 10:16 UTC (permalink / raw)
  To: Dietmar Eggemann
  Cc: mingo, peterz, vincent.guittot, qais.yousef, rickyiu, wvw,
	patrick.bellasi, xuewen.yan94, linux-kernel, kernel-team

On Monday 26 Jul 2021 at 14:56:10 (+0100), Quentin Perret wrote:
> On Thursday 22 Jul 2021 at 10:47:33 (+0200), Dietmar Eggemann wrote:
> > (*) This changes the behaviour when setting uclamp values on a DL task.
> > 
> > Before uclamp values could be set but now, because of
> > 
> >   void __getparam_dl(struct task_struct *p, struct sched_attr *attr)
> >     ..
> >     attr->sched_flags = dl_se->flags
> > 
> > SCHED_FLAG_UTIL_CLAMP gets overwritten and  __sched_setscheduler() bails in:
> > 
> >     if (unlikely(policy == p->policy)) {
> >       ...
> >       retval = 0;
> >       goto unlock;
> >     }
> >   change:
> > 
> > I.e. the:
> > 
> >       if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)
> >         goto change;
> > 
> > can't trigger anymore.
> 
> Bah, as you said it doesn't seem to be a big deal, but clearly that was
> unintentional. Let me try and fix this.

While looking at this I found existing bugs in the area. Fixes are here:

https://lore.kernel.org/lkml/20210727101103.2729607-1-qperret@google.com/

And with the above series applied this patch should behave correctly
now.

Thanks,
Quentin

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting
  2021-07-19 16:16 ` [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
  2021-07-21 10:07   ` Dietmar Eggemann
@ 2021-07-27 14:32   ` Qais Yousef
  1 sibling, 0 replies; 12+ messages in thread
From: Qais Yousef @ 2021-07-27 14:32 UTC (permalink / raw)
  To: Quentin Perret
  Cc: mingo, peterz, vincent.guittot, dietmar.eggemann, rickyiu, wvw,
	patrick.bellasi, xuewen.yan94, linux-kernel, kernel-team

On 07/19/21 17:16, Quentin Perret wrote:
> The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last
> active task to maintain the last uclamp.max and prevent blocked util
> from suddenly becoming visible.
> 
> However, there is an asymmetry in how the flag is set and cleared which
> can lead to having the flag set whilst there are active tasks on the rq.
> Specifically, the flag is cleared in the uclamp_rq_inc() path, which is
> called at enqueue time, but set in uclamp_rq_dec_id() which is called
> both when dequeueing a task _and_ in the update_uclamp_active() path. As
> a result, when both uclamp_rq_{dec,ind}_id() are called from
> update_uclamp_active(), the flag ends up being set but not cleared,
> hence leaving the runqueue in a broken state.
> 
> Fix this by clearing the flag in update_uclamp_active() as well.
> 
> Fixes: e496187da710 ("sched/uclamp: Enforce last task's UCLAMP_MAX")
> Reported-by: Rick Yiu <rickyiu@google.com>
> Signed-off-by: Quentin Perret <qperret@google.com>
> ---

I've put a note that handling of this flag needs to be improved for the future.
But for now and FWIW, this LGTM:

Reviewed-by: Qais Yousef <qais.yousef@arm.com>

Thanks!

--
Qais Yousef

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
  2021-07-26 13:56     ` Quentin Perret
  2021-07-27 10:16       ` Quentin Perret
@ 2021-07-29 17:31       ` Dietmar Eggemann
  1 sibling, 0 replies; 12+ messages in thread
From: Dietmar Eggemann @ 2021-07-29 17:31 UTC (permalink / raw)
  To: Quentin Perret
  Cc: mingo, peterz, vincent.guittot, qais.yousef, rickyiu, wvw,
	patrick.bellasi, xuewen.yan94, linux-kernel, kernel-team

On 26/07/2021 15:56, Quentin Perret wrote:
> On Thursday 22 Jul 2021 at 10:47:33 (+0200), Dietmar Eggemann wrote:

[...]

>>> @@ -7393,6 +7403,8 @@ SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
>>>  	rcu_read_unlock();
>>>  
>>>  	if (likely(p)) {
>>> +		if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
>>> +			get_params(p, &attr);
>>
>> SCHED_FLAG_KEEP_PARAMS is handled here but SCHED_FLAG_KEEP_POLICY
>> outside (before) the `if (likely(p))`?
> 
> Because I need to dereference p while SCHED_FLAG_KEEP_POLICY doesn't :)

Ah, true. Looked weird though.
But then the SCHED_FLAG_KEEP_POLICY condition can be placed closer to
the SCHED_FLAG_KEEP_PARAMS condition. We don't have to set
SETPARAM_POLICY if p == NULL.

>>>  		retval = sched_setattr(p, &attr);
>>>  		put_task_struct(p);
>>>  	}

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS
  2021-07-27 10:16       ` Quentin Perret
@ 2021-07-29 17:34         ` Dietmar Eggemann
  0 siblings, 0 replies; 12+ messages in thread
From: Dietmar Eggemann @ 2021-07-29 17:34 UTC (permalink / raw)
  To: Quentin Perret
  Cc: mingo, peterz, vincent.guittot, qais.yousef, rickyiu, wvw,
	patrick.bellasi, xuewen.yan94, linux-kernel, kernel-team

On 27/07/2021 12:16, Quentin Perret wrote:
> On Monday 26 Jul 2021 at 14:56:10 (+0100), Quentin Perret wrote:
>> On Thursday 22 Jul 2021 at 10:47:33 (+0200), Dietmar Eggemann wrote:
>>> (*) This changes the behaviour when setting uclamp values on a DL task.
>>>
>>> Before uclamp values could be set but now, because of
>>>
>>>   void __getparam_dl(struct task_struct *p, struct sched_attr *attr)
>>>     ..
>>>     attr->sched_flags = dl_se->flags
>>>
>>> SCHED_FLAG_UTIL_CLAMP gets overwritten and  __sched_setscheduler() bails in:
>>>
>>>     if (unlikely(policy == p->policy)) {
>>>       ...
>>>       retval = 0;
>>>       goto unlock;
>>>     }
>>>   change:
>>>
>>> I.e. the:
>>>
>>>       if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)
>>>         goto change;
>>>
>>> can't trigger anymore.
>>
>> Bah, as you said it doesn't seem to be a big deal, but clearly that was
>> unintentional. Let me try and fix this.
> 
> While looking at this I found existing bugs in the area. Fixes are here:
> 
> https://lore.kernel.org/lkml/20210727101103.2729607-1-qperret@google.com/
> 
> And with the above series applied this patch should behave correctly
> now.

It does. Like depicted in

https://lkml.kernel.org/r/e6d103f1-f8ee-cad9-c7c0-c9ea5d0f099a@arm.com

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-07-29 17:34 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-19 16:16 [PATCH v4 0/2] A couple of uclamp fixes Quentin Perret
2021-07-19 16:16 ` [PATCH v4 1/2] sched: Fix UCLAMP_FLAG_IDLE setting Quentin Perret
2021-07-21 10:07   ` Dietmar Eggemann
2021-07-21 13:09     ` Quentin Perret
2021-07-22  8:47       ` Dietmar Eggemann
2021-07-27 14:32   ` Qais Yousef
2021-07-19 16:16 ` [PATCH v4 2/2] sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS Quentin Perret
2021-07-22  8:47   ` Dietmar Eggemann
2021-07-26 13:56     ` Quentin Perret
2021-07-27 10:16       ` Quentin Perret
2021-07-29 17:34         ` Dietmar Eggemann
2021-07-29 17:31       ` Dietmar Eggemann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).