linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Qais Yousef <qais.yousef@arm.com>
To: Xuewen Yan <xuewen.yan94@gmail.com>
Cc: Quentin Perret <qperret@google.com>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Benjamin Segall <bsegall@google.com>,
	Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Chunyan Zhang <zhang.lyra@gmail.com>,
	Ryan Y <xuewyan@foxmail.com>,
	Patrick Bellasi <patrick.bellasi@matbug.net>,
	tj@kernel.org
Subject: Re: [PATCH] sched/uclamp: Avoid setting cpu.uclamp.min bigger than cpu.uclamp.max
Date: Mon, 7 Jun 2021 14:49:02 +0100	[thread overview]
Message-ID: <20210607134902.nlgvqtzj35rhjg7x@e107158-lin.cambridge.arm.com> (raw)
In-Reply-To: <CAB8ipk_a6VFNjiEnHRHkUMBKbA+qzPQvhtNjJ_YNzQhqV_o8Zw@mail.gmail.com>

On 06/05/21 21:24, Xuewen Yan wrote:
> Hi Qais
> 
> On Sat, Jun 5, 2021 at 7:49 PM Qais Yousef <qais.yousef@arm.com> wrote:
> > > >
> > > In addition,In your patch:
> > > 6938840392c89 ("sched/uclamp: Fix wrong implementation of cpu.uclamp.min")
> > > https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@arm.com
> > >
> > > + switch (clamp_id) {
> > > + case UCLAMP_MIN: {
> > > + struct uclamp_se uc_min = task_group(p)->uclamp[clamp_id];
> > > + if (uc_req.value < uc_min.value)
> > > + return uc_min;
> > > + break;
> > >
> > > When the clamp_id = UCLAMP_MIN, why not judge the uc_req.value is
> > > bigger than task_group(p)->uclamp[UCLAMP_MAX] ?
> >
> > Because of the requirement I pointed you to in cgroup-v2.rst. We must allow any
> > value to be requested.
> >
> > Ultimately if we had
> >
> >         cpu.uclamp.min = 80
> >         cpu.uclamp.max = 50
> >
> > then we want to remember the original request but make sure the effective value
> > is capped.
> >
> > For the user in the future modifies the values such that
> >
> >         cpu.uclamp.max = max
> >
> > Then we want to remember cpu.uclamp.min = 80 and apply it since now the
> > cpu.uclamp.max was relaxed to allow the boost value.
> >
> > > Because when the p->uclamp_req[UCLAMP_MIN] >  task_group(p)->uclamp[UCLAMP_MAX],
> > > the patch can not clamp the p->uclamp_req[UCLAMP_MIN/MAX] into
> > > [ task_group(p)->uclamp[UCLAMP_MAX],  task_group(p)->uclamp[UCLAMP_MAX] ].
> > >
> > > Is it necessary to fix it here?
> >
> > Nope. We must allow any combination values to be accepted and remember them so
> > if one changes we ensure the new effective value is updated accordingly.
> > This is how cgroups API works.
> 
> Sorry. I may not have expressed it clearly. In your patch (which has
> not yet merged into the mainline):
> 
> 6938840392c89 ("sched/uclamp: Fix wrong implementation of cpu.uclamp.min")
>  https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@arm.com
> 
> This patch will not affect p->uclamp_req, but consider the following situation:
> 
> tg->cpu.uclamp.min = 0
> tg->cpu.uclamp.max = 50%
> 
> p->uclamp_req[UCLAMP_MIN] = 60%
> p->uclamp_req[UCLAMP_MIN] = 80%
> 
> The function call process is as follows:
> uclamp_eff_value() -> uclamp_eff_get() ->uclamp_tg_restrict()
> 
> with your patch, the result is:
> 
> p->effective_uclamp_min = 60%
> p->effective_uclamp_max = 50%
> 
> It would not affect the uclamp_task_util(p), but affect the rq:
> when p enqueued:
> rq->uclamp[UCLAMP_MIN] = 60%
> rq->uclamp[UCLAMP_MIN] = 50%
> 
> futher more,  in uclamp_rq_util_with() {
> ...
> 
> min_util = READ_ONCE(rq->uclamp[UCLAMP_MIN].value); //60%
> max_util = READ_ONCE(rq->uclamp[UCLAMP_MAX].value);//50%
> ...
> if (unlikely(min_util >= max_util))
> return min_util;
> 
> return clamp(util, min_util, max_util);
> ...
> }
> as a result, it would return 60%.

Looking at this again now, I better understand what you were trying to say.
I got confused that you were still arguing about cgroup inverted
cpu.uclamp.min/max, but you're actually talking about something else.

It would be a lot easier to not cross talk threads and reply to my patch
directly with this remark.

Anyways, still well spotted!

What you're saying is we need something like the patch below to ensure that the
*task request* is within tg uclamp range, right? The worry is that the task's
uclamp_min is higher than the tg's uclamp_min, so we end up with the inversion
because of that which will not be corrected later.

Hmm I need to think a bit more about this..

Cheers

--
Qais Yousef

--->8---

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9e9a5be35cde..e867813b9d5e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1405,6 +1405,7 @@ uclamp_tg_restrict(struct task_struct *p, enum uclamp_id clamp_id)
 {
        struct uclamp_se uc_req = p->uclamp_req[clamp_id];
 #ifdef CONFIG_UCLAMP_TASK_GROUP
+       unsigned long uc_min, uc_max, val;

        /*
         * Tasks in autogroups or root task group will be
@@ -1415,23 +1416,10 @@ uclamp_tg_restrict(struct task_struct *p, enum uclamp_id clamp_id)
        if (task_group(p) == &root_task_group)
                return uc_req;

-       switch (clamp_id) {
-       case UCLAMP_MIN: {
-               struct uclamp_se uc_min = task_group(p)->uclamp[clamp_id];
-               if (uc_req.value < uc_min.value)
-                       return uc_min;
-               break;
-       }
-       case UCLAMP_MAX: {
-               struct uclamp_se uc_max = task_group(p)->uclamp[clamp_id];
-               if (uc_req.value > uc_max.value)
-                       return uc_max;
-               break;
-       }
-       default:
-               WARN_ON_ONCE(1);
-               break;
-       }
+       uc_min = task_group(p)->uclamp[UCLAMP_MIN].value;
+       uc_max = task_group(p)->uclamp[UCLAMP_MAX].value;
+       val = uc_req.value;
+       uc_req.value = clamp(val, uc_min, uc_max);
 #endif

        return uc_req;


  parent reply	other threads:[~2021-06-07 13:49 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-02 12:38 [PATCH] sched/uclamp: Avoid setting cpu.uclamp.min bigger than cpu.uclamp.max Xuewen Yan
2021-06-02 13:22 ` Quentin Perret
2021-06-03  2:24   ` Xuewen Yan
2021-06-04 16:08     ` Qais Yousef
2021-06-04 16:22       ` Dietmar Eggemann
2021-06-05  2:14         ` Xuewen Yan
2021-06-05  2:12       ` Xuewen Yan
2021-06-05 11:49         ` Qais Yousef
2021-06-05 13:24           ` Xuewen Yan
2021-06-05 14:14             ` Qais Yousef
2021-06-07 13:49             ` Qais Yousef [this message]
2021-06-08 11:45               ` Xuewen Yan
2021-06-08 14:25                 ` Qais Yousef
2021-06-08 15:01                   ` Xuewen Yan
2021-06-08 18:21                     ` Qais Yousef

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210607134902.nlgvqtzj35rhjg7x@e107158-lin.cambridge.arm.com \
    --to=qais.yousef@arm.com \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=patrick.bellasi@matbug.net \
    --cc=peterz@infradead.org \
    --cc=qperret@google.com \
    --cc=rostedt@goodmis.org \
    --cc=tj@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=xuewen.yan94@gmail.com \
    --cc=xuewyan@foxmail.com \
    --cc=zhang.lyra@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).