From: Patrick Bellasi <patrick.bellasi@matbug.net>
To: Qais Yousef <qais.yousef@arm.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Luis Chamberlain <mcgrof@kernel.org>,
Kees Cook <keescook@chromium.org>,
Iurii Zaikin <yzaikin@google.com>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
qperret@google.com, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min
Date: Wed, 22 Jan 2020 11:19:44 +0100 [thread overview]
Message-ID: <20200122101944.GA12533@darkstar> (raw)
In-Reply-To: <20200114213403.cur6gydan6kmplqb@e107158-lin.cambridge.arm.com>
On 14-Jan 21:34, Qais Yousef wrote:
> On 01/09/20 10:21, Patrick Bellasi wrote:
> > That's not entirely true. In that patch we introduce cgroup support
> > but, if you look at the code before that patch, for CFS tasks there is
> > only:
> > - CFS task-specific values (min,max)=(0,1024) by default
> > - CFS system-wide tunables (min,max)=(1024,1024) by default
> > and a change on the system-wide tunable allows for example to enforce
> > a uclamp_max=200 on all tasks.
> >
> > A similar solution can be implemented for RT tasks, where we have:
> > - RT task-specific values (min,max)=(1024,1024) by default
> > - RT system-wide tunables (min,max)=(1024,1024) by default
> > and a change on the system-wide tunable allows for example to enforce
> > a uclamp_min=200 on all tasks.
>
> I feel I'm already getting lost in the complexity of the interaction here. Do
> we really need to go that path?
>
> So we will end up with a default system wide for all tasks + a CFS system wide
> default + an RT system wide default?
>
> As I understand it, we have a single system wide default now.
Right now we have one system wide default and that's both for all
CFS/RT tasks, when cgroups are not in use, or for root group and
autogroup CFS/RT tasks, when cgroups are in use.
What I'm proposing is to limit the usage of the current system wide
default to CFS tasks only, while we add a similar new one specifically
for RT tasks.
At the end we will have two system wide defaults, not three.
> > > (Would we need CONFIG_RT_GROUP_SCHED for this? IIRC there's a few pain points
> > > when turning it on, but I think we don't have to if we just want things like
> > > uclamp value propagation?)
> >
> > No, the current design for CFS tasks works also on !CONFIG_CFS_GROUP_SCHED.
> > That's because in this case:
> > - uclamp_tg_restrict() returns just the task requested value
> > - uclamp_eff_get() _always_ restricts the requested value considering
> > the system defaults
> >
> > > It's quite more work than the simple thing Qais is introducing (and on both
> > > user and kernel side).
> >
> > But if in the future we will want to extend CGroups support to RT then
> > we will feel the pains because we do the effective computation in two
> > different places.
>
> Hmm what you're suggesting here is that we want to have
> cpu.rt.uclamp.{min,max}? I'm not sure I can agree this is a good idea.
That's exactly what we already do for other controllers. For example,
if you look at the bandwidth controller, we have separate knobs for
CFS and RT tasks.
> It makes more sense to create a special group for all rt tasks rather than
> treat rt tasks in a cgroup differently.
Don't see why that should make more sense. This can work of course but
it would enforce a more strict configuration and usage of cgroups to
userspace.
I also have some doubths about this approach matching the delegation
model principles.
> > Do note that a proper CGroup support requires that the system default
> > values defines the values for the root group and are consistently
> > propagated down the hierarchy. Thus we need to add a dedicated pair of
> > cgroup attributes, e.g. cpu.util.rt.{min.max}.
> >
> > To recap, we don't need CGROUP support right now but just to add a new
> > default tracking similar to what we do for CFS.
> >
> > We already proposed such a support in one of the initial versions of
> > the uclamp series:
> > Message-ID: <20190115101513.2822-10-patrick.bellasi@arm.com>
> > https://lore.kernel.org/lkml/20190115101513.2822-10-patrick.bellasi@arm.com/
>
> IIUC what you're suggesting is:
>
> 1. Use the sysctl to specify the default_rt_uclamp_min
> 2. Enforce this value in uclamp_eff_get() rather than my sync logic
> 3. Remove the current hack to always set
> rt_task->uclamp_min = uclamp_none(UCLAMP_MAX)
Right, that's the essence...
> If I got it correctly I'd be happy to experiment with it if this is what
> you're suggesting. Otherwise I'm afraid I'm failing to see the crust of the
> problem you're trying to highlight.
... from what your write above I think you got it right.
In my previous posting:
Message-ID: <20200109092137.GA2811@darkstar>
https://lore.kernel.org/lkml/20200109092137.GA2811@darkstar/
there is also the code snippet which should be good enough to extend
uclamp_eff_get(). Apart from that, what remains is:
- to add the two new sysfs knobs for sysctl_sched_uclamp_util_{min,max}_rt
- make a call about how rt tasks in cgroups are clamped, a simple
proposal is in the second snippet of my message above.
Best,
Patrick
--
#include <best/regards.h>
Patrick Bellasi
next prev parent reply other threads:[~2020-01-22 10:20 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-20 16:48 [PATCH] sched/rt: Add a new sysctl to control uclamp_util_min Qais Yousef
2020-01-07 13:42 ` Quentin Perret
2020-01-07 19:30 ` Dietmar Eggemann
2020-01-08 9:51 ` Quentin Perret
2020-01-08 19:16 ` Patrick Bellasi
2020-01-09 11:36 ` Qais Yousef
2020-01-09 11:16 ` Qais Yousef
2020-01-09 11:12 ` Qais Yousef
2020-01-08 13:44 ` Peter Zijlstra
2020-01-08 19:08 ` Patrick Bellasi
2020-01-09 13:00 ` Qais Yousef
2020-01-10 13:39 ` Peter Zijlstra
2020-01-12 23:35 ` Qais Yousef
2020-01-10 13:42 ` Peter Zijlstra
2020-01-12 23:31 ` Qais Yousef
2020-01-08 18:56 ` Patrick Bellasi
2020-01-09 1:35 ` Valentin Schneider
2020-01-09 9:21 ` Patrick Bellasi
2020-01-09 13:38 ` Qais Yousef
2020-01-14 21:34 ` Qais Yousef
2020-01-22 10:19 ` Patrick Bellasi [this message]
2020-01-22 11:45 ` Qais Yousef
2020-01-22 12:44 ` Patrick Bellasi
2020-01-22 14:57 ` Qais Yousef
2020-01-09 13:15 ` Qais Yousef
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200122101944.GA12533@darkstar \
--to=patrick.bellasi@matbug.net \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=juri.lelli@redhat.com \
--cc=keescook@chromium.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mcgrof@kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=qais.yousef@arm.com \
--cc=qperret@google.com \
--cc=rostedt@goodmis.org \
--cc=valentin.schneider@arm.com \
--cc=vincent.guittot@linaro.org \
--cc=yzaikin@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).