From: Patrick Bellasi <patrick.bellasi@arm.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-pm@vger.kernel.org, linux-api@vger.kernel.org,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>, Tejun Heo <tj@kernel.org>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Paul Turner <pjt@google.com>,
	Quentin Perret <quentin.perret@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Juri Lelli <juri.lelli@redhat.com>, Todd Kjos <tkjos@google.com>,
	Joel Fernandes <joelaf@google.com>,
	Steve Muckle <smuckle@google.com>
Subject: Re: [PATCH v8 12/16] sched/core: uclamp: Extend CPU's cgroup controller
Date: Tue, 7 May 2019 12:42:32 +0100
Message-ID: <20190507114232.npsvba4itex5qpvl@e110439-lin>
In-Reply-To: <CAJuCfpFFSgRUFb9pyckpXWxr-z+mrrhcsLjZiVN5fZMvYC5XxQ@mail.gmail.com>

On 17-Apr 17:12, Suren Baghdasaryan wrote:
> On Tue, Apr 2, 2019 at 3:43 AM Patrick Bellasi <patrick.bellasi@arm.com> wrote:
> >
> > The cgroup CPU bandwidth controller allows assigning a specified
> > (maximum) bandwidth to the tasks of a group. However, this bandwidth is
> > defined and enforced only on a temporal basis, without considering the
> > actual frequency a CPU is running on. Thus, the amount of computation
> > completed by a task within an allocated bandwidth can be very different
> > depending on the actual frequency the CPU is running that task.
> > The amount of computation can be affected also by the specific CPU a
> > task is running on, especially when running on asymmetric capacity
> > systems like Arm's big.LITTLE.
> >
> > With the availability of schedutil, the scheduler is now able
> > to drive frequency selections based on actual task utilization.
> > Moreover, the utilization clamping support provides a mechanism to
> > bias the frequency selection operated by schedutil depending on
> > constraints assigned to the tasks currently RUNNABLE on a CPU.
> >
> > Given the mechanisms described above, it is now possible to extend the
> > cpu controller to specify the minimum (or maximum) utilization which
> > should be considered for tasks RUNNABLE on a cpu.
> > This makes it possible to better define the actual computational
> > power assigned to task groups, thus improving the cgroup CPU bandwidth
> > controller which is currently based just on time constraints.
> >
> > Extend the CPU controller with a couple of new attributes util.{min,max}
> > which allow enforcing utilization boosting and capping for all the
> > tasks in a group. Specifically:
> >
> > - util.min: defines the minimum utilization which should be considered
> >             i.e. the RUNNABLE tasks of this group will run at least at a
> >                  minimum frequency which corresponds to the util.min
> >                  utilization
> >
> > - util.max: defines the maximum utilization which should be considered
> >             i.e. the RUNNABLE tasks of this group will run up to a
> >                  maximum frequency which corresponds to the util.max
> >                  utilization
> >
> > These attributes:
> >
> > a) are available only for non-root nodes, both on default and legacy
> >    hierarchies, while system wide clamps are defined by a generic
> >    interface which does not depend on cgroups. This system wide
> >    interface enforces constraints on tasks in the root node.
> >
> > b) enforce effective constraints at each level of the hierarchy which
> >    are a restriction of the group requests considering its parent's
> >    effective constraints. Root group effective constraints are defined
> >    by the system wide interface.
> >    This mechanism allows each (non-root) level of the hierarchy to:
> >    - request whatever clamp values it would like to get
> >    - effectively get only up to the maximum amount allowed by its parent
> >
> > c) have higher priority than task-specific clamps, defined via
> >    sched_setattr(), thus allowing task requests to be controlled and restricted
> >
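
To make point b) concrete, here is a minimal userspace sketch of the
restriction rule. It is not code from this series: struct clamp_pair,
min_uint(), effective_clamps() and UTIL_SCALE are hypothetical names used
only for illustration. It assumes that both the min and the max request are
capped by the parent's effective value, as point b) and the Kconfig help
text describe.

```c
#include <assert.h>

#define UTIL_SCALE 1024U /* clamp values live in [0, 1024] */

/* Hypothetical stand-in for a group's clamp pair; not the kernel's struct. */
struct clamp_pair {
	unsigned int min;
	unsigned int max;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * A group may request any value in range, but it effectively gets at most
 * what its parent's effective value allows.
 */
static struct clamp_pair effective_clamps(struct clamp_pair requested,
					  struct clamp_pair parent_effective)
{
	struct clamp_pair eff;

	/* Requests outside [0, 1024] are a caller bug in this sketch. */
	assert(requested.min <= UTIL_SCALE && requested.max <= UTIL_SCALE);

	eff.min = min_uint(requested.min, parent_effective.min);
	eff.max = min_uint(requested.max, parent_effective.max);
	return eff;
}
```

Under these assumptions, a group requesting {512, 800} below a parent whose
effective clamps are {200, 600} would end up with {200, 600}, while the same
request below an unrestricted parent {1024, 1024} would be granted as is.
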
> > Add two new attributes to the cpu controller to collect "requested"
> > clamp values. Allow that at each non-root level of the hierarchy.
> > Validate local consistency by enforcing util.min < util.max.
> > Keep it simple by not caring, for now, about "effective" values computation
> > and propagation along the hierarchy.
> >
> > Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Tejun Heo <tj@kernel.org>
> >
> > --
> > Changes in v8:
> >  Message-ID: <20190214154817.GN50184@devbig004.ftw2.facebook.com>
> >  - update changelog description for points b), c) and following paragraph
> > ---
> >  Documentation/admin-guide/cgroup-v2.rst |  27 +++++
> >  init/Kconfig                            |  22 ++++
> >  kernel/sched/core.c                     | 142 +++++++++++++++++++++++-
> >  kernel/sched/sched.h                    |   6 +
> >  4 files changed, 196 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > index 7bf3f129c68b..47710a77f4fa 100644
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -909,6 +909,12 @@ controller implements weight and absolute bandwidth limit models for
> >  normal scheduling policy and absolute bandwidth allocation model for
> >  realtime scheduling policy.
> >
> > +Cycles distribution is based, by default, on a temporal base and it
> > +does not account for the frequency at which tasks are executed.
> > +The (optional) utilization clamping support allows to enforce a minimum
> > +bandwidth, which should always be provided by a CPU, and a maximum bandwidth,
> > +which should never be exceeded by a CPU.
> > +
> >  WARNING: cgroup2 doesn't yet support control of realtime processes and
> >  the cpu controller can only be enabled when all RT processes are in
> >  the root cgroup.  Be aware that system management software may already
> > @@ -974,6 +980,27 @@ All time durations are in microseconds.
> >         Shows pressure stall information for CPU. See
> >         Documentation/accounting/psi.txt for details.
> >
> > +  cpu.util.min
> > +        A read-write single value file which exists on non-root cgroups.
> > +        The default is "0", i.e. no utilization boosting.
> > +
> > +        The requested minimum utilization in the range [0, 1024].
> > +
> > +        This interface allows reading and setting minimum utilization clamp
> > +        values similar to the sched_setattr(2). This minimum utilization
> > +        value is used to clamp the task specific minimum utilization clamp.
> > +
> > +  cpu.util.max
> > +        A read-write single value file which exists on non-root cgroups.
> > +        The default is "1024". i.e. no utilization capping
> > +
> > +        The requested maximum utilization in the range [0, 1024].
> > +
> > +        This interface allows reading and setting maximum utilization clamp
> > +        values similar to the sched_setattr(2). This maximum utilization
> > +        value is used to clamp the task specific maximum utilization clamp.
> > +
> > +
> >
> >  Memory
> >  ------
> > diff --git a/init/Kconfig b/init/Kconfig
> > index 7439cbf4d02e..33006e8de996 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -877,6 +877,28 @@ config RT_GROUP_SCHED
> >
> >  endif #CGROUP_SCHED
> >
> > +config UCLAMP_TASK_GROUP
> > +       bool "Utilization clamping per group of tasks"
> > +       depends on CGROUP_SCHED
> > +       depends on UCLAMP_TASK
> > +       default n
> > +       help
> > +         This feature enables the scheduler to track the clamped utilization
> > +         of each CPU based on RUNNABLE tasks currently scheduled on that CPU.
> > +
> > +         When this option is enabled, the user can specify a min and max
> > +         CPU bandwidth which is allowed for each single task in a group.
> > +         The max bandwidth allows to clamp the maximum frequency a task
> > +         can use, while the min bandwidth allows to define a minimum
> > +         frequency a task will always use.
> > +
> > +         When task group based utilization clamping is enabled, an eventually
> > +         specified task-specific clamp value is constrained by the cgroup
> > +         specified clamp value. Both minimum and maximum task clamping cannot
> > +         be bigger than the corresponding clamping defined at task group level.
> > +
> > +         If in doubt, say N.
> > +
> >  config CGROUP_PIDS
> >         bool "PIDs controller"
> >         help
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 71c9dd6487b1..aeed2dd315cc 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1130,8 +1130,12 @@ static void __init init_uclamp(void)
> >         /* System defaults allow max clamp values for both indexes */
> >         uc_max.value = uclamp_none(UCLAMP_MAX);
> >         uc_max.bucket_id = uclamp_bucket_id(uc_max.value);
> > -       for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id)
> > +       for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id) {
> >                 uclamp_default[clamp_id] = uc_max;
> > +#ifdef CONFIG_UCLAMP_TASK_GROUP
> > +               root_task_group.uclamp_req[clamp_id] = uc_max;
> > +#endif
> > +       }
> >  }
> >
> >  #else /* CONFIG_UCLAMP_TASK */
> > @@ -6720,6 +6724,19 @@ void ia64_set_curr_task(int cpu, struct task_struct *p)
> >  /* task_group_lock serializes the addition/removal of task groups */
> >  static DEFINE_SPINLOCK(task_group_lock);
> >
> > +static inline int alloc_uclamp_sched_group(struct task_group *tg,
> > +                                          struct task_group *parent)
> > +{
> > +#ifdef CONFIG_UCLAMP_TASK_GROUP
> > +       int clamp_id;
> > +
> > +       for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id)
> > +               tg->uclamp_req[clamp_id] = parent->uclamp_req[clamp_id];
> > +#endif
> > +
> > +       return 1;
> 
> Looks like you never return anything else, either here or in the
> following patches, I think...

That's right, I just preferred to keep the same structure in the
callsite below...

> > +}
> > +
> >  static void sched_free_group(struct task_group *tg)
> >  {
> >         free_fair_sched_group(tg);
> > @@ -6743,6 +6760,9 @@ struct task_group *sched_create_group(struct task_group *parent)
> >         if (!alloc_rt_sched_group(tg, parent))
> >                 goto err;
> >
> > +       if (!alloc_uclamp_sched_group(tg, parent))
> > +               goto err;
> > +

            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

... under the assumption that the compiler is smart enough to optimize that
away.

But perhaps it's less confusing to just use void; I will update this in v9.
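
For illustration only, the void variant could look roughly like the sketch
below. It is a userspace approximation, not the v9 code: struct uclamp_se
and struct task_group are reduced to the single field used here, UCLAMP_CNT
is assumed to be 2, the CONFIG_UCLAMP_TASK_GROUP guard is dropped, and the
assert() exists only to make the standalone build self-checking.

```c
#include <assert.h>

#define UCLAMP_CNT 2 /* assumed: one min and one max clamp index */

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct uclamp_se {
	unsigned int value;
};

struct task_group {
	struct uclamp_se uclamp_req[UCLAMP_CNT];
};

/*
 * Copy the parent's requested clamps into the new group. Nothing here can
 * fail, so there is no status to return.
 */
static void alloc_uclamp_sched_group(struct task_group *tg,
				     const struct task_group *parent)
{
	int clamp_id;

	assert(tg && parent);

	for (clamp_id = 0; clamp_id < UCLAMP_CNT; ++clamp_id)
		tg->uclamp_req[clamp_id] = parent->uclamp_req[clamp_id];
}
```

With this shape, the call site in sched_create_group() would become a plain
alloc_uclamp_sched_group(tg, parent); call with no goto err branch.
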

> >         return tg;
> >
> >  err:
-- 
#include <best/regards.h>

Patrick Bellasi
