From: Quentin Perret <quentin.perret@arm.com>
To: Joel Fernandes <joelaf@google.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thara Gopinath <thara.gopinath@linaro.org>,
	Linux PM <linux-pm@vger.kernel.org>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Chris Redpath <chris.redpath@arm.com>,
	Patrick Bellasi <patrick.bellasi@arm.com>,
	Valentin Schneider <valentin.schneider@arm.com>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Todd Kjos <tkjos@google.com>, Juri Lelli <juri.lelli@redhat.com>,
	Steve Muckle <smuckle@google.com>,
	Eduardo Valentin <edubezval@gmail.com>
Subject: Re: [RFC PATCH v2 3/6] sched: Add over-utilization/tipping point indicator
Date: Fri, 20 Apr 2018 09:31:18 +0100
Message-ID: <20180420083118.GA14391@e108498-lin.cambridge.arm.com>
In-Reply-To: <CAJWu+oqGtfpwX9_s=Gwt5CKQNfRhztV48RjriGiBS7ZXG8g90w@mail.gmail.com>

On Friday 20 Apr 2018 at 01:14:35 (-0700), Joel Fernandes wrote:
> On Fri, Apr 20, 2018 at 1:13 AM, Joel Fernandes <joelaf@google.com> wrote:
> > On Wed, Apr 18, 2018 at 4:17 AM, Quentin Perret <quentin.perret@arm.com> wrote:
> >> On Friday 13 Apr 2018 at 16:56:39 (-0700), Joel Fernandes wrote:
> >>> Hi,
> >>>
> >>> On Fri, Apr 6, 2018 at 8:36 AM, Dietmar Eggemann
> >>> <dietmar.eggemann@arm.com> wrote:
> >>> > From: Thara Gopinath <thara.gopinath@linaro.org>
> >>> >
> >>> > Energy-aware scheduling should only operate when the system is not
> >>> > overutilized. There must be cpu time available to place tasks based on
> >>> > utilization in an energy-aware fashion, i.e. to pack tasks on
> >>> > energy-efficient cpus without harming the overall throughput.
> >>> >
> >>> > In case the system operates above this tipping point the tasks have to
> >>> > be placed based on task and cpu load in the classical way of spreading
> >>> > tasks across as many cpus as possible.
> >>> >
> >>> > The point in which a system switches from being not overutilized to
> >>> > being overutilized is called the tipping point.
> >>> >
> >>> > Such a tipping point indicator on a sched domain as the system
> >>> > boundary is introduced here. As soon as one cpu of a sched domain is
> >>> > overutilized the whole sched domain is declared overutilized as well.
> >>> > A cpu becomes overutilized when its utilization is higher than 80%
> >>> > (capacity_margin) of its capacity.
> >>> >
> >>> > The implementation takes advantage of the shared sched domain which is
> >>> > shared across all per-cpu views of a sched domain level. The new
> >>> > overutilized flag is placed in this shared sched domain.
> >>> >
> >>> > Load balancing is skipped in case the energy model is present and the
> >>> > sched domain is not overutilized because under this condition the
> >>> > predominantly load-per-capacity driven load-balancer should not
> >>> > interfere with the energy-aware wakeup placement based on utilization.
> >>> >
> >>> > In case the total utilization of a sched domain is greater than the
> >>> > total sched domain capacity the overutilized flag is set at the parent
> >>> > sched domain level to let other sched groups help getting rid of the
> >>> > overutilization of cpus.
> >>> >
> >>> > Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
> >>> > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
> >>> > ---
> >>> >  include/linux/sched/topology.h |  1 +
> >>> >  kernel/sched/fair.c            | 62 ++++++++++++++++++++++++++++++++++++++++--
> >>> >  kernel/sched/sched.h           |  1 +
> >>> >  kernel/sched/topology.c        | 12 +++-----
> >>> >  4 files changed, 65 insertions(+), 11 deletions(-)
> >>> >
> >>> > diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> >>> > index 26347741ba50..dd001c232646 100644
> >>> > --- a/include/linux/sched/topology.h
> >>> > +++ b/include/linux/sched/topology.h
> >>> > @@ -72,6 +72,7 @@ struct sched_domain_shared {
> >>> >         atomic_t        ref;
> >>> >         atomic_t        nr_busy_cpus;
> >>> >         int             has_idle_cores;
> >>> > +       int             overutilized;
> >>> >  };
> >>> >
> >>> >  struct sched_domain {
> >>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>> > index 0a76ad2ef022..6960e5ef3c14 100644
> >>> > --- a/kernel/sched/fair.c
> >>> > +++ b/kernel/sched/fair.c
> >>> > @@ -5345,6 +5345,28 @@ static inline void hrtick_update(struct rq *rq)
> >>> >  }
> >>> >  #endif
> >>> >
> >>> > +#ifdef CONFIG_SMP
> >>> > +static inline int cpu_overutilized(int cpu);
> >>> > +
> >>> > +static inline int sd_overutilized(struct sched_domain *sd)
> >>> > +{
> >>> > +       return READ_ONCE(sd->shared->overutilized);
> >>> > +}
> >>> > +
> >>> > +static inline void update_overutilized_status(struct rq *rq)
> >>> > +{
> >>> > +       struct sched_domain *sd;
> >>> > +
> >>> > +       rcu_read_lock();
> >>> > +       sd = rcu_dereference(rq->sd);
> >>> > +       if (sd && !sd_overutilized(sd) && cpu_overutilized(rq->cpu))
> >>> > +               WRITE_ONCE(sd->shared->overutilized, 1);
> >>> > +       rcu_read_unlock();
> >>> > +}
> >>> > +#else
> >>> > +static inline void update_overutilized_status(struct rq *rq) {}
> >>> > +#endif /* CONFIG_SMP */
> >>> > +
> >>> >  /*
> >>> >   * The enqueue_task method is called before nr_running is
> >>> >   * increased. Here we update the fair scheduling stats and
> >>> > @@ -5394,8 +5416,10 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >>> >                 update_cfs_group(se);
> >>> >         }
> >>> >
> >>> > -       if (!se)
> >>> > +       if (!se) {
> >>> >                 add_nr_running(rq, 1);
> >>> > +               update_overutilized_status(rq);
> >>> > +       }
> >>>
> >>> I'm wondering whether it makes sense to consider scenarios where
> >>> other classes cause CPUs in the domain to go above the tipping point.
> >>> In that case it also makes sense not to do EAS in that domain
> >>> because of the overutilization.
> >>>
> >>> I guess task_fits using cpu_util which is PELT only at the moment...
> >>> so may require some other method like aggregation of CFS PELT, with
> >>> RT-PELT and DL running bw or something.
> >>>
> >>
> >> So at the moment in cpu_overutilized() we compare cpu_util() to
> >> capacity_of() which should include RT and IRQ pressure IIRC. But
> >> you're right, we might be able to do more here... Perhaps we
> >> could also use cpu_util_dl() which is available in sched.h now ?
> >
> > Yes, should be Ok, and then when RT utilization stuff is available,
> > then that can be included in the equation as well (probably for now
> > you could use rt_avg).
> >
> > Another crazy idea is to check the contribution of higher classes in
> > one-shot with (capacity_orig_of - capacity_of) although I think that
> > method would be less instantaneous/accurate.
> 
> Just to add to the last point, the capacity_of also factors in the IRQ
> contribution if I remember correctly, which is probably a good thing?
> 

I think so too, yes. But actually, since we compare cpu_util() to
capacity_of() in cpu_overutilized(), the current implementation should
already be fairly similar to the "capacity_orig_of - capacity_of"
approach you're suggesting, I guess.
And I agree that when Vincent's RT PELT patches get merged we should
probably use them :-)

Thanks !
Quentin
