From: Vincent Guittot <vincent.guittot@linaro.org>
To: Valentin Schneider <valentin.schneider@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Phil Auld <pauld@redhat.com>, Parth Shah <parth@linux.ibm.com>,
	Hillf Danton <hdanton@sina.com>
Subject: Re: [PATCH v2 2/5] sched/numa: Replace runnable_load_avg by load_avg
Date: Tue, 18 Feb 2020 16:33:14 +0100
Message-ID: <CAKfTPtB04MdaU+C4uSS=qU8O69UFjNsOJw5Ck17WBhxVNHETmg@mail.gmail.com>
In-Reply-To: <b67ae78b-17ba-8f3f-9052-fecefb848e3d@arm.com>

On Tue, 18 Feb 2020 at 15:54, Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
> On 2/14/20 3:27 PM, Vincent Guittot wrote:
> > @@ -1473,38 +1473,35 @@ bool should_numa_migrate_memory(struct task_struct *p, struct page * page,
> >              group_faults_cpu(ng, src_nid) * group_faults(p, dst_nid) * 4;
> >  }
> >
> > -static inline unsigned long cfs_rq_runnable_load_avg(struct cfs_rq *cfs_rq);
> > -
> > -static unsigned long cpu_runnable_load(struct rq *rq)
> > -{
> > -     return cfs_rq_runnable_load_avg(&rq->cfs);
> > -}
> > +/*
> > + * 'numa_type' describes the node at the moment of load balancing.
> > + */
> > +enum numa_type {
> > +     /* The node has spare capacity that can be used to run more tasks.  */
> > +     node_has_spare = 0,
> > +     /*
> > +      * The node is fully used and the tasks don't compete for more CPU
> > +      * cycles. Nevertheless, some tasks might wait before running.
> > +      */
> > +     node_fully_busy,
> > +     /*
> > +      * The node is overloaded and can't provide expected CPU cycles to all
> > +      * tasks.
> > +      */
> > +     node_overloaded
> > +};
>
> Could we reuse group_type instead? The definitions are the same modulo
> s/group/node/.

Also, imbalanced might have a meaning at NUMA level, but misfit and
asym have none there.

For now I prefer to keep them separate to ease code readability. Also,
more changes will come on top of this for NUMA balancing, which could
also end up with NUMA-dedicated states.
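
For reference, this is roughly what the load balance side has after the
group_classify() rework (quoting fair.c from memory, so the exact
comments and ordering may differ); only the spare/fully_busy/overloaded
states map to something meaningful for a node:

enum group_type {
	/* The group has spare capacity to run more tasks. */
	group_has_spare = 0,
	/* Fully used, but tasks don't compete for more CPU cycles. */
	group_fully_busy,
	/* SD_ASYM_CPUCAPACITY only: a task doesn't fit its CPU. */
	group_misfit_task,
	/* SD_ASYM_PACKING only. */
	group_asym_packing,
	/* Tasks pinned by affinity, see LBF_SOME_PINNED. */
	group_imbalanced,
	/* Can't provide the expected CPU cycles to all tasks. */
	group_overloaded
};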

>
> >
> >  /* Cached statistics for all CPUs within a node */
> >  struct numa_stats {
> >       unsigned long load;
> > -
> > +     unsigned long util;
> >       /* Total compute capacity of CPUs on a node */
> >       unsigned long compute_capacity;
> > +     unsigned int nr_running;
> > +     unsigned int weight;
> > +     enum numa_type node_type;
> >  };
> >
> > -/*
> > - * XXX borrowed from update_sg_lb_stats
> > - */
> > -static void update_numa_stats(struct numa_stats *ns, int nid)
> > -{
> > -     int cpu;
> > -
> > -     memset(ns, 0, sizeof(*ns));
> > -     for_each_cpu(cpu, cpumask_of_node(nid)) {
> > -             struct rq *rq = cpu_rq(cpu);
> > -
> > -             ns->load += cpu_runnable_load(rq);
> > -             ns->compute_capacity += capacity_of(cpu);
> > -     }
> > -
> > -}
> > -
> >  struct task_numa_env {
> >       struct task_struct *p;
> >
> > @@ -1521,6 +1518,47 @@ struct task_numa_env {
> >       int best_cpu;
> >  };
> >
> > +static unsigned long cpu_load(struct rq *rq);
> > +static unsigned long cpu_util(int cpu);
> > +
> > +static inline enum
> > +numa_type numa_classify(unsigned int imbalance_pct,
> > +                      struct numa_stats *ns)
> > +{
> > +     if ((ns->nr_running > ns->weight) &&
> > +         ((ns->compute_capacity * 100) < (ns->util * imbalance_pct)))
> > +             return node_overloaded;
> > +
> > +     if ((ns->nr_running < ns->weight) ||
> > +         ((ns->compute_capacity * 100) > (ns->util * imbalance_pct)))
> > +             return node_has_spare;
> > +
> > +     return node_fully_busy;
> > +}
> > +
>
> As Mel pointed out, this is group_is_overloaded() and group_has_capacity().
> @Mel, you mentioned having a common helper, do you have that laying around?
> I haven't seen it in your reconciliation series.
>
> What I'm naively thinking here is that we could have either move the whole
> thing to just sg_lb_stats (AFAICT the fields of numa_stats are a subset of it),
> or if we really care about the stack we could tweak the ordering to ensure
> we can cast one into the other (not too enticed by that one though).
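
To make that concrete, a minimal, untested sketch of what a shared
helper pair could look like (names are made up for illustration; they
only take fields that both numa_stats and sg_lb_stats can provide):

static inline bool lb_is_overloaded(unsigned int imbalance_pct,
				    unsigned int nr_running,
				    unsigned int weight,
				    unsigned long capacity,
				    unsigned long util)
{
	/* More runnable tasks than CPUs and not enough capacity for them. */
	return (nr_running > weight) &&
	       ((capacity * 100) < (util * imbalance_pct));
}

static inline bool lb_has_spare(unsigned int imbalance_pct,
				unsigned int nr_running,
				unsigned int weight,
				unsigned long capacity,
				unsigned long util)
{
	/* Fewer tasks than CPUs, or headroom beyond the imbalance_pct margin. */
	return (nr_running < weight) ||
	       ((capacity * 100) > (util * imbalance_pct));
}

numa_classify() below would then reduce to the two calls plus the
fully busy fallthrough, and group_classify() could share the same pair.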
>
> > +/*
> > + * XXX borrowed from update_sg_lb_stats
> > + */
> > +static void update_numa_stats(struct task_numa_env *env,
> > +                           struct numa_stats *ns, int nid)
> > +{
> > +     int cpu;
> > +
> > +     memset(ns, 0, sizeof(*ns));
> > +     for_each_cpu(cpu, cpumask_of_node(nid)) {
> > +             struct rq *rq = cpu_rq(cpu);
> > +
> > +             ns->load += cpu_load(rq);
> > +             ns->util += cpu_util(cpu);
> > +             ns->nr_running += rq->cfs.h_nr_running;
> > +             ns->compute_capacity += capacity_of(cpu);
> > +     }
> > +
> > +     ns->weight = cpumask_weight(cpumask_of_node(nid));
> > +
> > +     ns->node_type = numa_classify(env->imbalance_pct, ns);
> > +}
> > +
> >  static void task_numa_assign(struct task_numa_env *env,
> >                            struct task_struct *p, long imp)
> >  {
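
For completeness, since the hunk above only forward declares them:
cpu_load() and cpu_util() are the plain PELT accessors, roughly the
following (paraphrasing fair.c and leaving out the UTIL_EST and
clamping details):

static unsigned long cpu_load(struct rq *rq)
{
	/* PELT load_avg; runnable_load_avg is gone after this series. */
	return cfs_rq_load_avg(&rq->cfs);
}

static unsigned long cpu_util(int cpu)
{
	struct cfs_rq *cfs_rq = &cpu_rq(cpu)->cfs;
	unsigned long util = READ_ONCE(cfs_rq->avg.util_avg);

	/* Cap by the CPU's original capacity. */
	return min(util, capacity_orig_of(cpu));
}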
