From: Aubrey Li <aubrey.intel@gmail.com>
To: Vineeth Remanan Pillai <vpillai@digitalocean.com>
Cc: "Nishanth Aravamudan" <naravamudan@digitalocean.com>,
	"Julien Desfossez" <jdesfossez@digitalocean.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Tim Chen" <tim.c.chen@linux.intel.com>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Paul Turner" <pjt@google.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Linux List Kernel Mailing" <linux-kernel@vger.kernel.org>,
	"Subhra Mazumdar" <subhra.mazumdar@oracle.com>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Greg Kerr" <kerrnel@google.com>, "Phil Auld" <pauld@redhat.com>,
	"Aaron Lu" <aaron.lwe@gmail.com>,
	"Valentin Schneider" <valentin.schneider@arm.com>,
	"Mel Gorman" <mgorman@techsingularity.net>,
	"Pawan Gupta" <pawan.kumar.gupta@linux.intel.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [RFC PATCH v2 15/17] sched: Trivial forced-newidle balancer
Date: Wed, 24 Apr 2019 07:46:19 +0800	[thread overview]
Message-ID: <CAERHkrtP90QJCc-xxqxJbiWg+0jod_U2i-tHayw=Wd+WRDdTFQ@mail.gmail.com> (raw)
In-Reply-To: <002304fa58577c7926abe9f4cdc8039945985b3d.1556025155.git.vpillai@digitalocean.com>

On Wed, Apr 24, 2019 at 12:18 AM Vineeth Remanan Pillai
<vpillai@digitalocean.com> wrote:
>
> From: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> When a sibling is forced-idle to match the core-cookie; search for
> matching tasks to fill the core.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  include/linux/sched.h |   1 +
>  kernel/sched/core.c   | 131 +++++++++++++++++++++++++++++++++++++++++-
>  kernel/sched/idle.c   |   1 +
>  kernel/sched/sched.h  |   6 ++
>  4 files changed, 138 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index a4b39a28236f..1a309e8546cd 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -641,6 +641,7 @@ struct task_struct {
>  #ifdef CONFIG_SCHED_CORE
>         struct rb_node                  core_node;
>         unsigned long                   core_cookie;
> +       unsigned int                    core_occupation;
>  #endif
>
>  #ifdef CONFIG_CGROUP_SCHED
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 9e6e90c6f9b9..e8f5ec641d0a 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -217,6 +217,21 @@ struct task_struct *sched_core_find(struct rq *rq, unsigned long cookie)
>         return match;
>  }
>
> +struct task_struct *sched_core_next(struct task_struct *p, unsigned long cookie)
> +{
> +       struct rb_node *node = &p->core_node;
> +
> +       node = rb_next(node);
> +       if (!node)
> +               return NULL;
> +
> +       p = container_of(node, struct task_struct, core_node);
> +       if (p->core_cookie != cookie)
> +               return NULL;
> +
> +       return p;
> +}
> +
>  /*
>   * The static-key + stop-machine variable are needed such that:
>   *
> @@ -3672,7 +3687,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>         struct task_struct *next, *max = NULL;
>         const struct sched_class *class;
>         const struct cpumask *smt_mask;
> -       int i, j, cpu;
> +       int i, j, cpu, occ = 0;
>
>         if (!sched_core_enabled(rq))
>                 return __pick_next_task(rq, prev, rf);
> @@ -3763,6 +3778,9 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>                                 goto done;
>                         }
>
> +                       if (!is_idle_task(p))
> +                               occ++;
> +
>                         rq_i->core_pick = p;
>
>                         /*
> @@ -3786,6 +3804,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
>
>                                                 cpu_rq(j)->core_pick = NULL;
>                                         }
> +                                       occ = 1;
>                                         goto again;
>                                 }
>                         }
> @@ -3808,6 +3827,8 @@ next_class:;
>
>                 WARN_ON_ONCE(!rq_i->core_pick);
>
> +               rq_i->core_pick->core_occupation = occ;
> +
>                 if (i == cpu)
>                         continue;
>
> @@ -3823,6 +3844,114 @@ next_class:;
>         return next;
>  }
>
> +static bool try_steal_cookie(int this, int that)
> +{
> +       struct rq *dst = cpu_rq(this), *src = cpu_rq(that);
> +       struct task_struct *p;
> +       unsigned long cookie;
> +       bool success = false;
> +

try_steal_cookie() is called in the for_each_cpu_wrap() loop.
The root domain can be large, so we should avoid stealing a
cookie when the source rq has only one runnable task or the
destination rq is already busy.

The following patch eliminated a deadlock issue on my side in
v1, if I didn't miss anything. I'll double check with v2, but
at minimum it avoids unnecessary irq off/on and double rq
locking. In particular, it avoids lock contention with an idle
cpu that is already holding its rq lock in load_balance() while
we try to lock that rq here. I think it is worth picking up.

Thanks,
-Aubrey

---
 kernel/sched/core.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 191ebf9..973a75d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3876,6 +3876,13 @@ static bool try_steal_cookie(int this, int that)
        unsigned long cookie;
        bool success = false;

+       /*
+        * Don't steal if src is idle or has only one runnable task,
+        * or dst already has a runnable task.
+        */
+       if (src->nr_running <= 1 || unlikely(dst->nr_running >= 1))
+               return false;
+
        local_irq_disable();
        double_rq_lock(dst, src);

-- 
2.7.4
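The intent of that pre-check can be illustrated outside the kernel. Below is a
hypothetical userspace analogue (the `toy_rq`, `toy_double_lock()` and
`toy_try_steal()` names are mine, not kernel APIs): an unlocked pre-check skips
the expensive double-lock entirely when stealing cannot possibly help, and the
double-lock itself takes the two locks in address order, the same
deadlock-avoidance rule double_rq_lock() relies on. A stale unlocked read only
costs a missed or failed steal attempt, never correctness, because the
condition is re-checked under the locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy runqueue: just a lock and a runnable-task count. */
struct toy_rq {
	pthread_mutex_t lock;
	int nr_running;
};

/* Lock two runqueues in address order, mirroring double_rq_lock(). */
static void toy_double_lock(struct toy_rq *a, struct toy_rq *b)
{
	if (a > b) { struct toy_rq *t = a; a = b; b = t; }
	pthread_mutex_lock(&a->lock);
	if (a != b)
		pthread_mutex_lock(&b->lock);
}

static void toy_double_unlock(struct toy_rq *a, struct toy_rq *b)
{
	pthread_mutex_unlock(&a->lock);
	if (a != b)
		pthread_mutex_unlock(&b->lock);
}

/* Move one task from src to dst, with the cheap unlocked pre-check. */
static bool toy_try_steal(struct toy_rq *dst, struct toy_rq *src)
{
	/*
	 * Unlocked pre-check: bail before touching either lock when src
	 * has nothing spare or dst is already busy.  The reads may be
	 * stale; that only wastes (or skips) one attempt.
	 */
	if (src->nr_running <= 1 || dst->nr_running >= 1)
		return false;

	toy_double_lock(dst, src);
	bool success = false;
	/* Re-check under the locks before actually moving anything. */
	if (src->nr_running > 1 && dst->nr_running == 0) {
		src->nr_running--;
		dst->nr_running++;
		success = true;
	}
	toy_double_unlock(dst, src);
	return success;
}
```

The same shape applies in the patch above: the `nr_running` test filters out
hopeless (src, dst) pairs before local_irq_disable()/double_rq_lock() are ever
reached.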

> +       local_irq_disable();
> +       double_rq_lock(dst, src);
> +
> +       cookie = dst->core->core_cookie;
> +       if (!cookie)
> +               goto unlock;
> +
> +       if (dst->curr != dst->idle)
> +               goto unlock;
> +
> +       p = sched_core_find(src, cookie);
> +       if (p == src->idle)
> +               goto unlock;
> +
> +       do {
> +               if (p == src->core_pick || p == src->curr)
> +                       goto next;
> +
> +               if (!cpumask_test_cpu(this, &p->cpus_allowed))
> +                       goto next;
> +
> +               if (p->core_occupation > dst->idle->core_occupation)
> +                       goto next;
> +
> +               p->on_rq = TASK_ON_RQ_MIGRATING;
> +               deactivate_task(src, p, 0);
> +               set_task_cpu(p, this);
> +               activate_task(dst, p, 0);
> +               p->on_rq = TASK_ON_RQ_QUEUED;
> +
> +               resched_curr(dst);
> +
> +               success = true;
> +               break;
> +
> +next:
> +               p = sched_core_next(p, cookie);
> +       } while (p);
> +
> +unlock:
> +       double_rq_unlock(dst, src);
> +       local_irq_enable();
> +
> +       return success;
> +}
> +
> +static bool steal_cookie_task(int cpu, struct sched_domain *sd)
> +{
> +       int i;
> +
> +       for_each_cpu_wrap(i, sched_domain_span(sd), cpu) {
> +               if (i == cpu)
> +                       continue;
> +
> +               if (need_resched())
> +                       break;
> +
> +               if (try_steal_cookie(cpu, i))
> +                       return true;
> +       }
> +
> +       return false;
> +}
> +
> +static void sched_core_balance(struct rq *rq)
> +{
> +       struct sched_domain *sd;
> +       int cpu = cpu_of(rq);
> +
> +       rcu_read_lock();
> +       raw_spin_unlock_irq(rq_lockp(rq));
> +       for_each_domain(cpu, sd) {
> +               if (!(sd->flags & SD_LOAD_BALANCE))
> +                       break;
> +
> +               if (need_resched())
> +                       break;
> +
> +               if (steal_cookie_task(cpu, sd))
> +                       break;
> +       }
> +       raw_spin_lock_irq(rq_lockp(rq));
> +       rcu_read_unlock();
> +}
> +
> +static DEFINE_PER_CPU(struct callback_head, core_balance_head);
> +
> +void queue_core_balance(struct rq *rq)
> +{
> +       if (!sched_core_enabled(rq))
> +               return;
> +
> +       if (!rq->core->core_cookie)
> +               return;
> +
> +       if (!rq->nr_running) /* not forced idle */
> +               return;
> +
> +       queue_balance_callback(rq, &per_cpu(core_balance_head, rq->cpu), sched_core_balance);
> +}
> +
>  #else /* !CONFIG_SCHED_CORE */
>
>  static struct task_struct *
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index e7f38da60373..44decdcccba1 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -387,6 +387,7 @@ static void set_next_task_idle(struct rq *rq, struct task_struct *next)
>  {
>         update_idle_core(rq);
>         schedstat_inc(rq->sched_goidle);
> +       queue_core_balance(rq);
>  }
>
>  static struct task_struct *
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 4cfde289610d..2a5f5a6b11ae 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1013,6 +1013,8 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
>         return &rq->__lock;
>  }
>
> +extern void queue_core_balance(struct rq *rq);
> +
>  #else /* !CONFIG_SCHED_CORE */
>
>  static inline bool sched_core_enabled(struct rq *rq)
> @@ -1025,6 +1027,10 @@ static inline raw_spinlock_t *rq_lockp(struct rq *rq)
>         return &rq->__lock;
>  }
>
> +static inline void queue_core_balance(struct rq *rq)
> +{
> +}
> +
>  #endif /* CONFIG_SCHED_CORE */
>
>  #ifdef CONFIG_SCHED_SMT
> --
> 2.17.1
>
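For reference, the cookie walk that sched_core_find()/sched_core_next()
perform over the per-rq rbtree in the patch can be sketched with a plain
sorted array. This is a hypothetical userspace illustration (all `toy_*`
names are mine), not kernel code: tasks are kept ordered by cookie, find
returns the first match, and next advances one slot and stops as soon as the
cookie no longer matches — the same early-out sched_core_next() does when
rb_next() leaves the run of equal cookies.

```c
#include <assert.h>
#include <stddef.h>

/* Toy task: only the fields the cookie walk cares about. */
struct toy_task {
	unsigned long cookie;
	int id;
};

/*
 * tasks[] is sorted by cookie, standing in for the per-rq rbtree.
 * Return the first task with a matching cookie, or NULL.
 */
static struct toy_task *toy_core_find(struct toy_task *tasks, size_t n,
				      unsigned long cookie)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].cookie == cookie)
			return &tasks[i];
	return NULL;
}

/* Advance to the next task; stop when the cookie no longer matches. */
static struct toy_task *toy_core_next(struct toy_task *tasks, size_t n,
				      struct toy_task *p, unsigned long cookie)
{
	size_t i = (size_t)(p - tasks) + 1;

	if (i >= n)
		return NULL;	/* ran off the end, like rb_next() == NULL */
	if (tasks[i].cookie != cookie)
		return NULL;	/* left the run of equal cookies */
	return &tasks[i];
}
```

try_steal_cookie() uses exactly this find/next pair to scan every candidate
with the core's cookie on the source rq until one is migratable.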
