All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vincent Guittot <vincent.guittot@linaro.org>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Rik van Riel <riel@surriel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Gautham R Shenoy <ego@linux.vnet.ibm.com>,
	Parth Shah <parth@linux.ibm.com>,
	Aubrey Li <aubrey.li@linux.intel.com>
Subject: Re: [PATCH v3 2/8] sched/fair: Maintain the identity of idle-core
Date: Tue, 25 May 2021 09:11:06 +0200	[thread overview]
Message-ID: <CAKfTPtCMGqdSzDx+ioHpFhURAOESxR6SpH1T_GP-fb83xdRUEA@mail.gmail.com> (raw)
In-Reply-To: <20210522141045.GJ2633526@linux.vnet.ibm.com>

On Sat, 22 May 2021 at 16:11, Srikar Dronamraju
<srikar@linux.vnet.ibm.com> wrote:
>
> * Vincent Guittot <vincent.guittot@linaro.org> [2021-05-22 14:42:00]:
>
> > On Fri, 21 May 2021 at 15:31, Srikar Dronamraju
> > <srikar@linux.vnet.ibm.com> wrote:
> > >
> > > * Vincent Guittot <vincent.guittot@linaro.org> [2021-05-21 14:36:15]:
> > >
> > > > On Thu, 13 May 2021 at 09:40, Srikar Dronamraju
> > > > <srikar@linux.vnet.ibm.com> wrote:

...

> > > > > +#endif
> > > >
> > > > CPUA-core0 enters idle
> > > > All other CPUs of core0 are already idle
> > > > set idle_core = core0
> > > > CPUB-core1 enters idle
> > > > All other CPUs of core1 are already idle so core1 becomes idle
> > > >
> > > > A task wakes up and select_idle_core returns CPUA-core0
> > > > then idle_core=-1
> > > >
> > > > At next wake up, we skip select_idlecore whereas core1 is idle
> > > >
> > > > Do I miss something ?
> > > >
> > >
> > > You are right, but this is similar to what we do currently do too. Even
> > > without this patch, we got ahead an unconditionally (We dont even have an
> > > option to see if the selected CPU was from an idle-core.) set the idle-core
> > > to -1. (Please see the hunk I removed below)
> >
> > The current race window is limited between select_idle_core() not
> > finding an idle core and the call of set_idle_cores(this, false);
> > later in end select_idle_cpu().
> >
>
> However lets say there was only one idle-core, and select_idle_core() finds
> this idle-core, it just doesn't reset has-idle-core. So on the next wakeup,
> we end up iterating through the whole LLC to find if we have an idle-core.

Yes, the current algorithm is clearing the idle core flag only when it
hasn't been able to find one in order to stay cheap in the fast wakeup
path.


>
> Also even if there were more than one idle-core in LLC and the task had a
> limited cpu_allowed_list, and hence had to skip the idle-core, then we still
> go ahead and reset the idle-core.
>
> > In your proposal, the race is not limited in time anymore. As soon as
> > the 1st core being idle and setting idle_core is then selected by
> > select_idle_core, then idle_core is broken
> >
>
> Yes, but with the next patch, as soon as a CPU within this LLC goes to idle,
> it will search and set the right idle-core.
>
> > >
> > > I try to improve upon this in the next iteration. But that again we are
> > > seeing some higher utilization probably with that change.
> > >
> > > I plan to move to a cpumask based approach in v4.
> > > By which we dont have to search for setting an idle-core but we still know
> > > if any idle-cores are around. However that will have the extra penalty of
> > > atomic operations that you commented to in one of my patches.
> > >
> > > But if you have other ideas, I would be willing to try out.
> > >
> > > >
> > > >
> > > > >                                 return i;
> > > > > +                       }
> > > > >
> > > > >                 } else {
> > > > >                         if (!--nr)
> > > > > @@ -6218,9 +6226,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> > > > >                 }
> > > > >         }
> > > > >
> > > > > -       if (has_idle_core)
> > > > > -               set_idle_cores(this, false);
> > > > > -
> > >
> > > I was referring to this hunk.
> > >
> > > > >         if (sched_feat(SIS_PROP) && !has_idle_core) {
> > > > >                 time = cpu_clock(this) - time;
> > > > >                 update_avg(&this_sd->avg_scan_cost, time);
> > > > > @@ -6276,10 +6281,9 @@ static inline bool asym_fits_capacity(int task_util, int cpu)
> > > > >   */
> > > > >  static int select_idle_sibling(struct task_struct *p, int prev, int target)
> > > > >  {
> > > > > -       bool has_idle_core = false;
> > > > > +       int i, recent_used_cpu, idle_core = -1;
> > > > >         struct sched_domain *sd;
> > > > >         unsigned long task_util;
> > > > > -       int i, recent_used_cpu;
> > > > >
> > > > >         /*
> > > > >          * On asymmetric system, update task utilization because we will check
> > > > > @@ -6357,16 +6361,16 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> > > > >                 return target;
> > > > >
> > > > >         if (sched_smt_active()) {
> > > > > -               has_idle_core = test_idle_cores(target, false);
> > > > > +               idle_core = get_idle_core(target, -1);
> > > > >
> > > > > -               if (!has_idle_core && cpus_share_cache(prev, target)) {
> > > > > +               if (idle_core < 0 && cpus_share_cache(prev, target)) {
> > > > >                         i = select_idle_smt(p, sd, prev);
> > > > >                         if ((unsigned int)i < nr_cpumask_bits)
> > > > >                                 return i;
> > > > >                 }
> > > > >         }
> > > > >
> > > > > -       i = select_idle_cpu(p, sd, has_idle_core, target);
> > > > > +       i = select_idle_cpu(p, sd, idle_core, target);
> > > > >         if ((unsigned)i < nr_cpumask_bits)
> > > > >                 return i;
> > > > >
> > > > > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > > > > index a189bec13729..22fbb50b036e 100644
> > > > > --- a/kernel/sched/sched.h
> > > > > +++ b/kernel/sched/sched.h
> > > > > @@ -1491,6 +1491,9 @@ static inline struct sched_domain *lowest_flag_domain(int cpu, int flag)
> > > > >  DECLARE_PER_CPU(struct sched_domain __rcu *, sd_llc);
> > > > >  DECLARE_PER_CPU(int, sd_llc_size);
> > > > >  DECLARE_PER_CPU(int, sd_llc_id);
> > > > > +#ifdef CONFIG_SCHED_SMT
> > > > > +DECLARE_PER_CPU(int, smt_id);
> > > > > +#endif
> > > > >  DECLARE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
> > > > >  DECLARE_PER_CPU(struct sched_domain __rcu *, sd_numa);
> > > > >  DECLARE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
> > > > > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > > > > index 55a0a243e871..232fb261dfc2 100644
> > > > > --- a/kernel/sched/topology.c
> > > > > +++ b/kernel/sched/topology.c
> > > > > @@ -644,6 +644,9 @@ static void destroy_sched_domains(struct sched_domain *sd)
> > > > >  DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
> > > > >  DEFINE_PER_CPU(int, sd_llc_size);
> > > > >  DEFINE_PER_CPU(int, sd_llc_id);
> > > > > +#ifdef CONFIG_SCHED_SMT
> > > > > +DEFINE_PER_CPU(int, smt_id);
> > > > > +#endif
> > > > >  DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
> > > > >  DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
> > > > >  DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
> > > > > @@ -667,6 +670,9 @@ static void update_top_cache_domain(int cpu)
> > > > >         rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
> > > > >         per_cpu(sd_llc_size, cpu) = size;
> > > > >         per_cpu(sd_llc_id, cpu) = id;
> > > > > +#ifdef CONFIG_SCHED_SMT
> > > > > +       per_cpu(smt_id, cpu) = cpumask_first(cpu_smt_mask(cpu));
> > > > > +#endif
> > > > >         rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
> > > > >
> > > > >         sd = lowest_flag_domain(cpu, SD_NUMA);
> > > > > @@ -1497,6 +1503,7 @@ sd_init(struct sched_domain_topology_level *tl,
> > > > >                 sd->shared = *per_cpu_ptr(sdd->sds, sd_id);
> > > > >                 atomic_inc(&sd->shared->ref);
> > > > >                 atomic_set(&sd->shared->nr_busy_cpus, sd_weight);
> > > > > +               sd->shared->idle_core = -1;
> > > > >         }
> > > > >
> > > > >         sd->private = sdd;
> > > > > --
> > > > > 2.18.2
> > > > >
> > >
> > > --
> > > Thanks and Regards
> > > Srikar Dronamraju
>
> --
> Thanks and Regards
> Srikar Dronamraju

  reply	other threads:[~2021-05-25  7:11 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-13  7:40 [PATCH v3 0/8] sched/fair: wake_affine improvements Srikar Dronamraju
2021-05-13  7:40 ` [PATCH v3 1/8] sched/fair: Update affine statistics when needed Srikar Dronamraju
2021-05-13  7:40 ` [PATCH v3 2/8] sched/fair: Maintain the identity of idle-core Srikar Dronamraju
2021-05-21 12:36   ` Vincent Guittot
2021-05-21 13:31     ` Srikar Dronamraju
2021-05-22 12:42       ` Vincent Guittot
2021-05-22 14:10         ` Srikar Dronamraju
2021-05-25  7:11           ` Vincent Guittot [this message]
2021-05-13  7:40 ` [PATCH v3 3/8] sched/fair: Update idle-core more often Srikar Dronamraju
2021-05-13  7:40 ` [PATCH v3 4/8] sched/fair: Prefer idle CPU to cache affinity Srikar Dronamraju
2021-05-13  7:40 ` [PATCH v3 5/8] sched/fair: Use affine_idler_llc for wakeups across LLC Srikar Dronamraju
2021-05-13  7:40 ` [PATCH v3 6/8] sched/idle: Move busy_cpu accounting to idle callback Srikar Dronamraju
2021-05-21 12:37   ` Vincent Guittot
2021-05-21 13:21     ` Srikar Dronamraju
2021-05-13  7:40 ` [PATCH v3 7/8] sched/fair: Remove ifdefs in waker_affine_idler_llc Srikar Dronamraju
2021-05-13  7:40 ` [PATCH v3 8/8] sched/fair: Dont iterate if no idle CPUs Srikar Dronamraju
2021-05-19  9:36 ` [PATCH v3 0/8] sched/fair: wake_affine improvements Mel Gorman
2021-05-19 16:55   ` Srikar Dronamraju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAKfTPtCMGqdSzDx+ioHpFhURAOESxR6SpH1T_GP-fb83xdRUEA@mail.gmail.com \
    --to=vincent.guittot@linaro.org \
    --cc=aubrey.li@linux.intel.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=ego@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@kernel.org \
    --cc=parth@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=valentin.schneider@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.