From: "Chen, Rong A" <rong.a.chen@intel.com>
To: Vincent Guittot <vincent.guittot@linaro.org>,
linux-kernel@vger.kernel.org, mingo@redhat.com,
peterz@infradead.org
Cc: pauld@redhat.com, valentin.schneider@arm.com,
srikar@linux.vnet.ibm.com, quentin.perret@arm.com,
dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com,
hdanton@sina.com, parth@linux.ibm.com, riel@surriel.com
Subject: Re: [PATCH] sched/fair: fix rework of find_idlest_group()
Date: Wed, 23 Oct 2019 15:50:28 +0800
Message-ID: <b6dafa6d-6d9a-3499-85b8-52c5163cd38d@intel.com>
In-Reply-To: <1571762798-25900-1-git-send-email-vincent.guittot@linaro.org>
Tested-by: kernel test robot <rong.a.chen@intel.com>
On 10/23/2019 12:46 AM, Vincent Guittot wrote:
> The task, for which the scheduler looks for the idlest group of CPUs, must
> be discounted from all statistics in order to get a fair comparison
> between groups. This includes utilization, load, nr_running and idle_cpus.
>
> Such unfairness can be easily highlighted with the unixbench execl 1 task.
> This test continuously calls execve() and the scheduler looks for the idlest
> group/CPU on which it should place the task. Because the task runs on the
> local group/CPU, the latter seems already busy even if there is nothing
> else running on it. As a result, the scheduler will always select a
> group/CPU other than the local one.
>
> Fixes: 57abff067a08 ("sched/fair: Rework find_idlest_group()")
> Reported-by: kernel test robot <rong.a.chen@intel.com>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>
> This recovers most of the perf regression on my system and I have asked
> Rong if he can rerun the test with the patch to check that it fixes his
> system as well.
>
> kernel/sched/fair.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 83 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index a81c364..0ad4b21 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5379,6 +5379,36 @@ static unsigned long cpu_load(struct rq *rq)
> {
>  	return cfs_rq_load_avg(&rq->cfs);
> }
> +/*
> + * cpu_load_without - compute CPU load without any contributions from *p
> + * @rq: the runqueue of the CPU whose load is requested
> + * @p: the task whose load should be discounted
> + *
> + * The load of a CPU is defined by the load of tasks currently enqueued on that
> + * CPU as well as tasks which are currently sleeping after an execution on that
> + * CPU.
> + *
> + * This method returns the load of the specified CPU by discounting the load of
> + * the specified task, whenever the task is currently contributing to the CPU
> + * load.
> + */
> +static unsigned long cpu_load_without(struct rq *rq, struct task_struct *p)
> +{
> +	struct cfs_rq *cfs_rq;
> +	unsigned int load;
> +
> +	/* Task has no contribution or is new */
> +	if (cpu_of(rq) != task_cpu(p) || !READ_ONCE(p->se.avg.last_update_time))
> +		return cpu_load(rq);
> +
> +	cfs_rq = &rq->cfs;
> +	load = READ_ONCE(cfs_rq->avg.load_avg);
> +
> +	/* Discount task's load from CPU's load */
> +	lsub_positive(&load, task_h_load(p));
> +
> +	return load;
> +}
>
> static unsigned long capacity_of(int cpu)
> {
> @@ -8117,10 +8147,55 @@ static inline enum fbq_type fbq_classify_rq(struct rq *rq)
> struct sg_lb_stats;
>
> /*
> + * task_running_on_cpu - return 1 if @p is running on @cpu.
> + */
> +
> +static unsigned int task_running_on_cpu(int cpu, struct task_struct *p)
> +{
> +	/* Task has no contribution or is new */
> +	if (cpu != task_cpu(p) || !READ_ONCE(p->se.avg.last_update_time))
> +		return 0;
> +
> +	if (task_on_rq_queued(p))
> +		return 1;
> +
> +	return 0;
> +}
> +
> +/**
> + * idle_cpu_without - would a given CPU be idle without p?
> + * @cpu: the processor on which idleness is tested.
> + * @p: task which should be ignored.
> + *
> + * Return: 1 if the CPU would be idle. 0 otherwise.
> + */
> +static int idle_cpu_without(int cpu, struct task_struct *p)
> +{
> +	struct rq *rq = cpu_rq(cpu);
> +
> +	if ((rq->curr != rq->idle) && (rq->curr != p))
> +		return 0;
> +
> +	/*
> +	 * rq->nr_running can't be used but an updated version without the
> +	 * impact of p on cpu must be used instead. The updated nr_running
> +	 * must be computed and tested before calling idle_cpu_without().
> +	 */
> +
> +#ifdef CONFIG_SMP
> +	if (!llist_empty(&rq->wake_list))
> +		return 0;
> +#endif
> +
> +	return 1;
> +}
> +
> +/*
> * update_sg_wakeup_stats - Update sched_group's statistics for wakeup.
> - * @denv: The ched_domain level to look for idlest group.
> + * @sd: The sched_domain level to look for idlest group.
> * @group: sched_group whose statistics are to be updated.
> * @sgs: variable to hold the statistics for this group.
> + * @p: The task for which we look for the idlest group/CPU.
> */
> static inline void update_sg_wakeup_stats(struct sched_domain *sd,
> struct sched_group *group,
> @@ -8133,21 +8208,22 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
>
>  	for_each_cpu(i, sched_group_span(group)) {
>  		struct rq *rq = cpu_rq(i);
> +		unsigned int local;
>
> -		sgs->group_load += cpu_load(rq);
> +		sgs->group_load += cpu_load_without(rq, p);
>  		sgs->group_util += cpu_util_without(i, p);
> -		sgs->sum_h_nr_running += rq->cfs.h_nr_running;
> +		local = task_running_on_cpu(i, p);
> +		sgs->sum_h_nr_running += rq->cfs.h_nr_running - local;
>
> -		nr_running = rq->nr_running;
> +		nr_running = rq->nr_running - local;
>  		sgs->sum_nr_running += nr_running;
>
>  		/*
> -		 * No need to call idle_cpu() if nr_running is not 0
> +		 * No need to call idle_cpu_without() if nr_running is not 0
>  		 */
> -		if (!nr_running && idle_cpu(i))
> +		if (!nr_running && idle_cpu_without(i, p))
>  			sgs->idle_cpus++;
>
> -
>  	}
>
>  	/* Check if task fits in the group */
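For readers following the thread, the discounting idea in the quoted patch
can be sketched in user space. This is only an illustrative toy model, not
kernel code: struct toy_rq, struct toy_task, sub_positive() and the toy_*()
helpers are hypothetical names made up for this sketch, and the model ignores
PELT, task groups and the wake_list check. It only shows why removing the
task's own load and its own nr_running contribution makes an otherwise empty
local CPU look idle again.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a runqueue; NOT the kernel's struct rq. */
struct toy_rq {
	int cpu;                 /* CPU id */
	unsigned int nr_running; /* runnable tasks on this CPU */
	unsigned long load;      /* aggregate load of this CPU */
};

/* Toy stand-in for a task; NOT the kernel's struct task_struct. */
struct toy_task {
	int cpu;            /* CPU the task is attached to */
	int queued;         /* non-zero if the task is enqueued there */
	unsigned long load; /* the task's share of that CPU's load */
};

/* Subtract and clamp at zero, in the spirit of lsub_positive(). */
static unsigned long sub_positive(unsigned long val, unsigned long dec)
{
	return val > dec ? val - dec : 0;
}

/* Load of a CPU with the contribution of p removed, if p is on it. */
static unsigned long toy_load_without(const struct toy_rq *rq,
				      const struct toy_task *p)
{
	if (rq->cpu != p->cpu)
		return rq->load;

	return sub_positive(rq->load, p->load);
}

/* 1 if p is enqueued on this runqueue's CPU, 0 otherwise. */
static unsigned int toy_running_on_cpu(const struct toy_rq *rq,
				       const struct toy_task *p)
{
	return (rq->cpu == p->cpu && p->queued) ? 1 : 0;
}

/* Count the CPUs that would have nothing runnable if p were absent. */
static unsigned int toy_idle_cpus_without(const struct toy_rq *rqs, size_t n,
					  const struct toy_task *p)
{
	unsigned int idle = 0;
	size_t i;

	assert(rqs != NULL && p != NULL);

	for (i = 0; i < n; i++) {
		unsigned int nr = rqs[i].nr_running;

		/* Discount p itself, guarding against underflow. */
		if (nr && toy_running_on_cpu(&rqs[i], p))
			nr--;

		if (!nr)
			idle++;
	}

	return idle;
}
```

With two CPUs where CPU 0 runs only the task being placed (load 100,
nr_running 1) and CPU 1 is empty, toy_load_without() reports 0 for both CPUs
and toy_idle_cpus_without() counts both as idle, so in this model the local
CPU is no longer penalized for hosting the task itself.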