From: Vincent Guittot <vincent.guittot@linaro.org>
To: Chris Mason <clm@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Rik van Riel <riel@surriel.com>,
linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] fix scheduler regression from "sched/fair: Rework load_balance()"
Date: Mon, 26 Oct 2020 16:18:46 +0100
Message-ID: <20201026151846.GA17073@vingu-book>
In-Reply-To: <83A9BEDF-20BB-4BAD-AABD-0EECB92BF8DF@fb.com>
On Monday 26 Oct 2020 at 11:05:35 (-0400), Chris Mason wrote:
>
>
> On 26 Oct 2020, at 10:24, Vincent Guittot wrote:
>
> > On Monday 26 Oct 2020 at 08:45:27 (-0400), Chris Mason wrote:
> > > On 26 Oct 2020, at 4:39, Vincent Guittot wrote:
> > >
> > > > Hi Chris
> > > >
> > > > On Sat, 24 Oct 2020 at 01:49, Chris Mason <clm@fb.com> wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > We’re validating a new kernel in the fleet, and compared
> > > > > with v5.2,
> > > >
> > > > Which version are you using?
> > > > Several improvements have been added since v5.5 and the rework of
> > > > load_balance.
> > >
> > > We’re validating v5.6, but all of the numbers referenced in this patch are
> > > against v5.9. I usually try to back port my way to victory on this kind of
> > > thing, but mainline seems to behave exactly the same as 0b0695f2b34a wrt
> > > this benchmark.
> >
> > ok. Thanks for the confirmation
> >
> > I have been able to reproduce the problem on my setup.
>
> Thanks for taking a look! Can I ask what parameters you used on schbench,
> and what kind of results you saw? Mostly I’m trying to make sure it’s a
> useful tool, but also the patch didn’t change things here.
>
With the latest tip/sched/core on my dual quad-core system:
schbench -t 4 -r 10 -c 1000000 -s 1000
Latency percentiles (usec)
50.0th: 16
75.0th: 23
90.0th: 32
95.0th: 41
*99.0th: 15120
99.5th: 15120
99.9th: 15120
min=0, max=15130
With the patch:
schbench -t 4 -r 10 -c 1000000 -s 1000
Latency percentiles (usec)
50.0th: 28
75.0th: 32
90.0th: 36
95.0th: 56
*99.0th: 1310
99.5th: 1310
99.9th: 1310
min=0, max=1309
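
To put the two runs side by side, here is a minimal sketch of the arithmetic. The helper name latency_ratio is hypothetical and not part of schbench. It only divides one reported latency by another.

```c
/*
 * Hypothetical helper, not part of schbench: ratio between two
 * latency figures (in usec) taken from the percentile tables above.
 */
static double latency_ratio(unsigned long numerator_usec,
			    unsigned long denominator_usec)
{
	return (double)numerator_usec / (double)denominator_usec;
}
```

With the figures above, latency_ratio(15120, 1310) is roughly 11.5 for the 99.0th percentile (the tail shrinks by about an order of magnitude), while latency_ratio(28, 16) is 1.75 for the median, which gets somewhat worse.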
> >
> > Could you try the fix below?
> >
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9049,7 +9049,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
> >  	 * emptying busiest.
> >  	 */
> >  	if (local->group_type == group_has_spare) {
> > -		if (busiest->group_type > group_fully_busy) {
> > +		if ((busiest->group_type > group_fully_busy) &&
> > +		    (busiest->group_weight > 1)) {
> >  			/*
> >  			 * If busiest is overloaded, try to fill spare
> >  			 * capacity. This might end up creating spare capacity
> >
> >
> > When we calculate an imbalance at the smallest level, i.e. between CPUs
> > (group_weight == 1), we should try to spread tasks across CPUs instead of
> > trying to fill spare capacity.
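
The condition in the quoted patch can be sketched in isolation as below. This is a simplified model under stated assumptions, not the kernel code: the enum and struct names are hypothetical stand-ins, and the real group_type enum in kernel/sched/fair.c has more states than the three shown here.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the kernel's group classification. */
enum group_type_sketch {
	SKETCH_HAS_SPARE = 0,
	SKETCH_FULLY_BUSY,
	SKETCH_OVERLOADED,
};

/* Hypothetical stand-in for the per-group statistics used by the check. */
struct group_sketch {
	enum group_type_sketch type;
	unsigned int weight;	/* number of CPUs in the group */
};

/*
 * Returns true only when the patched condition would select the
 * "fill spare capacity" path. A false result means some other path
 * is taken; for a single-CPU busiest group the intent stated above
 * is to spread tasks instead.
 */
static bool should_fill_spare_capacity(const struct group_sketch *local,
				       const struct group_sketch *busiest)
{
	if (local->type != SKETCH_HAS_SPARE)
		return false;

	return busiest->type > SKETCH_FULLY_BUSY && busiest->weight > 1;
}
```

In this model, an overloaded busiest group with weight 1 no longer takes the fill-spare-capacity path, whereas the same group with weight 4 still does.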
>
> With this patch on top of v5.9, my latencies are unchanged. I’m building
> against current Linus now just in case I’m missing other fixes.
>
I can't recall any changes in mainline that would make a difference.
I had another way to fix it, but it could affect more use cases and the
improvement was smaller:
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ebe15e36f336..415927885228 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7707,7 +7707,7 @@ static int detach_tasks(struct lb_env *env)
 		case migrate_util:
 			util = task_util_est(p);

-			if (util > env->imbalance)
+			if ((util >> env->sd->nr_balance_failed) > env->imbalance)
 				goto next;

 			env->imbalance -= util;
--
>
> -chris
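
To illustrate what the alternative detach_tasks() change does, here is a minimal sketch of the patched comparison on its own. The function name skip_task_for_util is hypothetical. Only the expression is taken from the diff above.

```c
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the patched migrate_util check:
 * a task is skipped when its utilization, right-shifted by the
 * number of failed balance attempts, still exceeds the remaining
 * imbalance.
 *
 * Assumption: nr_balance_failed is smaller than the bit width of
 * unsigned long, otherwise the shift is undefined in C.
 */
static bool skip_task_for_util(unsigned long util,
			       unsigned int nr_balance_failed,
			       unsigned long imbalance)
{
	return (util >> nr_balance_failed) > imbalance;
}
```

Each failed attempt halves the value that is compared against the imbalance. For example, with util = 400 and imbalance = 100, the task is skipped after 0 or 1 failed attempts (400 and 200 both exceed 100) and becomes eligible after 2 (100 is not greater than 100).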