From: Peter Zijlstra <peterz@infradead.org>
To: 王贇 <yun.wang@linux.alibaba.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	"open list:SCHEDULER" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] sched: fix the nonsense shares when load of cfs_rq is too small
Date: Tue, 3 Mar 2020 20:52:45 +0100
Message-ID: <20200303195245.GF2596@hirez.programming.kicks-ass.net>
In-Reply-To: <44fa1cee-08db-e4ab-e5ab-08d6fbd421d7@linux.alibaba.com>

On Tue, Mar 03, 2020 at 10:17:03PM +0800, 王贇 wrote:
> During our testing, we found a case where shares no longer work
> correctly. The cgroup topology is:
> 
>   /sys/fs/cgroup/cpu/A		(shares=102400)
>   /sys/fs/cgroup/cpu/A/B	(shares=2)
>   /sys/fs/cgroup/cpu/A/B/C	(shares=1024)
> 
>   /sys/fs/cgroup/cpu/D		(shares=1024)
>   /sys/fs/cgroup/cpu/D/E	(shares=1024)
>   /sys/fs/cgroup/cpu/D/E/F	(shares=1024)
> 
> The same benchmark is running in groups C and F, no other tasks are
> running, and the benchmark is capable of consuming all the CPUs.
> 
> We expected group C to win more CPU resources, since it can enjoy
> all the shares of group A, but it is F that wins much more.
> 
> The reason is that group B has shares set to 2, which makes group
> A's 'cfs_rq->load.weight' very small.
> 
> And in calc_group_shares() we calculate shares as:
> 
>   load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
>   shares = (tg_shares * load) / tg_weight;
> 
> Since 'cfs_rq->load.weight' is too small, the load becomes 0 here;
> although 'tg_shares' is 102400, the shares of the se that stands
> for group A on the root cfs_rq become 2.

Argh, because A->cfs_rq.load.weight is B->se.load.weight which is
B->shares/nr_cpus.
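
To make the truncation concrete, here is a minimal user-space sketch of the
arithmetic. It assumes the 64-bit fixed-point definitions
(SCHED_FIXEDPOINT_SHIFT of 10, MIN_SHARES of 2); model_se_weight() is a
hypothetical helper that spreads a group's shares evenly over the busy CPUs,
which is only an approximation of what the scheduler computes.

```c
/* Assumption: these mirror the 64-bit kernel definitions. */
#define SCHED_FIXEDPOINT_SHIFT 10
#define MIN_SHARES 2L

static long scale_load(long w)      { return w << SCHED_FIXEDPOINT_SHIFT; }
static long scale_load_down(long w) { return w >> SCHED_FIXEDPOINT_SHIFT; }

/*
 * Hypothetical model, not kernel code: per-CPU weight of a group se
 * when the group's shares are split evenly across nr_cpus busy CPUs,
 * with the lower clamp to MIN_SHARES applied.
 */
static long model_se_weight(long shares, long nr_cpus)
{
	long w = scale_load(shares) / nr_cpus;

	return w < MIN_SHARES ? MIN_SHARES : w;
}
```

With shares=2 and, say, 16 CPUs, model_se_weight() gives a weight below
1 << SCHED_FIXEDPOINT_SHIFT, so scale_load_down() of it is 0. With
shares=1024 the scaled-down value stays non-zero.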

> Meanwhile, the se of D on the root cfs_rq is far bigger than 2, so
> it wins the battle.
> 
> This patch adds a check for zero load and sets it to MIN_SHARES to
> fix the nonsense shares; with it applied, group C wins as expected.
> 
> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
> ---
>  kernel/sched/fair.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 84594f8aeaf8..53d705f75fa4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3182,6 +3182,8 @@ static long calc_group_shares(struct cfs_rq *cfs_rq)
>  	tg_shares = READ_ONCE(tg->shares);
> 
>  	load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
> +	if (!load && cfs_rq->load.weight)
> +		load = MIN_SHARES;
> 
>  	tg_weight = atomic_long_read(&tg->load_avg);

Yeah, I suppose that'll do. Hurmph, wants a comment though.
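
As a rough illustration of what the two added lines change, the sketch below
models the shares computation in user space with and without the zero-load
guard. It is a simplification under stated assumptions, not the kernel
function: model_group_shares() and its parameters are hypothetical,
other_load stands in for the load contributed by the group's other CPUs, and
the constants assume the 64-bit definitions.

```c
/* Assumption: these mirror the 64-bit kernel definitions. */
#define SCHED_FIXEDPOINT_SHIFT 10
#define MIN_SHARES 2L

static long scale_load_down(long w) { return w >> SCHED_FIXEDPOINT_SHIFT; }

/*
 * Hypothetical user-space model of the shares calculation.
 * tg_shares:  group shares in scaled form (shares << 10)
 * weight:     cfs_rq->load.weight on this CPU
 * load_avg:   cfs_rq->avg.load_avg on this CPU
 * other_load: load attributed to the group's other CPUs (assumed input)
 * guard:      non-zero to apply the proposed zero-load check
 */
static long model_group_shares(long tg_shares, long weight, long load_avg,
			       long other_load, int guard)
{
	long load = scale_load_down(weight);
	long tg_weight, shares;

	if (load_avg > load)
		load = load_avg;

	/* Proposed check: a non-empty cfs_rq must not report zero load. */
	if (guard && !load && weight)
		load = MIN_SHARES;

	tg_weight = other_load + load;
	shares = tg_shares * load;
	if (tg_weight)
		shares /= tg_weight;

	/* Clamp to [MIN_SHARES, tg_shares], as the result is bounded. */
	if (shares < MIN_SHARES)
		shares = MIN_SHARES;
	if (shares > tg_shares)
		shares = tg_shares;
	return shares;
}
```

For group A (tg_shares of 102400 << 10) with a tiny non-zero weight and zero
load_avg, the unguarded model collapses to MIN_SHARES, while the guarded one
returns a share proportional to MIN_SHARES against the other CPUs' load.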

But that has me looking at other users of scale_load_down(), and doesn't
at least update_tg_cfs_load() suffer the same problem?
