From: 王贇 <yun.wang@linux.alibaba.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	"open list:SCHEDULER" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] sched: fix the nonsense shares when load of cfs_rq is too small
Date: Thu, 5 Mar 2020 09:23:55 +0800
Message-ID: <a22aa816-df93-3bf2-20be-c3eaae17628c@linux.alibaba.com>
In-Reply-To: <CAKfTPtCnwUKCNbmGR-oErNrF+H+D0FPZPVS=d4m3mvr8Hc7ivQ@mail.gmail.com>



On 2020/3/4 5:43 PM, Vincent Guittot wrote:
> On Wed, 4 Mar 2020 at 09:47, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>>
>> On Wed, 4 Mar 2020 at 02:19, 王贇 <yun.wang@linux.alibaba.com> wrote:
>>>
>>>
>>>
>>> On 2020/3/4 3:52 AM, Peter Zijlstra wrote:
>>> [snip]
>>>>> The reason is that group B has its shares set to 2, which makes
>>>>> group A's 'cfs_rq->load.weight' very small.
>>>>>
>>>>> And in calc_group_shares() we calculate shares as:
>>>>>
>>>>>   load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
>>>>>   shares = (tg_shares * load) / tg_weight;
>>>>>
>>>>> Since 'cfs_rq->load.weight' is too small, the load becomes 0
>>>>> here. Although 'tg_shares' is 102400, the shares of the se that
>>>>> stands for group A on the root cfs_rq become 2.
>>>>
>>>> Argh, because A->cfs_rq.load.weight is B->se.load.weight which is
>>>> B->shares/nr_cpus.
>>>
>>> Yeah, that's exactly why it happens: even with the share of 2 scaled
>>> up to 2048, on a 96-CPU platform each CPU gets only 21 in the equal case.
>>>
>>>>
>>>>> Meanwhile the se of D on the root cfs_rq is far bigger than 2, so
>>>>> it wins the battle.
>>>>>
>>>>> This patch adds a check for zero load and sets it to MIN_SHARES
>>>>> to fix the nonsense shares; with it applied, group C wins as
>>>>> expected.
>>>>>
>>>>> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
>>>>> ---
>>>>>  kernel/sched/fair.c | 2 ++
>>>>>  1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index 84594f8aeaf8..53d705f75fa4 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>> @@ -3182,6 +3182,8 @@ static long calc_group_shares(struct cfs_rq *cfs_rq)
>>>>>      tg_shares = READ_ONCE(tg->shares);
>>>>>
>>>>>      load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
>>>>> +    if (!load && cfs_rq->load.weight)
>>>>> +            load = MIN_SHARES;
>>>>>
>>>>>      tg_weight = atomic_long_read(&tg->load_avg);
>>>>
>>>> Yeah, I suppose that'll do. Hurmph, wants a comment though.
>>>>
>>>> But that has me looking at other users of scale_load_down(), and doesn't
>>>> at least update_tg_cfs_load() suffer the same problem?
>>>
>>> Good point :-) I'm not sure, but is scale_load_down() supposed to scale
>>> small values down to 0? If not, maybe we should fix the helper to make
>>> sure it returns at least some real load, like:
>>>
>>> # define scale_load_down(w) ((w + (1 << SCHED_FIXEDPOINT_SHIFT)) >> SCHED_FIXEDPOINT_SHIFT)
>>
>> you will add +1 of nice prio for each device
> 
> Of course, it's not prio but only weight which is different

That's right, it should only handle the problem cases.
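
To make the difference concrete, here is a minimal user-space sketch (not
kernel code) of the three scale_load_down() variants discussed in this
thread: the current one, the rounding-up one quoted above, and the
conditional one quoted further down. It assumes SCHED_FIXEDPOINT_SHIFT is 10
and MIN_SHARES is 2, matching the "2 scaled up to 2048" figure above; the
helper names are hypothetical and exist only for this illustration.

```c
/* Assumed values for this sketch; they match the figures in the thread. */
#define SCHED_FIXEDPOINT_SHIFT 10
#define MIN_SHARES 2UL

/* Scaled-up weight that each CPU's group se gets when shares are split evenly. */
static inline unsigned long per_cpu_weight(unsigned long shares,
					   unsigned long nr_cpus)
{
	return (shares << SCHED_FIXEDPOINT_SHIFT) / nr_cpus;
}

/* Current behaviour: a plain shift, so any weight below 1024 becomes 0. */
static inline unsigned long scale_down_orig(unsigned long w)
{
	return w >> SCHED_FIXEDPOINT_SHIFT;
}

/* Rounding-up variant: never 0, but every weight comes out one higher. */
static inline unsigned long scale_down_add(unsigned long w)
{
	return (w + (1UL << SCHED_FIXEDPOINT_SHIFT)) >> SCHED_FIXEDPOINT_SHIFT;
}

/* Conditional variant: only a result of 0 is replaced by MIN_SHARES. */
static inline unsigned long scale_down_floor(unsigned long w)
{
	unsigned long scaled = w >> SCHED_FIXEDPOINT_SHIFT;

	return scaled ? scaled : MIN_SHARES;
}
```

With shares of 2 on 96 CPUs, per_cpu_weight() gives 21, which the three
helpers turn into 0, 1 and 2 respectively. For a scaled nice-0 weight
(1024 << 10) the rounding-up variant returns 1025 rather than 1024, which
is the "weight is different" side effect. Note that the conditional variant
as written also maps a weight of exactly 0 to MIN_SHARES.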

Regards,
Michael Wang

> 
>>
>> Should we instead use:
>> # define scale_load_down(w) ((w >> SCHED_FIXEDPOINT_SHIFT) ? (w >> SCHED_FIXEDPOINT_SHIFT) : MIN_SHARES)
>>
>> Regards,
>> Vincent
>>
>>>
>>> Regards,
>>> Michael Wang
>>>
>>>>
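
For reference, the effect of the two-line check in the quoted patch can be
modelled in user space roughly as below. This is a simplified sketch under
stated assumptions, not the kernel's calc_group_shares(): the function and
parameter names are hypothetical, the per-cfs_rq adjustment of tg_weight is
left out, and the final clamp to [MIN_SHARES, tg_shares] is assumed from the
"shares ... become 2" result described above.

```c
/* Assumed values for this sketch; they match the figures in the thread. */
#define SCHED_FIXEDPOINT_SHIFT 10
#define MIN_SHARES 2L

static inline long clamp_long(long v, long lo, long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Simplified model of the shares calculation quoted in the patch:
 *   load   = max(scale_load_down(weight), load_avg)
 *   shares = tg_shares * load / tg_weight
 * apply_fix toggles the "if (!load && weight) load = MIN_SHARES" check.
 */
static inline long model_group_shares(long tg_shares, unsigned long weight,
				      long load_avg, long tg_weight,
				      int apply_fix)
{
	long load = (long)(weight >> SCHED_FIXEDPOINT_SHIFT);
	long shares;

	if (load_avg > load)
		load = load_avg;

	/* The proposed check: a non-zero weight must not yield zero load. */
	if (apply_fix && !load && weight)
		load = MIN_SHARES;

	shares = tg_shares * load;
	if (tg_weight)
		shares /= tg_weight;

	return clamp_long(shares, MIN_SHARES, tg_shares);
}
```

With tg_shares of 102400, a weight of 21, a load_avg of 0 and an assumed
tg_weight of 2, the model returns 2 without the check and 102400 with it.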
