linux-kernel.vger.kernel.org archive mirror
From: Chengming Zhou <zhouchengming@bytedance.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Benjamin Segall <bsegall@google.com>,
	mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	dietmar.eggemann@arm.com, rostedt@goodmis.org, mgorman@suse.de,
	bristot@redhat.com, linux-kernel@vger.kernel.org,
	duanxiongchun@bytedance.com, songmuchun@bytedance.com,
	zhengqi.arch@bytedance.com
Subject: Re: [External] Re: [PATCH] sched/fair: update tg->load_avg and se->load in throttle_cfs_rq()
Date: Mon, 18 Apr 2022 21:20:25 +0800
Message-ID: <1edf3ea4-6c77-1cc1-9537-13acdfe67cf1@bytedance.com>
In-Reply-To: <CAKfTPtBWXyamX0jFSvgP3VnZacd5SNb_Yg9jAq1y0koHwr7DxQ@mail.gmail.com>

On 2022/4/15 15:51, Vincent Guittot wrote:
> On Fri, 15 Apr 2022 at 07:42, Chengming Zhou
> <zhouchengming@bytedance.com> wrote:
>>
>> On 2022/4/14 01:30, Benjamin Segall wrote:
>>> Chengming Zhou <zhouchengming@bytedance.com> writes:
>>>
>>>> We use update_load_avg(cfs_rq, se, 0) in throttle_cfs_rq(), so the
>>>> cfs_rq->tg_load_avg_contrib and task_group->load_avg won't be updated
>>>> even when the cfs_rq's load_avg has changed.
>>>>
>>>> And we also don't call update_cfs_group(se), so the se->load won't
>>>> be updated too.
>>>>
>>>> Change to use update_load_avg(cfs_rq, se, UPDATE_TG) and add
>>>> update_cfs_group(se) in throttle_cfs_rq(), like we do in
>>>> dequeue_task_fair().
>>>
>>> Hmm, this does look more correct; Vincent, was having this not do
>>> UPDATE_TG deliberate, or an accident that we all missed when checking?
> 
> The cost of UPDATE_TG/update_tg_load_avg() is not free and the parent
> cfs->load_avg should not change because of the throttling but only the
> cfs->weight so I don't see a real benefit of UPDATE_TG.

Hi Vincent,

If the current task was already dequeued before throttle_cfs_rq() runs from
pick_next_task_fair(), the parent cfs_rq does not call update_tg_load_avg()
at throttle time and has to wait for a later enqueue_entity(). That delays
the update of the parent cfs_rq->load_avg and of the load.weight of that
group se, so fairness between task groups may be delayed as well.

update_tg_load_avg() won't touch tg->load_avg when
abs(delta) <= cfs_rq->tg_load_avg_contrib / 64, so if load_avg is really
unchanged, shouldn't most of that cost already be avoided?
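
To make the threshold concrete, here is a minimal user-space sketch of that
check. It is only a model under stated assumptions: the struct and function
names (tg_model, cfs_rq_model, model_update_tg_load_avg) are made up for
illustration, tg->load_avg is a plain long instead of an atomic, and the
root_task_group special case is left out.

```c
#include <stdlib.h>	/* labs() */

/* Hypothetical, simplified stand-in for struct task_group. */
struct tg_model {
	long load_avg;			/* models tg->load_avg (atomic in the kernel) */
};

/* Hypothetical, simplified stand-in for struct cfs_rq. */
struct cfs_rq_model {
	long load_avg;			/* models cfs_rq->avg.load_avg */
	long tg_load_avg_contrib;	/* models cfs_rq->tg_load_avg_contrib */
	struct tg_model *tg;
};

/*
 * Fold the cfs_rq's load change into the group-wide sum only when it
 * differs from the last contribution by more than 1/64 of it.
 * Returns 1 if tg->load_avg was written, 0 if the update was skipped.
 */
static int model_update_tg_load_avg(struct cfs_rq_model *cfs_rq)
{
	long delta = cfs_rq->load_avg - cfs_rq->tg_load_avg_contrib;

	if (labs(delta) > cfs_rq->tg_load_avg_contrib / 64) {
		cfs_rq->tg->load_avg += delta;
		cfs_rq->tg_load_avg_contrib = cfs_rq->load_avg;
		return 1;
	}
	return 0;	/* small change: shared tg->load_avg is not touched */
}
```

With tg_load_avg_contrib at 1024 the threshold is 16, so in this model a
load_avg of 1040 is skipped and 1041 is propagated. The skipped path does
not write the shared counter at all.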

> 
> Chengming,
> have you faced an issue or this change is based on code review ?

This change is based on code review and the git log history.

Thanks.

> 
>>>
>>> It looks like the unthrottle_cfs_rq side got UPDATE_TG added later in
>>> the two-loops pass, but not the throttle_cfs_rq side.
>>
>> Yes, UPDATE_TG was added in unthrottle_cfs_rq() in commit 39f23ce07b93
>> ("sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list").
>>
>>>
>>> Also unthrottle_cfs_rq I'm guessing could still use update_cfs_group(se)
>>
>> It looks like we should also add update_cfs_group(se) in unthrottle_cfs_rq().
>>
>> Thanks.
>>
>>>
>>>
>>>>
>>>> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
>>>> ---
>>>>  kernel/sched/fair.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>> index d4bd299d67ab..b37dc1db7be7 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -4936,8 +4936,9 @@ static bool throttle_cfs_rq(struct cfs_rq *cfs_rq)
>>>>              if (!se->on_rq)
>>>>                      goto done;
>>>>
>>>> -            update_load_avg(qcfs_rq, se, 0);
>>>> +            update_load_avg(qcfs_rq, se, UPDATE_TG);
>>>>              se_update_runnable(se);
>>>> +            update_cfs_group(se);
>>>>
>>>>              if (cfs_rq_is_idle(group_cfs_rq(se)))
>>>>                      idle_task_delta = cfs_rq->h_nr_running;

Thread overview: 5+ messages
2022-04-13  4:16 [PATCH] sched/fair: update tg->load_avg and se->load in throttle_cfs_rq() Chengming Zhou
2022-04-13 17:30 ` Benjamin Segall
2022-04-15  5:42   ` [External] " Chengming Zhou
2022-04-15  7:51     ` Vincent Guittot
2022-04-18 13:20       ` Chengming Zhou [this message]
