linux-kernel.vger.kernel.org archive mirror
From: Vincent Guittot <vincent.guittot@linaro.org>
To: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Sargun Dhillon <sargun@sargun.me>,
	Xie XiuQi <xiexiuqi@huawei.com>, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	xiezhipeng1@huawei.com, huawei.libin@huawei.com,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Dmitry Adamushko <dmitry.adamushko@gmail.com>,
	Rik van Riel <riel@surriel.com>
Subject: Re: [PATCH] sched: fix infinity loop in update_blocked_averages
Date: Fri, 28 Dec 2018 18:25:37 +0100
Message-ID: <CAKfTPtAMjfnNHu_JdDSi1BSMoXKgGkm2+G_QLsUBDAhq1wFvZQ@mail.gmail.com>
In-Reply-To: <20181228165451.GJ2509588@devbig004.ftw2.facebook.com>

On Fri, 28 Dec 2018 at 17:54, Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Fri, Dec 28, 2018 at 10:30:07AM +0100, Vincent Guittot wrote:
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index d1907506318a..88b9118b5191 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -7698,7 +7698,8 @@ static void update_blocked_averages(int cpu)
> > >                  * There can be a lot of idle CPU cgroups.  Don't let fully
> > >                  * decayed cfs_rqs linger on the list.
> > >                  */
> > > -               if (cfs_rq_is_decayed(cfs_rq))
> > > +               if (cfs_rq_is_decayed(cfs_rq) &&
> > > +                   rq->tmp_alone_branch == &rq->leaf_cfs_rq_list)
> > >                         list_del_leaf_cfs_rq(cfs_rq);
> >
> > This patch reduces the cases but I don't think it's enough because it
> > doesn't cover the case of unregister_fair_sched_group()
> > And we can still break the ordering of the cfs_rq
>
> So, if unregister_fair_sched_group() can corrupt list, the bug is
> there regardless of a9e7f6544b9ce, right?

I don't think so. Without a9e7f6544b9ce, the insertion in the list is
done only once, and we can't call unregister_fair_sched_group() while
an enqueue is ongoing, so tmp_alone_branch always points to
rq->leaf_cfs_rq_list.
a9e7f6544b9ce makes it possible for tmp_alone_branch not to point to
rq->leaf_cfs_rq_list.

>
> Is there a reason why we're building a dedicated list for avg
> propagation?  AFAICS, it's just doing depth first walk, which can be

We have this list of the rq's sched groups to update the load of each
cfs_rq and, as a result, the load of the task group. This is then used
to compute the share of a task group between CPUs.
This list must be ordered to correctly propagate the updates from the
leaves to the root.
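
To illustrate why the ordering matters, here is a minimal userspace
sketch. It is not the kernel code: struct toy_cfs_rq, toy_propagate()
and toy_demo_root_load() are hypothetical names invented for this
example, and "load" is a plain sum rather than the PELT averages the
scheduler really tracks. It only assumes what is stated above: one pass
over the list, with every child placed before its parent.

```c
#include <stddef.h>

/* Toy stand-in for a cfs_rq; NOT the kernel's struct cfs_rq. */
struct toy_cfs_rq {
	long own_load;              /* load queued directly on this node */
	long total_load;            /* own_load plus load pushed up by children */
	struct toy_cfs_rq *parent;  /* NULL for the root */
	struct toy_cfs_rq *next;    /* next entry in the list being walked */
};

/*
 * One pass over the list. Each node pushes its total to its parent.
 * This is only correct if every child appears before its parent, so
 * that a node has received all child contributions before it pushes.
 */
void toy_propagate(struct toy_cfs_rq *head)
{
	struct toy_cfs_rq *cur;

	for (cur = head; cur; cur = cur->next)
		cur->total_load = cur->own_load;

	for (cur = head; cur; cur = cur->next)
		if (cur->parent)
			cur->parent->total_load += cur->total_load;
}

/*
 * Build root <- mid <- leaf with own loads 1, 2, 3 and return the
 * root's total after one pass. ordered != 0 walks leaf, mid, root;
 * ordered == 0 walks mid, leaf, root (mid is visited too early).
 */
long toy_demo_root_load(int ordered)
{
	struct toy_cfs_rq root = { 1, 0, NULL, NULL };
	struct toy_cfs_rq mid  = { 2, 0, &root, NULL };
	struct toy_cfs_rq leaf = { 3, 0, &mid, NULL };
	struct toy_cfs_rq *head;

	if (ordered) {
		leaf.next = &mid;
		mid.next = &root;
		head = &leaf;
	} else {
		mid.next = &leaf;
		leaf.next = &root;
		head = &mid;
	}

	toy_propagate(head);
	return root.total_load;
}
```

With the leaf-first order the root ends up with 1 + 2 + 3 = 6. With mid
visited before leaf, mid pushes its load to the root before it has
received the leaf's contribution, so the root only sees 3 in that pass.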

> done without extra space as long as each node has the parent pointer,
> which they do.  Is the dedicated list an optimization?

It avoids having to parse and walk all the task group structs every time.
Instead, you just have to follow a linked list.

Regards,
Vincent
>
> Thanks.
>
> --
> tejun

