From: Vincent Guittot <vincent.guittot@linaro.org>
To: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mike Galbraith <efault@gmx.de>, Paul Turner <pjt@google.com>,
	Chris Mason <clm@fb.com>,
	kernel-team@fb.com
Subject: Re: [PATCH v2 for-4.12-fixes 2/2] sched/fair: Fix O(# total cgroups) in load balance path
Date: Thu, 11 May 2017 09:02:22 +0200
Message-ID: <CAKfTPtDPDseGBts+nzk=GqONXcwP0ZhETFd1ygsmi7P3FJDK6Q@mail.gmail.com>
In-Reply-To: <20170510144414.GA32165@htj.duckdns.org>

Hi Tejun,

On 10 May 2017 at 16:44, Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On Wed, May 10, 2017 at 08:50:14AM +0200, Vincent Guittot wrote:
>> On 9 May 2017 at 18:18, Tejun Heo <tj@kernel.org> wrote:
>> > Currently, rq->leaf_cfs_rq_list is a traversal ordered list of all
>> > live cfs_rqs which have ever been active on the CPU; unfortunately,
>> > this makes update_blocked_averages() O(# total cgroups) which isn't
>> > scalable at all.
>>
>> Dietmar raised a similar optimization in the past. The only question
>> was: what is the impact of re-adding the cfs_rq to leaf_cfs_rq_list on
>> the wake-up path? Have you done any measurements?
>
> Didn't do a perf test yet but it's several more branches and a local
> list operation on enqueue, which is already pretty expensive vs. load
> balance being O(total number of cgroups on the system).
>
> Anyways, I'll do some hackbench tests with several levels of layering.
>
>> > @@ -7008,6 +7009,14 @@ static void update_blocked_averages(int
>> >                 se = cfs_rq->tg->se[cpu];
>> >                 if (se && !skip_blocked_update(se))
>> >                         update_load_avg(se, 0);
>> > +
>> > +               /*
>> > +                * There can be a lot of idle CPU cgroups.  Don't let fully
>> > +                * decayed cfs_rqs linger on the list.
>> > +                */
>> > +               if (!cfs_rq->load.weight && !cfs_rq->avg.load_sum &&
>> > +                   !cfs_rq->avg.util_sum && !cfs_rq->runnable_load_sum)
>> > +                       list_del_leaf_cfs_rq(cfs_rq);
>>
>> list_add_leaf_cfs_rq() assumes that we always enqueue cfs_rq bottom-up.
>> By removing a cfs_rq, can't we break this assumption in some cases?
>
> We queue a cfs_rq on the leaf list when an se is queued on that
> cfs_rq for the first time, so queueing can happen in any order;

Sorry, what I mean is:
When the group entity of a cfs_rq is enqueued, we are sure that either
its parent is already enqueued or it will be enqueued in the same
sequence. We must be sure that no other branch will be enqueued in the
middle of the sequence and reset tmp_alone_branch.
This holds with the current implementation, but I wondered whether it
can break if we del/add the cfs_rq out of order.

That said, I haven't found a use case that breaks the sequence.
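
To make the concern concrete, here is a simplified user-space model of
the ordering rule. It is a sketch only, not the kernel code: the
model_* names and the two scenario functions are hypothetical, and the
three insertion cases are an assumption about how list_add_leaf_cfs_rq()
places a cfs_rq relative to its parent and to tmp_alone_branch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Circular doubly linked list, modelled on the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_insert(struct list_head *n, struct list_head *prev,
			struct list_head *next)
{
	next->prev = n;
	n->next = next;
	n->prev = prev;
	prev->next = n;
}

/* Insert n right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
	list_insert(n, head, head->next);
}

/* Insert n right before head. */
static void list_add_tail(struct list_head *n, struct list_head *head)
{
	list_insert(n, head->prev, head);
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Hypothetical stand-ins for struct cfs_rq and struct rq. */
struct model_cfs_rq {
	struct list_head leaf;		/* first member, so a node casts back */
	struct model_cfs_rq *parent;	/* NULL for the root cfs_rq */
	bool on_list;
};

struct model_rq {
	struct list_head leaf_list;
	struct list_head *tmp_alone_branch;
};

static void model_rq_init(struct model_rq *rq)
{
	list_init(&rq->leaf_list);
	rq->tmp_alone_branch = &rq->leaf_list;
}

/* Assumed insertion rule: a child must end up before its parent. */
static void model_add_leaf(struct model_rq *rq, struct model_cfs_rq *c)
{
	if (c->on_list)
		return;
	if (c->parent && c->parent->on_list) {
		/* Parent already listed: go right before it, branch connected. */
		list_add_tail(&c->leaf, &c->parent->leaf);
		rq->tmp_alone_branch = &rq->leaf_list;
	} else if (!c->parent) {
		/* Root: go to the tail, top of the tree reached. */
		list_add_tail(&c->leaf, &rq->leaf_list);
		rq->tmp_alone_branch = &rq->leaf_list;
	} else {
		/* Parent not listed yet: extend the pending "alone" branch. */
		list_add(&c->leaf, rq->tmp_alone_branch);
		rq->tmp_alone_branch = &c->leaf;
	}
	c->on_list = true;
}

static void model_del_leaf(struct model_cfs_rq *c)
{
	if (c->on_list) {
		list_del(&c->leaf);
		c->on_list = false;
	}
}

/* True when every listed child appears before its listed parent. */
static bool model_children_first(struct model_rq *rq)
{
	for (struct list_head *p = rq->leaf_list.next; p != &rq->leaf_list; p = p->next) {
		struct model_cfs_rq *c = (struct model_cfs_rq *)p;
		bool found = false;

		if (!c->parent || !c->parent->on_list)
			continue;
		for (struct list_head *q = p->next; q != &rq->leaf_list; q = q->next)
			found = found || q == &c->parent->leaf;
		if (!found)
			return false;
	}
	return true;
}

/* Bottom-up enqueue of b -> a -> p -> root, then del/re-add of b. */
bool model_bottom_up_ok(void)
{
	struct model_cfs_rq root = { .parent = NULL }, p = { .parent = &root };
	struct model_cfs_rq a = { .parent = &p }, b = { .parent = &a };
	struct model_rq rq;

	model_rq_init(&rq);
	model_add_leaf(&rq, &b);
	model_add_leaf(&rq, &a);
	model_add_leaf(&rq, &p);
	model_add_leaf(&rq, &root);
	model_del_leaf(&b);
	model_add_leaf(&rq, &b);
	return model_children_first(&rq);
}

/* Another branch (x) connects in the middle of the b -> a -> p sequence. */
bool model_interleaved_ok(void)
{
	struct model_cfs_rq root = { .parent = NULL }, p = { .parent = &root };
	struct model_cfs_rq a = { .parent = &p }, b = { .parent = &a };
	struct model_cfs_rq x = { .parent = &root };
	struct model_rq rq;

	model_rq_init(&rq);
	model_add_leaf(&rq, &root);
	model_add_leaf(&rq, &b);
	model_add_leaf(&rq, &x);	/* resets tmp_alone_branch too early */
	model_add_leaf(&rq, &a);	/* lands in front of b */
	model_add_leaf(&rq, &p);
	return model_children_first(&rq);
}
```

In this model, model_bottom_up_ok() keeps the order intact even after b
is removed and re-added, while model_interleaved_ok() ends up with a
listed before its child b. The second case is the interleaving that the
enqueue sequence has to rule out.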


> otherwise, we'd simply be doing list_add_tail().  AFAICS, removing and
> re-adding shouldn't break anything if the code wasn't broken before.
>
> Thanks.
>
> --
> tejun
