From: Paul Turner <pjt@google.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel@vger.kernel.org, Venki Pallipadi <venki@google.com>,
	Srivatsa Vaddagiri <vatsa@in.ibm.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Subject: Re: [PATCH 12/16] sched: refactor update_shares_cpu() -> update_blocked_avgs()
Date: Wed, 11 Jul 2012 17:11:59 -0700
Message-ID: <CAPM31RJoLTCM+DPpPk7bsks6Nb8mLNba9CC+Et4vBnMM3rdENw@mail.gmail.com>
In-Reply-To: <1341489508.19870.30.camel@laptop>

On Thu, Jul 5, 2012 at 4:58 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> On Wed, 2012-06-27 at 19:24 -0700, Paul Turner wrote:
>> Now that running entities maintain their own load-averages the work we must do
>> in update_shares() is largely restricted to the periodic decay of blocked
>> entities.  This allows us to be a little less pessimistic regarding our
>> occupancy on rq->lock and the associated rq->clock updates required.
>
> So what you're saying is that since 'weight' now includes runtime
> behaviour (where we hope the recent past matches the near future) we
> don't need to update shares quite as often since the effect of
> sleep-wakeup cycles isn't near as big since they're already anticipated.


Not quite: This does not decrease the frequency of updates.

Rather:
The old code used to take and release rq->lock (nested under RCU)
around the update of *every* single task-group.

This is because the amount of work that we would have to do to update
a group was not deterministic (we did not know how many times we'd
have to fold our load average sums).  The new code just says: since
this work is now deterministic, let's thrash that lock less.  I suspect
10 is an incredibly conservative value; we probably want something more
like 100.  (But on the flip-side, I don't want to time out on the
wacko machine with 2000 cgroups, so I do have to release at SOME
point.)
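
To make the lock-traffic argument concrete, here is a minimal userspace
sketch.  It is not the kernel code: toy_lock, walk_per_group() and
walk_batched() are hypothetical names, and the "lock" only counts how
often it is taken.  It also releases after every `batch` updates
(pre-increment), whereas the quoted patch tests
`num_updates++ % 20 == 0`, which is first true right after the first
update.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for rq->lock: it only counts how often it is taken. */
struct toy_lock {
	int held;
	unsigned long acquisitions;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = 1;
	l->acquisitions++;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Old scheme: take and drop the lock around every group update. */
static unsigned long walk_per_group(struct toy_lock *l, size_t ngroups)
{
	for (size_t i = 0; i < ngroups; i++) {
		toy_lock_acquire(l);
		/* ... update one group ... */
		toy_lock_release(l);
	}
	return l->acquisitions;
}

/*
 * New scheme: hold the lock across the walk and briefly drop it after
 * every `batch` updates so the hold time stays bounded.
 */
static unsigned long walk_batched(struct toy_lock *l, size_t ngroups,
				  size_t batch)
{
	size_t num_updates = 0;

	assert(batch > 0);
	toy_lock_acquire(l);
	for (size_t i = 0; i < ngroups; i++) {
		/* ... update one group ... */
		if (++num_updates % batch == 0) {
			toy_lock_release(l);
			toy_lock_acquire(l);
		}
	}
	toy_lock_release(l);
	return l->acquisitions;
}
```

With 2000 groups, the per-group walk takes the toy lock 2000 times; the
batched walk takes it 101 times at a batch of 20 and 21 times at a
batch of 100.  The trade-off is the longer stretch spent holding the
lock between releases.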


> So how is the decay of blocked load still significant, surely that too
> is mostly part of the anticipated sleep/wake cycle already caught in the
> runtime behaviour.

Right, but the run-time behavior only lets us update things while
they're running.

This maintains periodic updates (and hence load-decay) for a group
whose entities have all gone to sleep.
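
A rough sketch of why the periodic pass matters for a fully blocked
group: nothing in the group runs, so nothing else would age its blocked
load.  The names below (toy_group, toy_update_blocked(),
decay_blocked_load()) are hypothetical, and the constants are
assumptions not stated in this message: a geometric decay with a
half-life of 32 periods, with y approximated in 16-bit fixed point.

```c
#include <assert.h>
#include <stdint.h>

#define HALF_LIFE_PERIODS 32u	/* assumed: y^32 == 1/2 */
#define DECAY_Y_FP 64132u	/* assumed: y ~= 0.97857, scaled by 2^16 */

/*
 * Decay `load` as if `periods` update periods passed with nothing
 * runnable.  Whole half-lives are exact shifts; the remainder is
 * applied one period at a time.  Assumes load < 2^48 so the
 * fixed-point multiply cannot overflow.
 */
static uint64_t decay_blocked_load(uint64_t load, unsigned int periods)
{
	unsigned int halvings = periods / HALF_LIFE_PERIODS;
	unsigned int rem = periods % HALF_LIFE_PERIODS;

	if (halvings >= 64)
		return 0;
	load >>= halvings;
	while (rem--)
		load = (load * DECAY_Y_FP) >> 16;
	return load;
}

/* A group whose entities are all asleep: only blocked load remains. */
struct toy_group {
	uint64_t blocked_load;
	unsigned int last_update;	/* period index of the last decay */
};

/* What the periodic pass does for such a group: age the blocked load. */
static void toy_update_blocked(struct toy_group *g, unsigned int now)
{
	assert(now >= g->last_update);
	g->blocked_load = decay_blocked_load(g->blocked_load,
					     now - g->last_update);
	g->last_update = now;
}
```

Without a periodic call like toy_update_blocked(), a group that went to
sleep with a large blocked load would keep reporting that stale value
until one of its entities woke up again.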

>
> Or is this the primary place where we decay? If so that wasn't obvious
> and thus wants a comment someplace.
>

For a group with no runnable entities, yes.  For a group with runnable
entities this will also occur naturally around the updates for the
aforementioned runnables.

I'll add a comment calling out the blocked case.

>> Signed-off-by: Paul Turner <pjt@google.com>
>> ---
>
>> +static void update_blocked_averages(int cpu)
>>  {
>>       struct rq *rq = cpu_rq(cpu);
>> +     struct cfs_rq *cfs_rq;
>> +
>> +     unsigned long flags;
>> +     int num_updates = 0;
>>
>>       rcu_read_lock();
>> +     raw_spin_lock_irqsave(&rq->lock, flags);
>> +     update_rq_clock(rq);
>>       /*
>>        * Iterates the task_group tree in a bottom up fashion, see
>>        * list_add_leaf_cfs_rq() for details.
>>        */
>>       for_each_leaf_cfs_rq(rq, cfs_rq) {
>> +             __update_blocked_averages_cpu(cfs_rq->tg, rq->cpu);
>>
>> +             /*
>> +              * Periodically release the lock so that a cfs_rq with many
>> +              * children cannot hold it for an arbitrary period of time.
>> +              */
>> +             if (num_updates++ % 20 == 0) {
>> +                     raw_spin_unlock_irqrestore(&rq->lock, flags);
>> +                     cpu_relax();
>> +                     raw_spin_lock_irqsave(&rq->lock, flags);
>
> Gack.. that's not real pretty is it.. Esp. since we're still holding RCU
> lock and are thus (mostly) still not preemptable.
>
> How much of a problem was this?, the changelog is silent on this.

So the holding of RCU around these operations is nothing new (and
indeed they should be much faster than before).

As above, the bound is only for the crazy-large-numbers-of-cgroups
case, where we don't want to sit with interrupts disabled forever.
I suspect it wants to be larger, but I picked a fairly conservative
number to start with since I also think it's not a big performance
factor either way.

>
>> +                     update_rq_clock(rq);
>> +             }
>>       }
>> +
>> +     raw_spin_unlock_irqrestore(&rq->lock, flags);
>>       rcu_read_unlock();
>>  }
>>
>
>
>
