All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
	Joseph Salisbury <joseph.salisbury@canonical.com>,
	Ingo Molnar <mingo@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Mike Galbraith <efault@gmx.de>,
	omer.akram@canonical.com
Subject: Re: [v4.8-rc1 Regression] sched/fair: Apply more PELT fixes
Date: Mon, 17 Oct 2016 15:19:52 +0200	[thread overview]
Message-ID: <20161017131952.GR3117@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <4e15ad55-beeb-e860-0420-8f439d076758@arm.com>

On Mon, Oct 17, 2016 at 12:49:55PM +0100, Dietmar Eggemann wrote:

> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 8b03fb5..8926685 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -2902,7 +2902,8 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> >   */
> >  static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force)
> >  {
> > -	long delta = cfs_rq->avg.load_avg - cfs_rq->tg_load_avg_contrib;
> > +	unsigned long load_avg = READ_ONCE(cfs_rq->avg.load_avg);
> > +	long delta = load_avg - cfs_rq->tg_load_avg_contrib;
> >  
> >  	/*
> >  	 * No need to update load_avg for root_task_group as it is not used.
> > @@ -2912,7 +2913,7 @@ static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force)
> >  
> >  	if (force || abs(delta) > cfs_rq->tg_load_avg_contrib / 64) {
> >  		atomic_long_add(delta, &cfs_rq->tg->load_avg);
> > -		cfs_rq->tg_load_avg_contrib = cfs_rq->avg.load_avg;
> > +		cfs_rq->tg_load_avg_contrib = load_avg;
> >  	}
> >  }
> 
> Tested it on an Ubuntu 16.10 Server (on top of the default 4.8.0-22-generic
> kernel) on a Lenovo T430 and it didn't help.

Right, I don't think that race exists, we update cfs_rq->avg.load_avg
with rq->lock held and have it held here, so it cannot change under us.

This might do with a few lockdep_assert_held() instances to clarify this
though.

> What seems to cure it is to get rid of this snippet (part of the commit
> mentioned earlier in this thread: 3d30544f0212):
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 039de34f1521..16c692049fbf 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -726,7 +726,6 @@ void post_init_entity_util_avg(struct sched_entity *se)
>         struct sched_avg *sa = &se->avg;
>         long cap = (long)(SCHED_CAPACITY_SCALE - cfs_rq->avg.util_avg) / 2;
>         u64 now = cfs_rq_clock_task(cfs_rq);
> -       int tg_update;
>  
>         if (cap > 0) {
>                 if (cfs_rq->avg.util_avg != 0) {
> @@ -759,10 +758,8 @@ void post_init_entity_util_avg(struct sched_entity *se)
>                 }
>         }
>  
> -       tg_update = update_cfs_rq_load_avg(now, cfs_rq, false);
> +       update_cfs_rq_load_avg(now, cfs_rq, false);
>         attach_entity_load_avg(cfs_rq, se);
> -       if (tg_update)
> -               update_tg_load_avg(cfs_rq, false);
>  }
>  
>  #else /* !CONFIG_SMP */
> 
> BTW, I guess we can reach .tg_load_avg up to ~300000-400000 on such a system
> initially because systemd will create all ~100 services (and therefore the
> corresponding 2. level tg's) at once. In my previous example, there was 500ms
> between the creation of 2 tg's so there was a lot of decaying going on in between.

Cute... on current kernels that translates to simply removing the call
to update_tg_load_avg(), lets see if we can figure out what goes
sideways first though, because it _should_ decay back out. And if that
can fail here, I'm not seeing why that wouldn't fail elsewhere either.

I'll see if I can reproduce this with a script creating heaps of cgroups
in a hurry, I have a total lack of system-disease on all my machines.


/me goes prod..

  reply	other threads:[~2016-10-17 13:20 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-07 19:38 [v4.8-rc1 Regression] sched/fair: Apply more PELT fixes Joseph Salisbury
2016-10-07 19:57 ` Linus Torvalds
2016-10-07 20:22   ` Joseph Salisbury
2016-10-07 20:37     ` Linus Torvalds
2016-10-08  8:00 ` Peter Zijlstra
2016-10-08  8:39   ` Ingo Molnar
2016-10-08 11:37     ` Vincent Guittot
2016-10-08 11:49       ` Mike Galbraith
2016-10-12 12:20         ` Vincent Guittot
2016-10-12 15:35           ` Joseph Salisbury
2016-10-12 16:21           ` Joseph Salisbury
2016-10-13 10:58             ` Vincent Guittot
2016-10-13 15:52               ` Joseph Salisbury
2016-10-13 16:48                 ` Vincent Guittot
2016-10-13 18:49                   ` Dietmar Eggemann
2016-10-13 21:34                     ` Vincent Guittot
2016-10-14  8:24                       ` Vincent Guittot
2016-10-14 13:10                         ` Dietmar Eggemann
2016-10-14 15:18                           ` Vincent Guittot
2016-10-14 16:04                             ` Joseph Salisbury
2016-10-17  9:09                               ` Vincent Guittot
2016-10-17 11:49                                 ` Dietmar Eggemann
2016-10-17 13:19                                   ` Peter Zijlstra [this message]
2016-10-17 13:54                                     ` Vincent Guittot
2016-10-17 22:52                                       ` Dietmar Eggemann
2016-10-18  8:43                                         ` Vincent Guittot
2016-10-18  9:07                                         ` Peter Zijlstra
2016-10-18  9:45                                           ` Vincent Guittot
2016-10-18 10:34                                             ` Peter Zijlstra
2016-10-18 11:56                                               ` Vincent Guittot
2016-10-18 21:58                                                 ` Joonwoo Park
2016-10-19  6:42                                                   ` Vincent Guittot
2016-10-19  9:46                                                 ` Dietmar Eggemann
2016-10-19 11:25                                                   ` Vincent Guittot
2016-10-19 15:33                                                     ` Dietmar Eggemann
2016-10-19 17:33                                                       ` Joonwoo Park
2016-10-19 17:50                                                       ` Vincent Guittot
2016-10-19 11:33                                                 ` Peter Zijlstra
2016-10-19 11:50                                                   ` Vincent Guittot
2016-10-19 13:30                                                 ` Morten Rasmussen
2016-10-19 17:41                                                   ` Vincent Guittot
2016-10-20  7:56                                                     ` Morten Rasmussen
2016-10-19 14:49                                                 ` Joseph Salisbury
2016-10-19 14:53                                                   ` Vincent Guittot
2016-10-18 11:15                                           ` Dietmar Eggemann
2016-10-18 12:07                                             ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161017131952.GR3117@twins.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=dietmar.eggemann@arm.com \
    --cc=efault@gmx.de \
    --cc=joseph.salisbury@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=omer.akram@canonical.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.