From: Tejun Heo <tj@kernel.org>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mike Galbraith <efault@gmx.de>, Paul Turner <pjt@google.com>,
	Chris Mason <clm@fb.com>,
	kernel-team@fb.com
Subject: Re: [PATCH 2/2] sched/fair: Always propagate runnable_load_avg
Date: Wed, 26 Apr 2017 17:30:20 -0700
Message-ID: <20170427003020.GD11348@wtj.duckdns.org>
In-Reply-To: <CAKfTPtC92nVXCH3QX-Qqf5R5gD58pk2=S_OpwiTao5y16g84Xw@mail.gmail.com>

Hello, Vincent.

On Wed, Apr 26, 2017 at 12:21:52PM +0200, Vincent Guittot wrote:
> > This is from the follow-up patch.  I was confused.  Because we don't
> > propagate decays, we still should decay the runnable_load_avg;
> > otherwise, we end up accumulating errors in the counter.  I'll drop
> > the last patch.
> 
> Ok, the runnable_load_avg goes back to 0 when I drop patch 3. But i
> see  runnable_load_avg sometimes significantly higher than load_avg
> which is normally not possible as load_avg = runnable_load_avg +
> sleeping task's load_avg

So, while load_avg would eventually converge on runnable_load_avg +
blocked load_avg given a stable enough workload for long enough,
runnable_load_avg temporarily jumping above load_avg is expected,
AFAICS.  That's the whole point of it: a sum closely tracking what's
currently on the cpu so that we can pick the cpu which has the most on
it now.  It doesn't make sense to try to pick threads off of a cpu
which is generally loaded but doesn't have much going on right now,
after all.
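
As a rough illustration of why a closely tracking sum can sit above a
slowly converging average for a while, here is a toy model.  It is not
the kernel's PELT code: the instantaneous `runnable` value merely
stands in for a sum that tracks what's on the cpu right now, and
`averaged` stands in for a signal that converges geometrically.  The
32-period half-life mirrors PELT's decay constant; the function name,
the weights and the 64-period phases are made up for the example.

```python
# Toy model only; NOT the kernel's load tracking implementation.
HALF_LIFE_PERIODS = 32                     # a contribution halves every 32 periods
DECAY = 0.5 ** (1.0 / HALF_LIFE_PERIODS)   # per-period decay factor


def simulate(runnable_weights):
    """Return a list of (runnable, averaged) pairs, one per period.

    runnable_weights[i] is the hypothetical total weight queued on the
    cpu during period i.  `averaged` is a geometric moving average of
    that series, so it lags behind sudden changes.
    """
    averaged = 0.0
    trace = []
    for runnable in runnable_weights:
        averaged = averaged * DECAY + runnable * (1.0 - DECAY)
        trace.append((runnable, averaged))
    return trace


# Idle for 64 periods, then a burst of weight 1024 for 64 periods.
trace = simulate([0] * 64 + [1024] * 64)
burst_start = trace[64]   # first period of the burst
burst_end = trace[-1]     # after two half-lives of the burst

if __name__ == "__main__":
    print("burst start: runnable=%d averaged=%.1f" % burst_start)
    print("burst end:   runnable=%d averaged=%.1f" % burst_end)
```

In this toy, the tracking value is above the averaged one for the
whole burst and the gap only narrows as the average catches up, which
is the kind of temporary overshoot described above.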

> Then, I just have the opposite behavior on my platform. I see a
> increase of latency at p99 with your patches.
> My platform is a hikey : 2x4 cores ARM and I have used schbench -m 2
> -t 4 -s 10000 -c 15000 -r 30 so I have 1 worker thread per CPU which
> is similar to what you are doing on your platform
>
> With v4.11-rc8. I have run 10 times the test and get consistent results
...
> *99.0000th: 539
...
> With your patches i see an increase of the latency for p99. I run 10
> *99.0000th: 2034

I see.  This is surprising given that the purpose of the patch, at
least, is to restore cgroup behavior to match the !cgroup one.  I
could have totally messed it up tho.  Hmm... there are several ways
forward I guess.

* Can you please double check that the higher latencies w/ the patch
  are reliably reproducible?  The test machines that I use have
  variable management load.  It never dominates the machine but is
  enough to disturb the results so that drawing out a reliable
  pattern takes a lot of repeated runs.  I'd really appreciate it if
  you could double check that the pattern is reliable with different
  run patterns (ie. interleaved runs instead of 10 consecutive runs
  one after another).

* Is the board something easily obtainable?  It'd be the easiest for
  me to set up the same environment and reproduce the problem.  I
  looked up hikey boards on amazon but couldn't easily find 2x4 core
  ones.  If there's something I can easily buy, please point me to it.
  If there's something I can borrow, that'd be great too.

* If not, I'll try to clean up the debug patches I have and send them
  your way to get more visibility but given these things tend to be
  very iterative, it might take quite a few rounds of back and forth.
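
To make the first point concrete, this is roughly what I mean by
interleaved vs. consecutive run patterns.  It's only a sketch of the
run order; the configuration labels and the helper name are
placeholders, and actually switching kernels between runs is left
out.

```python
def run_order(configs, repeats, interleave=True):
    """Return the order in which to run each configuration.

    configs:    labels for the setups being compared (placeholders here)
    repeats:    how many times each configuration is run
    interleave: alternate configurations so that slow drift in
                background load hits all of them about equally,
                instead of running each one back to back
    """
    if interleave:
        return [c for _ in range(repeats) for c in configs]
    return [c for c in configs for _ in range(repeats)]


if __name__ == "__main__":
    print(run_order(["v4.11-rc8", "patched"], 3))
    print(run_order(["v4.11-rc8", "patched"], 3, interleave=False))
```

If the p99 gap shows up in both orderings, it's much less likely to
be an artifact of whatever else was happening on the machine during
one batch of runs.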

Thanks!

-- 
tejun
