From: Chris Mason <clm@fb.com>
To: Mike Galbraith <mgalbraith@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Matt Fleming <matt@codeblueprint.co.uk>,
	<linux-kernel@vger.kernel.org>
Subject: Re: sched: tweak select_idle_sibling to look for idle threads
Date: Tue, 12 Apr 2016 09:27:58 -0400
Message-ID: <20160412132758.7apgqqwl2c2wksy6@floor.thefacebook.com>
In-Reply-To: <1460436248.3839.80.camel@suse.de>

On Tue, Apr 12, 2016 at 06:44:08AM +0200, Mike Galbraith wrote:
> On Mon, 2016-04-11 at 20:30 -0400, Chris Mason wrote:
> > On Mon, Apr 11, 2016 at 06:54:21AM +0200, Mike Galbraith wrote:
> 
> > > > Ok, I was able to reproduce this by stuffing tbench_srv and tbench onto
> > > > just socket 0.  Version 2 below fixes things for me, but I'm hoping
> > > > someone can suggest a way to get task_hot() buddy checks without the rq
> > > > lock.
> > > > 
> > > > I haven't run this on production loads yet, but our 4.0 patch for this
> > > > uses task_hot(), so I'd expect it to be on par.  If this doesn't fix it
> > > > for you, I'll dig up a similar machine on Monday.
> > > 
> > > My box stopped caring.  I personally would be reluctant to apply it
> > > without a "you asked for it" button or a large pile of benchmark
> > > results.  Lock banging or not, the mere existence of a full scan
> > > makes me nervous.
> > 
> > 
> > We can use a bitmap at the socket level to keep track of which cpus are
> > idle.  I'm sure there are better places for the array and better ways to
> > allocate; this is just a rough cut to make sure the idle tracking works.
> 
> See e0a79f529d5b:
> 
>       pre   15.22 MB/sec 1 procs
>       post 252.01 MB/sec 1 procs
> 
> You can make traverse cycles go away, but those cycles, while precious,
> are not the most costly cycles.  The above was 1 tbench pair in an
> otherwise idle box, i.e. it wasn't traverse cycles that demolished it.

Agreed, this is why the decision not to scan is so important.  But while
I've been describing this patch in terms of latency, latency is really
the symptom rather than the goal.  Without these patches, workloads that
do want to fully utilize the hardware are effectively losing a core of
utilization.  It's true that we define 'fully utilize' with an upper
bound on application response time, but we're not talking about
high-frequency trading here.
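
As a point of reference for the idle-tracking idea quoted above, here is
a minimal user-space sketch of a per-socket idle bitmap.  The names and
sizes (socket_idle, MAX_SOCKETS, CPUS_PER_SOCKET) are made up for
illustration; this is not the code from the actual patch:

/*
 * Illustration only: one bitmap word per socket.  CPUs set/clear their
 * own bit as they go idle/busy, and the wakeup path can check for any
 * idle bit without walking every runqueue.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_SOCKETS      4	/* illustrative sizes */
#define CPUS_PER_SOCKET 64

static _Atomic unsigned long long socket_idle[MAX_SOCKETS];

static void mark_cpu_idle(int socket, int cpu, bool idle)
{
	unsigned long long bit = 1ULL << (cpu % CPUS_PER_SOCKET);

	if (idle)
		atomic_fetch_or(&socket_idle[socket], bit);
	else
		atomic_fetch_and(&socket_idle[socket], ~bit);
}

/* Return an idle cpu on @socket, or -1 if there is none (don't scan). */
static int find_idle_cpu(int socket)
{
	unsigned long long mask = atomic_load(&socket_idle[socket]);

	return mask ? __builtin_ctzll(mask) : -1;
}

int main(void)
{
	mark_cpu_idle(0, 3, true);
	printf("idle cpu on socket 0: %d\n", find_idle_cpu(0));
	mark_cpu_idle(0, 3, false);
	printf("idle cpu on socket 0: %d\n", find_idle_cpu(0));
	return 0;
}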

The lost capacity shows up clearly in our graphs: CPU idle time is
higher (the lost core), CPU user time is lower, and average system load
is higher (processes waiting on fewer cores).

We measure this internally with scheduling latency because that's the
easiest way to talk about it across a wide variety of hardware.
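
(For anyone who wants to sample something similar: when schedstats are
enabled the kernel already exports per-task on-cpu time, runqueue wait
time and timeslice counts through /proc/<pid>/schedstat.  A rough,
generic sketch of reading it, not our internal tooling:)

/* Illustration only: read this task's cumulative runqueue wait time.
 * /proc/<pid>/schedstat: on-cpu time (ns), runqueue wait time (ns),
 * number of timeslices.  Requires schedstats support in the kernel.
 */
#include <stdio.h>

int main(void)
{
	unsigned long long exec_ns, wait_ns, slices;
	FILE *f = fopen("/proc/self/schedstat", "r");

	if (!f) {
		perror("/proc/self/schedstat");
		return 1;
	}
	if (fscanf(f, "%llu %llu %llu", &exec_ns, &wait_ns, &slices) != 3) {
		fprintf(stderr, "unexpected schedstat format\n");
		fclose(f);
		return 1;
	}
	fclose(f);
	printf("on-cpu %llu ns, runqueue wait %llu ns, %llu timeslices\n",
	       exec_ns, wait_ns, slices);
	return 0;
}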

> 
> 	-Mike
> 
> (p.s. SCHED_IDLE is a dinky-bandwidth fair class)

Ugh, not my best quick patch, but you get the idea I was going for.  I
can always add a tunable to flip this on/off, but I'd prefer that we
find a good set of defaults, mostly so the FB production runtime is the
common config instead of the special snowflake.
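
(If we do end up wanting that switch, the usual shape would be a
sched_features bit flipped through debugfs rather than a new sysctl.
Here's a sketch of flipping such a bit from user space; the feature name
IDLE_SIBLING_SCAN below is made up purely for illustration:)

/* Illustration only: toggle a scheduler feature bit via debugfs.
 * Writing "NAME" enables a feature, "NO_NAME" disables it.  The name
 * "IDLE_SIBLING_SCAN" is hypothetical; it does not exist in the kernel.
 */
#include <stdio.h>

static int set_sched_feature(const char *name, int enable)
{
	FILE *f = fopen("/sys/kernel/debug/sched_features", "w");
	int ret = 0;

	if (!f)
		return -1;
	if (fprintf(f, "%s%s", enable ? "" : "NO_", name) < 0)
		ret = -1;
	if (fclose(f) != 0)
		ret = -1;
	return ret;
}

int main(void)
{
	if (set_sched_feature("IDLE_SIBLING_SCAN", 0))
		perror("sched_features (needs root and debugfs mounted)");
	return 0;
}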

-chris

Thread overview: 80+ messages
2016-04-05 18:08 [PATCH RFC] select_idle_sibling experiments Chris Mason
2016-04-05 18:43 ` Bastien Bastien Philbert
2016-04-05 19:28   ` Chris Mason
2016-04-05 20:03 ` Matt Fleming
2016-04-05 21:05   ` Bastien Philbert
2016-04-06  0:44   ` Chris Mason
2016-04-06  7:27 ` Mike Galbraith
2016-04-06 13:36   ` Chris Mason
2016-04-09 17:30   ` Chris Mason
2016-04-12 21:45     ` Matt Fleming
2016-04-13  3:40       ` Mike Galbraith
2016-04-13 15:54         ` Chris Mason
2016-04-28 12:00   ` Peter Zijlstra
2016-04-28 13:17     ` Mike Galbraith
2016-05-02  5:35     ` Mike Galbraith
2016-04-07 15:17 ` Chris Mason
2016-04-09 19:05 ` sched: tweak select_idle_sibling to look for idle threads Chris Mason
2016-04-10 10:04   ` Mike Galbraith
2016-04-10 12:35     ` Chris Mason
2016-04-10 12:46       ` Mike Galbraith
2016-04-10 19:55     ` Chris Mason
2016-04-11  4:54       ` Mike Galbraith
2016-04-12  0:30         ` Chris Mason
2016-04-12  4:44           ` Mike Galbraith
2016-04-12 13:27             ` Chris Mason [this message]
2016-04-12 18:16               ` Mike Galbraith
2016-04-12 20:07                 ` Chris Mason
2016-04-13  3:18                   ` Mike Galbraith
2016-04-13 13:44                     ` Chris Mason
2016-04-13 14:22                       ` Mike Galbraith
2016-04-13 14:36                         ` Chris Mason
2016-04-13 15:05                           ` Mike Galbraith
2016-04-13 15:34                             ` Mike Galbraith
2016-04-30 12:47   ` Peter Zijlstra
2016-05-01  7:12     ` Mike Galbraith
2016-05-01  8:53       ` Peter Zijlstra
2016-05-01  9:20         ` Mike Galbraith
2016-05-07  1:24           ` Yuyang Du
2016-05-08  8:08             ` Mike Galbraith
2016-05-08 18:57               ` Yuyang Du
2016-05-09  3:45                 ` Mike Galbraith
2016-05-08 20:22                   ` Yuyang Du
2016-05-09  7:44                     ` Mike Galbraith
2016-05-09  1:13                       ` Yuyang Du
2016-05-09  9:39                         ` Mike Galbraith
2016-05-09 23:26                           ` Yuyang Du
2016-05-10  7:49                             ` Mike Galbraith
2016-05-10 15:26                               ` Mike Galbraith
2016-05-10 19:16                                 ` Yuyang Du
2016-05-11  4:17                                   ` Mike Galbraith
2016-05-11  1:23                                     ` Yuyang Du
2016-05-11  9:56                                       ` Mike Galbraith
2016-05-18  6:41                                   ` Mike Galbraith
2016-05-09  3:52                 ` Mike Galbraith
2016-05-08 20:31                   ` Yuyang Du
2016-05-02  8:46       ` Peter Zijlstra
2016-05-02 14:50         ` Mike Galbraith
2016-05-02 14:58           ` Peter Zijlstra
2016-05-02 15:47             ` Chris Mason
2016-05-03 14:32               ` Peter Zijlstra
2016-05-03 15:11                 ` Chris Mason
2016-05-04 10:37                   ` Peter Zijlstra
2016-05-04 15:31                     ` Peter Zijlstra
2016-05-05 22:03                     ` Matt Fleming
2016-05-06 18:54                       ` Mike Galbraith
2016-05-09  8:33                         ` Peter Zijlstra
2016-05-09  8:56                           ` Mike Galbraith
2016-05-04 15:45                   ` Peter Zijlstra
2016-05-04 17:46                     ` Chris Mason
2016-05-05  9:33                       ` Peter Zijlstra
2016-05-05 13:58                         ` Chris Mason
2016-05-06  7:12                           ` Peter Zijlstra
2016-05-06 17:27                             ` Chris Mason
2016-05-06  7:25                   ` Peter Zijlstra
2016-05-02 17:30             ` Mike Galbraith
2016-05-02 15:01           ` Peter Zijlstra
2016-05-02 16:04             ` Ingo Molnar
2016-05-03 11:31               ` Peter Zijlstra
2016-05-03 18:22                 ` Peter Zijlstra
2016-05-02 15:10           ` Peter Zijlstra
