linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Julien Desfossez <jdesfossez@digitalocean.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>,
	Aaron Lu <aaron.lu@linux.alibaba.com>,
	mingo@kernel.org, tglx@linutronix.de, pjt@google.com,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	subhra.mazumdar@oracle.com, fweisbec@gmail.com,
	keescook@chromium.org, kerrnel@google.com,
	Aubrey Li <aubrey.intel@gmail.com>
Subject: Re: [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling.
Date: Mon, 15 Apr 2019 12:59:37 -0400	[thread overview]
Message-ID: <20190415165937.GA26890@sinkpad> (raw)
In-Reply-To: <20190410080630.GY11158@hirez.programming.kicks-ass.net>

On 10-Apr-2019 10:06:30 AM, Peter Zijlstra wrote:
> while you're all having fun playing with this, I've not yet had answers
> to the important questions of how L1TF complete we want to be and if all
> this crud actually matters one way or the other.
> 
> Also, I still don't see this stuff working for high context switch rate
> workloads, and that is exactly what some people were aiming for..

We have been running scaling tests on highly loaded systems (with all
the fixes and suggestions applied) and here are the results.

On a system with 2x6 cores (12 hardware threads per NUMA node), with one
12-vcpus-32gb VM per NUMA node running a CPU-intensive workload
(linpack):
- Baseline: 864 gflops
- Core scheduling: 864 gflops
- nosmt (switch to 6 hardware threads per node): 298 gflops (-65%)

In this test, the VMs are basically alone on their own NUMA node, so
they are only competing with themselves, so for the next test we moved
the 2 VMs to the same node:
- Baseline: 340 gflops, about 586k context switches/sec
- Core scheduling: 322 gflops (-5%), about 575k context switches/sec
- nosmt: 146 gflops (-57%), about 284k context switches/sec

In terms of isolation, CPU-intensive VMs share their core with a
"foreign process" (not tagged or tagged with a different tag) less than
2% of the time (sum of the time spent with a lot of different
processes). For reference, this could add up to 60% without core
scheduling and smt on. We are working on identifying the various cases
where there is unwanted co-scheduling so we can address those.

With a more heterogeneous benchmark (MySQL benchmark with a remote
client, 1 12-vcpus MySQL VM on each NUMA node), we don’t measure any
performance degradation when there is more hardware threads available
than vcpus (same with nosmt), but when we add noise VMs (sleep(15);
collect metrics; send them over a VPN; repeat) with an overcommit ratio
of 3 vcpus to 1 hardware thread, core scheduling can have up to 25%
performance degradation, whereas nosmt has 15% impact.

So the performance impact varies depending on the type of workload, but
since the CPU-intensive workloads are the ones most impacted when we
disable SMT, this is very encouraging and is a worthwhile effort.

Thanks,

Julien

  parent reply	other threads:[~2019-04-15 16:59 UTC|newest]

Thread overview: 99+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-18 16:56 [RFC][PATCH 00/16] sched: Core scheduling Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 01/16] stop_machine: Fix stop_cpus_in_progress ordering Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 02/16] sched: Fix kerneldoc comment for ia64_set_curr_task Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 03/16] sched: Wrap rq::lock access Peter Zijlstra
2019-02-19 16:13   ` Phil Auld
2019-02-19 16:22     ` Peter Zijlstra
2019-02-19 16:37       ` Phil Auld
2019-03-18 15:41   ` Julien Desfossez
2019-03-20  2:29     ` Subhra Mazumdar
2019-03-21 21:20       ` Julien Desfossez
2019-03-22 13:34         ` Peter Zijlstra
2019-03-22 20:59           ` Julien Desfossez
2019-03-23  0:06         ` Subhra Mazumdar
2019-03-27  1:02           ` Subhra Mazumdar
2019-03-29 13:35           ` Julien Desfossez
2019-03-29 22:23             ` Subhra Mazumdar
2019-04-01 21:35               ` Subhra Mazumdar
2019-04-03 20:16                 ` Julien Desfossez
2019-04-05  1:30                   ` Subhra Mazumdar
2019-04-02  7:42               ` Peter Zijlstra
2019-03-22 23:28       ` Tim Chen
2019-03-22 23:44         ` Tim Chen
2019-02-18 16:56 ` [RFC][PATCH 04/16] sched/{rt,deadline}: Fix set_next_task vs pick_next_task Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 05/16] sched: Add task_struct pointer to sched_class::set_curr_task Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 06/16] sched/fair: Export newidle_balance() Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 07/16] sched: Allow put_prev_task() to drop rq->lock Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 08/16] sched: Rework pick_next_task() slow-path Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 09/16] sched: Introduce sched_class::pick_task() Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 10/16] sched: Core-wide rq->lock Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 11/16] sched: Basic tracking of matching tasks Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 12/16] sched: A quick and dirty cgroup tagging interface Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 13/16] sched: Add core wide task selection and scheduling Peter Zijlstra
     [not found]   ` <20190402064612.GA46500@aaronlu>
2019-04-02  8:28     ` Peter Zijlstra
2019-04-02 13:20       ` Aaron Lu
2019-04-05 14:55       ` Aaron Lu
2019-04-09 18:09         ` Tim Chen
2019-04-10  4:36           ` Aaron Lu
2019-04-10 14:18             ` Aubrey Li
2019-04-11  2:11               ` Aaron Lu
2019-04-10 14:44             ` Peter Zijlstra
2019-04-11  3:05               ` Aaron Lu
2019-04-11  9:19                 ` Peter Zijlstra
2019-04-10  8:06           ` Peter Zijlstra
2019-04-10 19:58             ` Vineeth Remanan Pillai
2019-04-15 16:59             ` Julien Desfossez [this message]
2019-04-16 13:43       ` Aaron Lu
2019-04-09 18:38   ` Julien Desfossez
2019-04-10 15:01     ` Peter Zijlstra
2019-04-11  0:11     ` Subhra Mazumdar
2019-04-19  8:40       ` Ingo Molnar
2019-04-19 23:16         ` Subhra Mazumdar
2019-02-18 16:56 ` [RFC][PATCH 14/16] sched/fair: Add a few assertions Peter Zijlstra
2019-02-18 16:56 ` [RFC][PATCH 15/16] sched: Trivial forced-newidle balancer Peter Zijlstra
2019-02-21 16:19   ` Valentin Schneider
2019-02-21 16:41     ` Peter Zijlstra
2019-02-21 16:47       ` Peter Zijlstra
2019-02-21 18:28         ` Valentin Schneider
2019-04-04  8:31       ` Aubrey Li
2019-04-06  1:36         ` Aubrey Li
2019-02-18 16:56 ` [RFC][PATCH 16/16] sched: Debug bits Peter Zijlstra
2019-02-18 17:49 ` [RFC][PATCH 00/16] sched: Core scheduling Linus Torvalds
2019-02-18 20:40   ` Peter Zijlstra
2019-02-19  0:29     ` Linus Torvalds
2019-02-19 15:15       ` Ingo Molnar
2019-02-22 12:17     ` Paolo Bonzini
2019-02-22 14:20       ` Peter Zijlstra
2019-02-22 19:26         ` Tim Chen
2019-02-26  8:26           ` Aubrey Li
2019-02-27  7:54             ` Aubrey Li
2019-02-21  2:53   ` Subhra Mazumdar
2019-02-21 14:03     ` Peter Zijlstra
2019-02-21 18:44       ` Subhra Mazumdar
2019-02-22  0:34       ` Subhra Mazumdar
2019-02-22 12:45   ` Mel Gorman
2019-02-22 16:10     ` Mel Gorman
2019-03-08 19:44     ` Subhra Mazumdar
2019-03-11  4:23       ` Aubrey Li
2019-03-11 18:34         ` Subhra Mazumdar
2019-03-11 23:33           ` Subhra Mazumdar
2019-03-12  0:20             ` Greg Kerr
2019-03-12  0:47               ` Subhra Mazumdar
2019-03-12  7:33               ` Aaron Lu
2019-03-12  7:45             ` Aubrey Li
2019-03-13  5:55               ` Aubrey Li
2019-03-14  0:35                 ` Tim Chen
2019-03-14  5:30                   ` Aubrey Li
2019-03-14  6:07                     ` Li, Aubrey
2019-03-18  6:56             ` Aubrey Li
2019-03-12 19:07           ` Pawan Gupta
2019-03-26  7:32       ` Aaron Lu
2019-03-26  7:56         ` Aaron Lu
2019-02-19 22:07 ` Greg Kerr
2019-02-20  9:42   ` Peter Zijlstra
2019-02-20 18:33     ` Greg Kerr
2019-02-22 14:10       ` Peter Zijlstra
2019-03-07 22:06         ` Paolo Bonzini
2019-02-20 18:43     ` Subhra Mazumdar
2019-03-01  2:54 ` Subhra Mazumdar
2019-03-14 15:28 ` Julien Desfossez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190415165937.GA26890@sinkpad \
    --to=jdesfossez@digitalocean.com \
    --cc=aaron.lu@linux.alibaba.com \
    --cc=aubrey.intel@gmail.com \
    --cc=fweisbec@gmail.com \
    --cc=keescook@chromium.org \
    --cc=kerrnel@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=subhra.mazumdar@oracle.com \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).