linux-kernel.vger.kernel.org archive mirror
From: "Gruza, Agata" <agata.gruza@intel.com>
To: "vpillai@digitalocean.com" <vpillai@digitalocean.com>,
	"naravamudan@digitalocean.com" <naravamudan@digitalocean.com>,
	"jdesfossez@digitalocean.com" <jdesfossez@digitalocean.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	"mingo@kernel.org" <mingo@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"pjt@google.com" <pjt@google.com>,
	"torvalds@linux-foundation.org" <torvalds@linux-foundation.org>
Cc: "vpillai@digitalocean.com" <vpillai@digitalocean.com>,
	"fweisbec@gmail.com" <fweisbec@gmail.com>,
	"keescook@chromium.org" <keescook@chromium.org>,
	"kerrnel@google.com" <kerrnel@google.com>,
	"pauld@redhat.com" <pauld@redhat.com>,
	"aaron.lwe@gmail.com" <aaron.lwe@gmail.com>,
	"aubrey.intel@gmail.com" <aubrey.intel@gmail.com>,
	"Li, Aubrey" <aubrey.li@linux.intel.com>,
	"valentin.schneider@arm.com" <valentin.schneider@arm.com>,
	"mgorman@techsingularity.net" <mgorman@techsingularity.net>,
	"pawan.kumar.gupta@linux.intel.com" 
	<pawan.kumar.gupta@linux.intel.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"joelaf@google.com" <joelaf@google.com>,
	"joel@joelfernandes.org" <joel@joelfernandes.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: FW: [RFC PATCH 00/13] Core scheduling v5
Date: Thu, 14 May 2020 20:51:24 +0000
Message-ID: <MW3PR11MB45882B1E60D5BCDD6C1450C5E8BC0@MW3PR11MB4588.namprd11.prod.outlook.com>
In-Reply-To: <d08e6a2f-842f-2145-321d-be4971111065@linux.intel.com>

[-- Attachment #1: Type: text/plain, Size: 11835 bytes --]



-----Original Message-----
From: linux-kernel-owner@vger.kernel.org <linux-kernel-owner@vger.kernel.org> On Behalf Of Ning, Hongyu
Sent: Friday, May 8, 2020 8:40 PM
To: vpillai@digitalocean.com; naravamudan@digitalocean.com; jdesfossez@digitalocean.com; peterz@infradead.org; Tim Chen <tim.c.chen@linux.intel.com>; mingo@kernel.org; tglx@linutronix.de; pjt@google.com; torvalds@linux-foundation.org
Cc: vpillai@digitalocean.com; fweisbec@gmail.com; keescook@chromium.org; kerrnel@google.com; pauld@redhat.com; aaron.lwe@gmail.com; aubrey.intel@gmail.com; Li, Aubrey <aubrey.li@linux.intel.com>; valentin.schneider@arm.com; mgorman@techsingularity.net; pawan.kumar.gupta@linux.intel.com; pbonzini@redhat.com; joelaf@google.com; joel@joelfernandes.org; linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 00/13] Core scheduling v5


- Test environment:
Intel Xeon Server platform
CPU(s):              192
On-line CPU(s) list: 0-191
Thread(s) per core:  2
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        4

- Kernel under test: 
Core scheduling v5 base
https://github.com/digitalocean/linux-coresched/tree/coresched/v5-v5.5.y

- Test set based on sysbench 1.1.0-bd4b418:
A: sysbench cpu in cgroup cpu 1 + sysbench mysql in cgroup mysql 1 (192 workload tasks for each cgroup)
B: sysbench cpu in cgroup cpu 1 + sysbench cpu in cgroup cpu 2 + sysbench mysql in cgroup mysql 1 + sysbench mysql in cgroup mysql 2 (192 workload tasks for each cgroup)
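
As a quick sanity check on the load level (a sketch only, not part of the test scripts): with 192 tasks per cgroup on the 192 logical CPUs listed above, test set A runs at roughly 2x overcommit and test set B at 4x. The constant and function names below are illustrative.

```python
# Topology as reported in the test environment above.
SOCKETS = 2
CORES_PER_SOCKET = 48
THREADS_PER_CORE = 2
LOGICAL_CPUS = SOCKETS * CORES_PER_SOCKET * THREADS_PER_CORE  # 192 with SMT on

TASKS_PER_CGROUP = 192  # "192 workload tasks for each cgroup"


def overcommit(num_cgroups, cpus=LOGICAL_CPUS, tasks_per_cgroup=TASKS_PER_CGROUP):
    """Workload tasks per logical CPU (hypothetical helper for illustration)."""
    return num_cgroups * tasks_per_cgroup / cpus


# Test set A uses 2 cgroups, test set B uses 4.
for name, cgroups in (("A", 2), ("B", 4)):
    print(f"test set {name}: {overcommit(cgroups):.1f}x overcommit")
```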

- Test results briefing:
1 Good results:
1.1 For test set A, coresched could achieve same or better performance compared to smt_off, for both cpu workload and mysql workload
1.2 For test set B, cpu workload, coresched could achieve better performance compared to smt_off

2 Bad results:
2.1 For test set B, mysql workload, coresched performance is lower than smt_off, potential fairness issue between cpu workloads and mysql workloads
2.2 For test set B, cpu workload, potential fairness issue between 2 cgroups cpu workloads

- Test results:
Note: test results in following tables are Tput normalized to default baseline

-- Test set A Tput normalized results:
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+
|                    | ****   | default   | coresched   | smt_off   | ***   | default     | coresched     | smt_off     |
+====================+========+===========+=============+===========+=======+=============+===============+=============+
| cgroups            | ****   | cg cpu 1  | cg cpu 1    | cg cpu 1  | ***   | cg mysql 1  | cg mysql 1    | cg mysql 1  |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+
| sysbench workload  | ****   | cpu       | cpu         | cpu       | ***   | mysql       | mysql         | mysql       |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+
| 192 tasks / cgroup | ****   | 1         | 0.95        | 0.54      | ***   | 1           | 0.92          | 0.97        |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+

-- Test set B Tput normalized results:
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+------+-------------+---------------+-------------+-----+-------------+---------------+-------------+
|                    | ****   | default   | coresched   | smt_off   | ***   | default     | coresched     | smt_off     | **   | default     | coresched     | smt_off     | *   | default     | coresched     | smt_off     |
+====================+========+===========+=============+===========+=======+=============+===============+=============+======+=============+===============+=============+=====+=============+===============+=============+
| cgroups            | ****   | cg cpu 1  | cg cpu 1    | cg cpu 1  | ***   | cg cpu 2    | cg cpu 2      | cg cpu 2    | **   | cg mysql 1  | cg mysql 1    | cg mysql 1  | *   | cg mysql 2  | cg mysql 2    | cg mysql 2  |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+------+-------------+---------------+-------------+-----+-------------+---------------+-------------+
| sysbench workload  | ****   | cpu       | cpu         | cpu       | ***   | cpu         | cpu           | cpu         | **   | mysql       | mysql         | mysql       | *   | mysql       | mysql         | mysql       |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+------+-------------+---------------+-------------+-----+-------------+---------------+-------------+
| 192 tasks / cgroup | ****   | 1         | 0.9         | 0.47      | ***   | 1           | 1.32          | 0.66        | **   | 1           | 0.42          | 0.89        | *   | 1           | 0.42          | 0.89        |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+------+-------------+---------------+-------------+-----+-------------+---------------+-------------+
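
To make the fairness concern in 2.1 and 2.2 easier to see, the test set B numbers can be compared directly. The following is only an illustrative sketch built from the table values above; the helper names are made up here, and the imbalance ratio assumes the two cpu cgroups have comparable default baselines, which the table does not state.

```python
# Normalized throughput from the Test set B table (default baseline = 1.0).
test_set_b = {
    "cg cpu 1":   {"coresched": 0.90, "smt_off": 0.47},
    "cg cpu 2":   {"coresched": 1.32, "smt_off": 0.66},
    "cg mysql 1": {"coresched": 0.42, "smt_off": 0.89},
    "cg mysql 2": {"coresched": 0.42, "smt_off": 0.89},
}


def coresched_vs_smt_off(results):
    """Return coresched throughput relative to smt_off for each cgroup."""
    return {cg: r["coresched"] / r["smt_off"] for cg, r in results.items()}


def imbalance(results, cg_a, cg_b, config="coresched"):
    """Ratio of the larger to the smaller normalized throughput of two cgroups."""
    a, b = results[cg_a][config], results[cg_b][config]
    return max(a, b) / min(a, b)


ratios = coresched_vs_smt_off(test_set_b)
for cg, ratio in ratios.items():
    verdict = "better" if ratio > 1 else "worse"
    print(f"{cg}: coresched/smt_off = {ratio:.2f} ({verdict} than smt_off)")

cpu_gap = imbalance(test_set_b, "cg cpu 1", "cg cpu 2")
print(f"cpu cgroup imbalance under coresched: {cpu_gap:.2f}x")
```

A ratio above 1 means coresched beats smt_off for that cgroup. The cpu cgroups come out ahead while both mysql cgroups fall behind, which matches 1.2 and 2.1 above.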


> On Date: Wed,  4 Mar 2020 16:59:50 +0000, vpillai <vpillai@digitalocean.com> wrote:
> To: Nishanth Aravamudan <naravamudan@digitalocean.com>, Julien 
> Desfossez <jdesfossez@digitalocean.com>, Peter Zijlstra 
> <peterz@infradead.org>, Tim Chen <tim.c.chen@linux.intel.com>, 
> mingo@kernel.org, tglx@linutronix.de, pjt@google.com, 
> torvalds@linux-foundation.org
> CC: vpillai <vpillai@digitalocean.com>, linux-kernel@vger.kernel.org, 
> fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com, Phil 
> Auld <pauld@redhat.com>, Aaron Lu <aaron.lwe@gmail.com>, Aubrey Li 
> <aubrey.intel@gmail.com>, aubrey.li@linux.intel.com, Valentin 
> Schneider <valentin.schneider@arm.com>, Mel Gorman 
> <mgorman@techsingularity.net>, Pawan Gupta 
> <pawan.kumar.gupta@linux.intel.com>, Paolo Bonzini 
> <pbonzini@redhat.com>, Joel Fernandes <joelaf@google.com>, 
> joel@joelfernandes.org
> 
> 
> Fifth iteration of the Core-Scheduling feature.
> 
> Core scheduling is a feature that only allows trusted tasks to run 
> concurrently on cpus sharing compute resources(eg: hyperthreads on a 
> core). The goal is to mitigate the core-level side-channel attacks 
> without requiring to disable SMT (which has a significant impact on 
> performance in some situations). So far, the feature mitigates 
> user-space to user-space attacks but not user-space to kernel attack, 
> when one of the hardware thread enters the kernel (syscall, interrupt etc).
> 
> By default, the feature doesn't change any of the current scheduler 
> behavior. The user decides which tasks can run simultaneously on the 
> same core (for now by having them in the same tagged cgroup). When a 
> tag is enabled in a cgroup and a task from that cgroup is running on a 
> hardware thread, the scheduler ensures that only idle or trusted tasks 
> run on the other sibling(s). Besides security concerns, this feature 
> can also be beneficial for RT and performance applications where we 
> want to control how tasks make use of SMT dynamically.
> 
> This version was focusing on performance and stability. Couple of 
> crashes related to task tagging and cpu hotplug path were fixed.
> This version also improves the performance considerably by making task 
> migration and load balancing coresched aware.
> 
> In terms of performance, the major difference since the last iteration 
> is that now even IO-heavy and mixed-resources workloads are less 
> impacted by core-scheduling than by disabling SMT. Both host-level and 
> VM-level benchmarks were performed. Details in:
> https://lkml.org/lkml/2020/2/12/1194
> https://lkml.org/lkml/2019/11/1/269
> 
> v5 is rebased on top of 5.5.5(449718782a46) 
> https://github.com/digitalocean/linux-coresched/tree/coresched/v5-v5.5.y
> 
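
For readers who want to reproduce the cgroup setup, tagging a cgroup could look roughly like the sketch below. It assumes the cgroup tagging interface from this series exposes a per-cgroup file named cpu.tag that accepts 0 or 1; the file name, the helper names and the example path are assumptions and should be checked against patch 12/13 before use.

```python
from pathlib import Path


def tag_cgroup(cgroup_dir, enable=True, tag_file="cpu.tag"):
    """Write 1 (or 0) to the core-scheduling tag file of a cgroup.

    The tag file name is an assumption about the v5 interface.
    """
    path = Path(cgroup_dir) / tag_file
    path.write_text("1\n" if enable else "0\n")
    return path


def is_tagged(cgroup_dir, tag_file="cpu.tag"):
    """Return True if the cgroup's tag file exists and currently reads 1."""
    path = Path(cgroup_dir) / tag_file
    return path.exists() and path.read_text().strip() == "1"
```

On a kernel built from the coresched branch this would be called as root, for example tag_cgroup("/sys/fs/cgroup/cpu/mysql1"), where the cgroup path is hypothetical. Tasks placed in that cgroup would then only share a core with each other or with idle.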


----------------------------------------------------------------------
ABOUT:
----------------------------------------------------------------------
Hello,

Core scheduling is required to protect against leakage of sensitive 
data allocated on a sibling thread. Our goal is to measure the 
performance impact of core scheduling across different workloads and 
show how it has evolved over time. Below you will find data based on 
core-sched (v5). The attached PDF contains the system configuration 
setup as well as further explanation of the findings.

----------------------------------------------------------------------
BENCHMARKS:
----------------------------------------------------------------------
- hammerdb      : database benchmarking application
- sysbench-cpu  : multi-threaded cpu benchmark
- sysbench-mysql: multi-threaded benchmark that tests an open source DBMS (MySQL)
- build-kernel  : benchmark that builds the Linux kernel
 

----------------------------------------------------------------------      
PERFORMANCE IMPACT:
----------------------------------------------------------------------

+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
| benchmark          | ****   | # of cgroups | overcommit  | baseline + smt_on | coresched + smt_on | baseline + smt_off   |
+====================+========+==============+=============+===================+====================+======================+
| hammerdb           | ****   | 2cgroups     | 2x          | 1                 | 0.96               | 0.87                 |
+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
| sysbench-cpu       | ****   | 2cgroups     | 2x          | 1                 | 0.95               | 0.54                 |
| sysbench-mysql     | ****   |              |             | 1                 | 0.90               | 0.47                 |
+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
| sysbench-cpu       | ****   | 4cgroups     | 4x          | 1                 | 0.90               | 0.47                 |
| sysbench-cpu       | ****   |              |             | 1                 | 1.32               | 0.66                 |
| sysbench-mysql     | ****   |              |             | 1                 | 0.42               | 0.89                 |
| sysbench-mysql     | ****   |              |             | 1                 | 0.42               | 0.89                 |
+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
| build-kernel       | ****   | 2cgroups     | 0.5x        | 1                 | 1                  | 0.93                 |
|                    | ****   |              | 1x          | 1                 | 0.99               | 0.92                 |
|                    | ****   |              | 2x          | 1                 | 0.98               | 0.91                 |
+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+


----------------------------------------------------------------------
TAKEAWAYS:
----------------------------------------------------------------------
1. Core scheduling performs better than turning off HT in most of the 
tested configurations. The exception is sysbench-mysql with 4 cgroups 
(0.42 vs. 0.89).
2. The impact of core scheduling depends on the workload and thread 
scheduling intensity. 
3. Core scheduling requires cgroups. Only tasks from the same tagged 
cgroup can run simultaneously on the same core. 
4. In certain situations core scheduling introduces an uneven load 
distribution between multiple workload types. In such cases a bias 
towards the cpu-intensive workload is expected. 
5. Load balancing is not perfect. It needs more work.

Many thanks,

--Agata



[-- Attachment #2: LKML_core_sched_v5.5.y.pdf --]
[-- Type: application/pdf, Size: 360252 bytes --]
