LKML Archive on lore.kernel.org
From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
To: linux-kernel@vger.kernel.org
Cc: Joel Fernandes <joel@joelfernandes.org>,
	Aaron Lu <aaron.lwe@gmail.com>,
	Aubrey Li <aubrey.li@linux.intel.com>,
	Julien Desfossez <jdesfossez@digitalocean.com>,
	Kees Cook <keescook@chromium.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Paul Turner <pjt@google.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tim Chen <tim.c.chen@intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Vineeth Pillai <viremana@linux.microsoft.com>,
	x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)),
	fweisbec@gmail.com, kerrnel@google.com,
	Phil Auld <pauld@redhat.com>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Chen Yu <yu.c.chen@intel.com>,
	Christian Brauner <christian.brauner@ubuntu.com>
Subject: [PATCH RFC 00/12] Core-sched v6+: kernel protection and hotplug fixes
Date: Fri, 14 Aug 2020 23:18:56 -0400
Message-ID: <20200815031908.1015049-1-joel@joelfernandes.org>

Hello!

This series is a continuation of the main core-sched v6 series [1] and adds
support for syscall and IRQ isolation from usermode processes and guests. It is
key to safely entering kernel mode on one HT while the other HT is in use by a
user or guest. The series also fixes CPU hotplug issues arising from the
cpu_smt_mask changing while the next task is being picked. These hotplug fixes
are also needed for kernel protection to work correctly.

The series is based on Thomas's x86/entry tree.

[1]  https://lwn.net/Articles/824918/

Background:

Core-scheduling prevents hyperthreads in usermode from attacking each
other, but it does not do anything about one of the hyperthreads
entering the kernel for any reason. This leaves the door open for MDS
and L1TF attacks with concurrent execution sequences between
hyperthreads.

This series adds support for protecting all syscall and IRQ kernel mode entries
by tracking when any sibling in a core enters the kernel, and when all the
siblings exit the kernel. IPIs are sent to force siblings into the kernel.

Care is taken to avoid waiting in IRQ-disabled sections, as Thomas suggested,
thus avoiding stop_machine deadlocks. Every attempt is made to avoid
unnecessary IPIs.
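
As an illustration only, the enter/exit tracking described above can be
modelled in userspace with a per-core counter. This is a minimal sketch, not
the code in the patches: struct model_core and the model_* helpers are
hypothetical names, and the real series also has to handle IPIs, idle and
IRQ-disabled sections, which this model ignores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-core state for this model; not a structure from the patches. */
struct model_core {
	atomic_int nr_in_kernel;	/* siblings currently in kernel mode */
};

/*
 * A sibling enters the kernel. Returns true if it is the first one in,
 * which is the point where a real implementation might IPI the siblings
 * that are still running untrusted user or guest code.
 */
static bool model_kernel_enter(struct model_core *core)
{
	return atomic_fetch_add(&core->nr_in_kernel, 1) == 0;
}

/*
 * A sibling leaves the kernel. Returns true if it is the last one out,
 * i.e. the core as a whole has left kernel mode.
 */
static bool model_kernel_exit(struct model_core *core)
{
	return atomic_fetch_sub(&core->nr_in_kernel, 1) == 1;
}

/* True while at least one sibling of the core is in kernel mode. */
static bool model_core_in_kernel(struct model_core *core)
{
	return atomic_load(&core->nr_in_kernel) > 0;
}

/* Walk two siblings (HT0, HT1) through an overlapping enter/exit sequence. */
static int model_demo(void)
{
	struct model_core core;

	atomic_init(&core.nr_in_kernel, 0);

	assert(model_kernel_enter(&core));	/* HT0 first in: would IPI HT1 */
	assert(!model_kernel_enter(&core));	/* HT1 joins: no new IPI needed */
	assert(!model_kernel_exit(&core));	/* HT0 out, HT1 still inside */
	assert(model_core_in_kernel(&core));
	assert(model_kernel_exit(&core));	/* HT1 last out: core is clear */
	assert(!model_core_in_kernel(&core));
	return 0;
}
```

In this model only the first entry and the last exit change the core-wide
state, which is one way to see why repeated entries need not each trigger an
IPI.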

Performance tests:
sysbench was used to test the performance of the patch series. An 8-CPU/4-core
VM ran 2 sysbench tests in parallel. Each sysbench test runs 4 tasks:
sysbench --test=cpu --cpu-max-prime=100000 --num-threads=4 run

The performance results for various combinations are compared below.
The metric is 'events per second':

1. Coresched disabled
    sysbench-1/sysbench-2 => 175.7/175.6

2. Coresched enabled, both sysbench tagged
    sysbench-1/sysbench-2 => 168.8/165.6

3. Coresched enabled, sysbench-1 tagged and sysbench-2 untagged
    sysbench-1/sysbench-2 => 96.4/176.9

4. smt off
    sysbench-1/sysbench-2 => 97.9/98.8

When both sysbench instances are tagged, there is a perf drop of ~4%. In the
tagged/untagged case, the tagged one suffers because it always gets
stalled when the sibling enters the kernel. But this is no worse than smt off.
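
For reference, the relative drops can be recomputed from the table above. The
snippet below is a small sketch of that arithmetic; pct_drop() and
print_drops() are hypothetical helper names, and the inputs are the
events-per-second figures listed in cases 1 to 4. With both instances tagged
it gives roughly 3.9% for sysbench-1 and 5.7% for sysbench-2 relative to
coresched disabled.

```c
#include <assert.h>
#include <stdio.h>

/* Percentage drop of 'measured' relative to 'baseline' (both in events/sec). */
static double pct_drop(double baseline, double measured)
{
	return 100.0 * (baseline - measured) / baseline;
}

/* Print the drops for the cases in the table above. */
static void print_drops(void)
{
	/* Case 2 vs case 1: both tagged vs coresched disabled. */
	printf("both tagged, sysbench-1: %.1f%%\n", pct_drop(175.7, 168.8));
	printf("both tagged, sysbench-2: %.1f%%\n", pct_drop(175.6, 165.6));

	/* Case 3 vs case 4: tagged sysbench-1 vs the same task with smt off. */
	printf("tagged vs smt off:       %.1f%%\n", pct_drop(97.9, 96.4));
}
```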

Also, a modified rcutorture was used to heavily stress the kernel to make sure
there is no crash or instability.

Joel Fernandes (Google) (5):
  irq_work: Add support to detect if work is pending
  entry/idle: Add a common function for activites during idle entry/exit
  arch/x86: Add a new TIF flag for untrusted tasks
  kernel/entry: Add support for core-wide protection of kernel-mode
  entry/idle: Enter and exit kernel protection during idle entry and
    exit

Vineeth Pillai (7):
  entry/kvm: Protect the kernel when entering from guest
  bitops: Introduce find_next_or_bit
  cpumask: Introduce a new iterator for_each_cpu_wrap_or
  sched/coresched: Use for_each_cpu(_wrap)_or for pick_next_task
  sched/coresched: Make core_pick_seq per run-queue
  sched/coresched: Check for dynamic changes in smt_mask
  sched/coresched: rq->core should be set only if not previously set

arch/x86/include/asm/thread_info.h |   2 +
arch/x86/kvm/x86.c                 |   3 +
include/asm-generic/bitops/find.h  |  16 ++
include/linux/cpumask.h            |  42 +++++
include/linux/entry-common.h       |  22 +++
include/linux/entry-kvm.h          |  12 ++
include/linux/irq_work.h           |   1 +
include/linux/sched.h              |  12 ++
kernel/entry/common.c              |  88 +++++----
kernel/entry/kvm.c                 |  12 ++
kernel/irq_work.c                  |  11 ++
kernel/sched/core.c                | 281 ++++++++++++++++++++++++++---
kernel/sched/idle.c                |  17 +-
kernel/sched/sched.h               |  11 +-
lib/cpumask.c                      |  53 ++++++
lib/find_bit.c                     |  56 ++++--
16 files changed, 564 insertions(+), 75 deletions(-)

--
2.28.0.220.ged08abb693-goog



Thread overview: 20+ messages
2020-08-15  3:18 Joel Fernandes (Google) [this message]
2020-08-15  3:18 ` [PATCH RFC 01/12] irq_work: Add support to detect if work is pending Joel Fernandes (Google)
2020-08-15  8:13   ` peterz
2020-08-17  2:04     ` Joel Fernandes
2020-08-15  3:18 ` [PATCH RFC 02/12] entry/idle: Add a common function for activites during idle entry/exit Joel Fernandes (Google)
2020-08-15  8:14   ` peterz
2020-08-17  2:17     ` Joel Fernandes
2020-08-15  3:18 ` [PATCH RFC 03/12] arch/x86: Add a new TIF flag for untrusted tasks Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 04/12] kernel/entry: Add support for core-wide protection of kernel-mode Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 05/12] entry/idle: Enter and exit kernel protection during idle entry and exit Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 06/12] entry/kvm: Protect the kernel when entering from guest Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 07/12] bitops: Introduce find_next_or_bit Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 08/12] cpumask: Introduce a new iterator for_each_cpu_wrap_or Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 09/12] sched/coresched: Use for_each_cpu(_wrap)_or for pick_next_task Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 10/12] sched/coresched: Make core_pick_seq per run-queue Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 11/12] sched/coresched: Check for dynamic changes in smt_mask Joel Fernandes (Google)
2020-08-15  3:19 ` [PATCH RFC 12/12] sched/coresched: rq->core should be set only if not previously set Joel Fernandes (Google)
2020-08-19 18:26 ` [PATCH RFC 00/12] Core-sched v6+: kernel protection and hotplug fixes Kees Cook
2020-08-20  1:44   ` Joel Fernandes
2020-08-22 20:22 ` Joel Fernandes

