From: Dave Hansen <dave.hansen@linux.intel.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, dave.hansen@linux.intel.com,
	moritz.lipp@iaik.tugraz.at, daniel.gruss@iaik.tugraz.at,
	michael.schwarz@iaik.tugraz.at,
	richard.fellner@student.tugraz.at, luto@kernel.org,
	torvalds@linux-foundation.org, keescook@google.com,
	hughd@google.com, x86@kernel.org
Subject: [PATCH 22/23] x86, kaiser: allow KAISER to be enabled/disabled at runtime
Date: Wed, 22 Nov 2017 16:35:23 -0800
Message-ID: <20171123003523.28FFBAB6@viggo.jf.intel.com>
In-Reply-To: <20171123003438.48A0EEDE@viggo.jf.intel.com>


From: Dave Hansen <dave.hansen@linux.intel.com>

The KAISER CR3 switches are expensive for many reasons.  Not all systems
benefit from the protection provided by KAISER, and some of them cannot
afford its high performance cost.

This patch adds a debugfs file.  To disable KAISER, you do:

	echo 0 > /sys/kernel/debug/x86/kaiser-enabled

and to re-enable it, you can:

	echo 1 > /sys/kernel/debug/x86/kaiser-enabled
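
The same toggle can be driven from a program instead of a shell.  What
follows is a minimal userspace sketch and not part of this patch: the
helper name kaiser_write_enabled() is hypothetical, and the path is
taken as a parameter on the assumption that debugfs is mounted at
/sys/kernel/debug and that the caller has permission to write there.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of this patch): write "0" or "1" to the
 * kaiser-enabled file at 'path'.  Returns 0 on success, -errno on failure.
 */
static int kaiser_write_enabled(const char *path, int enable)
{
	FILE *f;
	int ret = 0;

	assert(path != NULL);

	f = fopen(path, "w");
	if (!f)
		return -errno;

	/* Same payload as "echo 0" or "echo 1": one digit plus a newline. */
	if (fprintf(f, "%d\n", enable ? 1 : 0) < 0)
		ret = -EIO;

	/* fclose() flushes; a failed flush means the write did not land. */
	if (fclose(f) != 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

A caller would pass the debugfs path from this changelog, for example
kaiser_write_enabled("/sys/kernel/debug/x86/kaiser-enabled", 0).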

This is a *minimal* implementation.  There are certainly plenty of
optimizations that can be done on top of this by using ALTERNATIVES
among other things.

This does, however, completely remove all the KAISER-based CR3 writes
when KAISER is disabled.  This permits a paravirtualized system that
cannot tolerate CR3 writes to theoretically survive with CONFIG_KAISER=y,
albeit with /sys/kernel/debug/x86/kaiser-enabled=0.
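
To make the gating concrete, here is a rough C model of what the
assembly macros do once JUMP_IF_KAISER_OFF is in front of them.  It is
only a sketch: the sketch_* names are hypothetical, the mask value
(bit 12) is an assumption chosen for illustration, and it assumes that
ADJUST_KERNEL_CR3 clears the same bits that ADJUST_USER_CR3 sets with
its orq.

```c
#include <stdint.h>

/* Assumed value, for illustration only; the real KAISER_SWITCH_MASK
 * is defined elsewhere in the series. */
#define SKETCH_SWITCH_MASK	(UINT64_C(1) << 12)

/* Stand-in for kaiser_asm_do_switch[0]; starts out as 1, like the patch. */
static unsigned long sketch_do_switch = 1;

/* Model of SWITCH_TO_KERNEL_CR3: returns the CR3 value left in place. */
static uint64_t sketch_switch_to_kernel_cr3(uint64_t cr3)
{
	if (!(sketch_do_switch & 1))	/* JUMP_IF_KAISER_OFF */
		return cr3;		/* skipped: no CR3 write at all */
	return cr3 & ~SKETCH_SWITCH_MASK;
}

/* Model of SWITCH_TO_USER_CR3: returns the CR3 value left in place. */
static uint64_t sketch_switch_to_user_cr3(uint64_t cr3)
{
	if (!(sketch_do_switch & 1))	/* JUMP_IF_KAISER_OFF */
		return cr3;		/* skipped: no CR3 write at all */
	return cr3 | SKETCH_SWITCH_MASK;
}
```

With the flag cleared, both helpers return their input unchanged.  That
includes a CR3 value that already points at the user page tables, which
is the case the disable path in this patch has to handle before it
clears the flag.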

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Moritz Lipp <moritz.lipp@iaik.tugraz.at>
Cc: Daniel Gruss <daniel.gruss@iaik.tugraz.at>
Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at>
Cc: Richard Fellner <richard.fellner@student.tugraz.at>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kees Cook <keescook@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: x86@kernel.org
---

 b/arch/x86/entry/calling.h |   12 +++++++
 b/arch/x86/mm/kaiser.c     |   70 ++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 78 insertions(+), 4 deletions(-)

diff -puN arch/x86/entry/calling.h~kaiser-dynamic-asm arch/x86/entry/calling.h
--- a/arch/x86/entry/calling.h~kaiser-dynamic-asm	2017-11-22 15:45:56.402619721 -0800
+++ b/arch/x86/entry/calling.h	2017-11-22 15:45:56.407619721 -0800
@@ -209,19 +209,29 @@ For 32-bit we have the following convent
 	orq     $(KAISER_SWITCH_MASK), \reg
 .endm
 
+.macro JUMP_IF_KAISER_OFF	label
+	testq   $1, kaiser_asm_do_switch
+	jz      \label
+.endm
+
 .macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
+	JUMP_IF_KAISER_OFF	.Lswitch_done_\@
 	mov	%cr3, \scratch_reg
 	ADJUST_KERNEL_CR3 \scratch_reg
 	mov	\scratch_reg, %cr3
+.Lswitch_done_\@:
 .endm
 
 .macro SWITCH_TO_USER_CR3 scratch_reg:req
+	JUMP_IF_KAISER_OFF	.Lswitch_done_\@
 	mov	%cr3, \scratch_reg
 	ADJUST_USER_CR3 \scratch_reg
 	mov	\scratch_reg, %cr3
+.Lswitch_done_\@:
 .endm
 
 .macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
+	JUMP_IF_KAISER_OFF	.Ldone_\@
 	movq	%cr3, %r\scratch_reg
 	movq	%r\scratch_reg, \save_reg
 	/*
@@ -244,11 +254,13 @@ For 32-bit we have the following convent
 .endm
 
 .macro RESTORE_CR3 save_reg:req
+	JUMP_IF_KAISER_OFF	.Ldone_\@
 	/*
 	 * The CR3 write could be avoided when not changing its value,
 	 * but would require a CR3 read *and* a scratch register.
 	 */
 	movq	\save_reg, %cr3
+.Ldone_\@:
 .endm
 
 #else /* CONFIG_KAISER=n: */
diff -puN arch/x86/mm/kaiser.c~kaiser-dynamic-asm arch/x86/mm/kaiser.c
--- a/arch/x86/mm/kaiser.c~kaiser-dynamic-asm	2017-11-22 15:45:56.404619721 -0800
+++ b/arch/x86/mm/kaiser.c	2017-11-22 15:45:56.408619721 -0800
@@ -43,6 +43,9 @@
 
 #define KAISER_WALK_ATOMIC  0x1
 
+__aligned(PAGE_SIZE)
+unsigned long kaiser_asm_do_switch[PAGE_SIZE/sizeof(unsigned long)] = { 1 };
+
 /*
  * At runtime, the only things we map are some things for CPU
  * hotplug, and stacks for new processes.  No two CPUs will ever
@@ -395,6 +398,9 @@ void __init kaiser_init(void)
 
 	kaiser_init_all_pgds();
 
+	kaiser_add_user_map_early(&kaiser_asm_do_switch, PAGE_SIZE,
+				  __PAGE_KERNEL | _PAGE_GLOBAL);
+
 	for_each_possible_cpu(cpu) {
 		void *percpu_vaddr = __per_cpu_user_mapped_start +
 				     per_cpu_offset(cpu);
@@ -483,6 +489,56 @@ static ssize_t kaiser_enabled_read_file(
 	return simple_read_from_buffer(user_buf, count, ppos, buf, len);
 }
 
+enum poison {
+	KAISER_POISON,
+	KAISER_UNPOISON
+};
+void kaiser_poison_pgds(enum poison do_poison);
+
+void kaiser_do_disable(void)
+{
+	/* Make sure the kernel PGDs are usable by userspace: */
+	kaiser_poison_pgds(KAISER_UNPOISON);
+
+	/*
+	 * Make sure all the CPUs have the poison clear in their TLBs.
+	 * This also functions as a barrier to ensure that everyone
+	 * sees the unpoisoned PGDs.
+	 */
+	flush_tlb_all();
+
+	/* Tell the assembly code to stop switching CR3. */
+	kaiser_asm_do_switch[0] = 0;
+
+	/*
+	 * Make sure everybody takes an interrupt.  This means that
+	 * they have gone through a SWITCH_TO_KERNEL_CR3 and are no
+	 * longer running on the userspace CR3.  If we did not do
+	 * this, we might have CPUs running on the shadow page tables
+	 * that then enter the kernel and think they do *not* need to
+	 * switch.
+	 */
+	flush_tlb_all();
+}
+
+void kaiser_do_enable(void)
+{
+	/* Tell the assembly code to start switching CR3: */
+	kaiser_asm_do_switch[0] = 1;
+
+	/* Make sure everyone can see the kaiser_asm_do_switch update: */
+	synchronize_rcu();
+
+	/*
+	 * Now that userspace is no longer using the kernel copy of
+	 * the page tables, we can poison it:
+	 */
+	kaiser_poison_pgds(KAISER_POISON);
+
+	/* Make sure all the CPUs see the poison: */
+	flush_tlb_all();
+}
+
 static ssize_t kaiser_enabled_write_file(struct file *file,
 		 const char __user *user_buf, size_t count, loff_t *ppos)
 {
@@ -504,7 +560,17 @@ static ssize_t kaiser_enabled_write_file
 	if (kaiser_enabled == enable)
 		return count;
 
+	/*
+	 * This tells the page table code whether or not to poison PGDs
+	 */
 	WRITE_ONCE(kaiser_enabled, enable);
+	synchronize_rcu();
+
+	if (enable)
+		kaiser_do_enable();
+	else
+		kaiser_do_disable();
+
 	return count;
 }
 
@@ -522,10 +588,6 @@ static int __init create_kaiser_enabled(
 }
 late_initcall(create_kaiser_enabled);
 
-enum poison {
-	KAISER_POISON,
-	KAISER_UNPOISON
-};
 void kaiser_poison_pgd_page(pgd_t *pgd_page, enum poison do_poison)
 {
 	int i = 0;
_
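
A note on the layout of kaiser_asm_do_switch: the patch declares it as a
page-aligned, page-sized array and maps exactly PAGE_SIZE bytes of it
with kaiser_add_user_map_early(), presumably so that the flag can be
read from the entry code while no unrelated kernel data shares the
mapped page.  The following userspace sketch shows the same layout in
C11.  The sketch_* names are hypothetical and the 4096-byte page size is
an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096	/* assumed: 4k pages */

/* Userspace analogue of the patch's __aligned(PAGE_SIZE) array. */
static _Alignas(SKETCH_PAGE_SIZE) unsigned long
sketch_flag_page[SKETCH_PAGE_SIZE / sizeof(unsigned long)] = { 1 };

/* The array must cover the whole page so nothing else can land in it. */
_Static_assert(sizeof(sketch_flag_page) == SKETCH_PAGE_SIZE,
	       "flag array must fill exactly one page");

/* Returns nonzero if the array starts on a page boundary. */
static int sketch_flag_is_page_aligned(void)
{
	return ((uintptr_t)sketch_flag_page % SKETCH_PAGE_SIZE) == 0;
}
```

Only element 0 is used as the flag; the remaining elements are padding
that stays zero.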
