All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v12 00/18] Enable FSGSBASE instructions
@ 2020-05-11  4:52 Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 01/18] x86/ptrace: Prevent ptrace from clearing the FS/GS selector Sasha Levin
                   ` (20 more replies)
  0 siblings, 21 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:52 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

Benefits:
Currently a user process that wishes to read or write the FS/GS base must
make a system call. But recent X86 processors have added new instructions
for use in 64-bit mode that allow direct access to the FS and GS segment
base addresses.  The operating system controls whether applications can
use these instructions with a %cr4 control bit.

In addition to benefits to applications, performance improvements to the
OS context switch code are possible by making use of these instructions. A
third party reported out promising performance numbers out of their
initial benchmarking of the previous version of this patch series [9].

Enablement check:
The kernel provides information about the enabled state of FSGSBASE to
applications using the ELF_AUX vector. If the HWCAP2_FSGSBASE bit is set in
the AUX vector, the kernel has FSGSBASE instructions enabled and
applications can use them.

Kernel changes:
Major changes made in the kernel are in context switch, paranoid path, and
ptrace. In a context switch, a task's FS/GS base will be secured regardless
of its selector. In the paranoid path, GS base is unconditionally
overwritten to the kernel GS base on entry and the original GS base is
restored on exit. Ptrace includes divergence of FS/GS index and base
values.

Security:
For mitigating the Spectre v1 SWAPGS issue, LFENCE instructions were added
on most kernel entries. Those patches are dependent on previous behaviors
that users couldn't load a kernel address into the GS base. These patches
change that assumption since the user can load any address into GS base.
The changes to the kernel entry path in this patch series take account of
the SWAPGS issue.

Changes from v11:

 - Rebase to v5.7-rc5, fix 32bit compilation error.


Andi Kleen (2):
  x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
  x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2

Andy Lutomirski (4):
  x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
  x86/entry/64: Clean up paranoid exit
  x86/fsgsbase/64: Use FSGSBASE in switch_to() if available
  x86/fsgsbase/64: Enable FSGSBASE on 64bit by default and add a chicken
    bit

Chang S. Bae (9):
  x86/ptrace: Prevent ptrace from clearing the FS/GS selector
  selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base
    write
  x86/entry/64: Switch CR3 before SWAPGS in paranoid entry
  x86/entry/64: Introduce the FIND_PERCPU_BASE macro
  x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit
  x86/entry/64: Document GSBASE handling in the paranoid path
  x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  x86/fsgsbase/64: Use FSGSBASE instructions on thread copy and ptrace
  selftests/x86/fsgsbase: Test ptracer-induced GS base write with
    FSGSBASE

Sasha Levin (1):
  x86/fsgsbase/64: move save_fsgs to header file

Thomas Gleixner (1):
  Documentation/x86/64: Add documentation for GS/FS addressing mode

Tony Luck (1):
  x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation

 .../admin-guide/kernel-parameters.txt         |   2 +
 Documentation/x86/entry_64.rst                |   9 +
 Documentation/x86/x86_64/fsgs.rst             | 199 ++++++++++++++++++
 Documentation/x86/x86_64/index.rst            |   1 +
 arch/x86/entry/calling.h                      |  40 ++++
 arch/x86/entry/entry_64.S                     | 131 +++++++++---
 arch/x86/include/asm/fsgsbase.h               |  45 +++-
 arch/x86/include/asm/inst.h                   |  15 ++
 arch/x86/include/uapi/asm/hwcap2.h            |   3 +
 arch/x86/kernel/cpu/bugs.c                    |   6 +-
 arch/x86/kernel/cpu/common.c                  |  22 ++
 arch/x86/kernel/process.c                     |   9 +-
 arch/x86/kernel/process.h                     |  72 +++++++
 arch/x86/kernel/process_64.c                  | 142 +++++++------
 arch/x86/kernel/ptrace.c                      |  17 +-
 tools/testing/selftests/x86/fsgsbase.c        |  24 ++-
 16 files changed, 608 insertions(+), 129 deletions(-)
 create mode 100644 Documentation/x86/x86_64/fsgs.rst

-- 
2.20.1


^ permalink raw reply	[flat|nested] 76+ messages in thread

* [PATCH v12 01/18] x86/ptrace: Prevent ptrace from clearing the FS/GS selector
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
@ 2020-05-11  4:52 ` Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 02/18] selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Sasha Levin
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:52 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: "Chang S. Bae" <chang.seok.bae@intel.com>

When a ptracer writes a ptracee's FS/GS base with a different value, the
selector is also cleared. While this behavior is incorrect as the selector
should be preserved, most userspace applications did not notice that as
they do not use non-zero segments to begin with.

Instead, with this patch, when a tracee sets the base we will let it do
so without clearing the selector.

The change above means that a tracee that already has a selector set
will fail in an attempt to set the base - the change won't stick and the
value will be instead based on the value of the selector. As with the
above, we haven't found userspace that would be affected by this change.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
[sasha: rewrite commit message]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/ptrace.c | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index f0e1ddbc2fd78..cc56efb75d275 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -380,25 +380,12 @@ static int putreg(struct task_struct *child,
 	case offsetof(struct user_regs_struct,fs_base):
 		if (value >= TASK_SIZE_MAX)
 			return -EIO;
-		/*
-		 * When changing the FS base, use do_arch_prctl_64()
-		 * to set the index to zero and to set the base
-		 * as requested.
-		 *
-		 * NB: This behavior is nonsensical and likely needs to
-		 * change when FSGSBASE support is added.
-		 */
-		if (child->thread.fsbase != value)
-			return do_arch_prctl_64(child, ARCH_SET_FS, value);
+		x86_fsbase_write_task(child, value);
 		return 0;
 	case offsetof(struct user_regs_struct,gs_base):
-		/*
-		 * Exactly the same here as the %fs handling above.
-		 */
 		if (value >= TASK_SIZE_MAX)
 			return -EIO;
-		if (child->thread.gsbase != value)
-			return do_arch_prctl_64(child, ARCH_SET_GS, value);
+		x86_gsbase_write_task(child, value);
 		return 0;
 #endif
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 02/18] selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 01/18] x86/ptrace: Prevent ptrace from clearing the FS/GS selector Sasha Levin
@ 2020-05-11  4:52 ` Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 03/18] x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE Sasha Levin
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:52 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: "Chang S. Bae" <chang.seok.bae@intel.com>

The test validates that the selector is not changed when a ptracer writes
the ptracee's GS base.

Originally-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 tools/testing/selftests/x86/fsgsbase.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
index 15a329da59fa3..950a48b2e3662 100644
--- a/tools/testing/selftests/x86/fsgsbase.c
+++ b/tools/testing/selftests/x86/fsgsbase.c
@@ -465,7 +465,7 @@ static void test_ptrace_write_gsbase(void)
 	wait(&status);
 
 	if (WSTOPSIG(status) == SIGTRAP) {
-		unsigned long gs, base;
+		unsigned long gs;
 		unsigned long gs_offset = USER_REGS_OFFSET(gs);
 		unsigned long base_offset = USER_REGS_OFFSET(gs_base);
 
@@ -481,7 +481,6 @@ static void test_ptrace_write_gsbase(void)
 			err(1, "PTRACE_POKEUSER");
 
 		gs = ptrace(PTRACE_PEEKUSER, child, gs_offset, NULL);
-		base = ptrace(PTRACE_PEEKUSER, child, base_offset, NULL);
 
 		/*
 		 * In a non-FSGSBASE system, the nonzero selector will load
@@ -489,11 +488,21 @@ static void test_ptrace_write_gsbase(void)
 		 * selector value is changed or not by the GSBASE write in
 		 * a ptracer.
 		 */
-		if (gs == 0 && base == 0xFF) {
-			printf("[OK]\tGS was reset as expected\n");
-		} else {
+		if (gs != *shared_scratch) {
 			nerrs++;
-			printf("[FAIL]\tGS=0x%lx, GSBASE=0x%lx (should be 0, 0xFF)\n", gs, base);
+			printf("[FAIL]\tGS changed to %lx\n", gs);
+
+			/*
+			 * On older kernels, poking a nonzero value into the
+			 * base would zero the selector.  On newer kernels,
+			 * this behavior has changed -- poking the base
+			 * changes only the base and, if FSGSBASE is not
+			 * available, this may not effect.
+			 */
+			if (gs == 0)
+				printf("\tNote: this is expected behavior on older kernels.\n");
+		} else {
+			printf("[OK]\tGS remained 0x%hx\n", *shared_scratch);
 		}
 	}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 03/18] x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 01/18] x86/ptrace: Prevent ptrace from clearing the FS/GS selector Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 02/18] selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Sasha Levin
@ 2020-05-11  4:52 ` Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 04/18] x86/entry/64: Clean up paranoid exit Sasha Levin
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:52 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: Andy Lutomirski <luto@kernel.org>

This is temporary.  It will allow the next few patches to be tested
incrementally.

Setting unsafe_fsgsbase is a root hole.  Don't do it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 .../admin-guide/kernel-parameters.txt         |  3 +++
 arch/x86/kernel/cpu/common.c                  | 24 +++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7bc83f3d9bdfe..af3aaade195b8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3033,6 +3033,9 @@
 	no5lvl		[X86-64] Disable 5-level paging mode. Forces
 			kernel to use 4-level paging instead.
 
+	unsafe_fsgsbase	[X86] Allow FSGSBASE instructions.  This will be
+			replaced with a nofsgsbase flag.
+
 	no_console_suspend
 			[HW] Never suspend the console
 			Disable suspending of consoles during suspend and
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bed0cb83fe245..4224760c74e27 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -418,6 +418,22 @@ static void __init setup_cr_pinning(void)
 	static_key_enable(&cr_pinning.key);
 }
 
+/*
+ * Temporary hack: FSGSBASE is unsafe until a few kernel code paths are
+ * updated. This allows us to get the kernel ready incrementally.
+ *
+ * Once all the pieces are in place, these will go away and be replaced with
+ * a nofsgsbase chicken flag.
+ */
+static bool unsafe_fsgsbase;
+
+static __init int setup_unsafe_fsgsbase(char *arg)
+{
+	unsafe_fsgsbase = true;
+	return 1;
+}
+__setup("unsafe_fsgsbase", setup_unsafe_fsgsbase);
+
 /*
  * Protection Keys are not available in 32-bit mode.
  */
@@ -1478,6 +1494,14 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	setup_smap(c);
 	setup_umip(c);
 
+	/* Enable FSGSBASE instructions if available. */
+	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
+		if (unsafe_fsgsbase)
+			cr4_set_bits(X86_CR4_FSGSBASE);
+		else
+			clear_cpu_cap(c, X86_FEATURE_FSGSBASE);
+	}
+
 	/*
 	 * The vendor-specific functions might have changed features.
 	 * Now we do "generic changes."
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 04/18] x86/entry/64: Clean up paranoid exit
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (2 preceding siblings ...)
  2020-05-11  4:52 ` [PATCH v12 03/18] x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE Sasha Levin
@ 2020-05-11  4:52 ` Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 05/18] x86/entry/64: Switch CR3 before SWAPGS in paranoid entry Sasha Levin
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:52 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin, Vegard Nossum

From: Andy Lutomirski <luto@kernel.org>

All that paranoid exit needs to do is to disable IRQs, handle IRQ tracing,
then restore CR3, and restore GS base. Simply do those actions in that
order. Cleaning up the spaghetti code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
---
 arch/x86/entry/entry_64.S | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3063aa9090f9a..0da56e6791b73 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1266,19 +1266,25 @@ SYM_CODE_END(paranoid_entry)
 SYM_CODE_START_LOCAL(paranoid_exit)
 	UNWIND_HINT_REGS
 	DISABLE_INTERRUPTS(CLBR_ANY)
+
+	/*
+	 * The order of operations is important. IRQ tracing requires
+	 * kernel GS base and CR3. RESTORE_CR3 requires kernel GS base.
+	 *
+	 * NB to anyone to try to optimize this code: this code does
+	 * not execute at all for exceptions from user mode. Those
+	 * exceptions go through error_exit instead.
+	 */
 	TRACE_IRQS_OFF_DEBUG
-	testl	%ebx, %ebx			/* swapgs needed? */
-	jnz	.Lparanoid_exit_no_swapgs
-	TRACE_IRQS_IRETQ
-	/* Always restore stashed CR3 value (see paranoid_entry) */
-	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
+	RESTORE_CR3	scratch_reg=%rax save_reg=%r14
+
+	/* If EBX is 0, SWAPGS is required */
+	testl	%ebx, %ebx
+	jnz	restore_regs_and_return_to_kernel
+
+	/* We are returning to a context with user GS base */
 	SWAPGS_UNSAFE_STACK
 	jmp	restore_regs_and_return_to_kernel
-.Lparanoid_exit_no_swapgs:
-	TRACE_IRQS_IRETQ_DEBUG
-	/* Always restore stashed CR3 value (see paranoid_entry) */
-	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
-	jmp restore_regs_and_return_to_kernel
 SYM_CODE_END(paranoid_exit)
 
 /*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 05/18] x86/entry/64: Switch CR3 before SWAPGS in paranoid entry
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (3 preceding siblings ...)
  2020-05-11  4:52 ` [PATCH v12 04/18] x86/entry/64: Clean up paranoid exit Sasha Levin
@ 2020-05-11  4:52 ` Sasha Levin
  2020-05-11  4:52 ` [PATCH v12 06/18] x86/entry/64: Introduce the FIND_PERCPU_BASE macro Sasha Levin
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:52 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin, Vegard Nossum

From: "Chang S. Bae" <chang.seok.bae@intel.com>

When FSGSBASE is enabled, the GS base handling in paranoid entry will need
to retrieve the kernel GS base which requires that the kernel page table is
active.

As the CR3 switch to the kernel page tables (PTI is active) does not depend
on kernel GS base, move the CR3 switch in front of the GS base handling.

Comment the EBX content while at it.

No functional change.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
---
 arch/x86/entry/entry_64.S | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0da56e6791b73..3ac1313724eaa 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1220,15 +1220,7 @@ SYM_CODE_START_LOCAL(paranoid_entry)
 	cld
 	PUSH_AND_CLEAR_REGS save_ret=1
 	ENCODE_FRAME_POINTER 8
-	movl	$1, %ebx
-	movl	$MSR_GS_BASE, %ecx
-	rdmsr
-	testl	%edx, %edx
-	js	1f				/* negative -> in kernel */
-	SWAPGS
-	xorl	%ebx, %ebx
 
-1:
 	/*
 	 * Always stash CR3 in %r14.  This value will be restored,
 	 * verbatim, at exit.  Needed if paranoid_entry interrupted
@@ -1238,16 +1230,31 @@ SYM_CODE_START_LOCAL(paranoid_entry)
 	 * This is also why CS (stashed in the "iret frame" by the
 	 * hardware at entry) can not be used: this may be a return
 	 * to kernel code, but with a user CR3 value.
+	 *
+	 * Switching CR3 does not depend on kernel GS base so it can
+	 * be done before switching to the kernel GS base. This is
+	 * required for FSGSBASE because the kernel GS base has to
+	 * be retrieved from a kernel internal table.
 	 */
 	SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
 
+	/* EBX = 1 -> kernel GSBASE active, no restore required */
+	movl	$1, %ebx
 	/*
-	 * The above SAVE_AND_SWITCH_TO_KERNEL_CR3 macro doesn't do an
-	 * unconditional CR3 write, even in the PTI case.  So do an lfence
-	 * to prevent GS speculation, regardless of whether PTI is enabled.
+	 * The kernel-enforced convention is a negative GS base indicates
+	 * a kernel value. No SWAPGS needed on entry and exit.
 	 */
-	FENCE_SWAPGS_KERNEL_ENTRY
+	movl	$MSR_GS_BASE, %ecx
+	rdmsr
+	testl	%edx, %edx
+	jns	.Lparanoid_entry_swapgs
+	ret
 
+.Lparanoid_entry_swapgs:
+	SWAPGS
+	FENCE_SWAPGS_KERNEL_ENTRY
+	/* EBX = 0 -> SWAPGS required on exit */
+	xorl	%ebx, %ebx
 	ret
 SYM_CODE_END(paranoid_entry)
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 06/18] x86/entry/64: Introduce the FIND_PERCPU_BASE macro
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (4 preceding siblings ...)
  2020-05-11  4:52 ` [PATCH v12 05/18] x86/entry/64: Switch CR3 before SWAPGS in paranoid entry Sasha Levin
@ 2020-05-11  4:52 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 07/18] x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit Sasha Levin
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:52 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin, Vegard Nossum

From: "Chang S. Bae" <chang.seok.bae@intel.com>

GS base is used to find per-CPU data in the kernel. But when GS base is
unknown, the per-CPU base can be found from the per_cpu_offset table with a
CPU NR.  The CPU NR is extracted from the limit field of the CPUNODE entry
in GDT, or by the RDPID instruction. This is a prerequisite for using
FSGSBASE in the low level entry code.

Also, add the GAS-compatible RDPID macro as binutils 2.21 does not support
it. Support is added in version 2.27.

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
---
 arch/x86/entry/calling.h    | 34 ++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/inst.h | 15 +++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 1c7f13bb67286..29982fe140541 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -6,6 +6,7 @@
 #include <asm/percpu.h>
 #include <asm/asm-offsets.h>
 #include <asm/processor-flags.h>
+#include <asm/inst.h>
 
 /*
 
@@ -349,6 +350,39 @@ For 32-bit we have the following conventions - kernel is built with
 #endif
 .endm
 
+#ifdef CONFIG_SMP
+
+/*
+ * CPU/node NR is loaded from the limit (size) field of a special segment
+ * descriptor entry in GDT.
+ */
+.macro LOAD_CPU_AND_NODE_SEG_LIMIT reg:req
+	movq	$__CPUNODE_SEG, \reg
+	lsl	\reg, \reg
+.endm
+
+/*
+ * Fetch the per-CPU GS base value for this processor and put it in @reg.
+ * We normally use %gs for accessing per-CPU data, but we are setting up
+ * %gs here and obviously can not use %gs itself to access per-CPU data.
+ */
+.macro GET_PERCPU_BASE reg:req
+	ALTERNATIVE \
+		"LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
+		"RDPID	\reg", \
+		X86_FEATURE_RDPID
+	andq	$VDSO_CPUNODE_MASK, \reg
+	movq	__per_cpu_offset(, \reg, 8), \reg
+.endm
+
+#else
+
+.macro GET_PERCPU_BASE reg:req
+	movq	pcpu_unit_offsets(%rip), \reg
+.endm
+
+#endif /* CONFIG_SMP */
+
 /*
  * This does 'call enter_from_user_mode' unless we can avoid it based on
  * kernel config or using the static jump infrastructure.
diff --git a/arch/x86/include/asm/inst.h b/arch/x86/include/asm/inst.h
index f5a796da07f88..d063841a17e39 100644
--- a/arch/x86/include/asm/inst.h
+++ b/arch/x86/include/asm/inst.h
@@ -306,6 +306,21 @@
 	.endif
 	MODRM 0xc0 movq_r64_xmm_opd1 movq_r64_xmm_opd2
 	.endm
+
+.macro RDPID opd
+	REG_TYPE rdpid_opd_type \opd
+	.if rdpid_opd_type == REG_TYPE_R64
+	R64_NUM rdpid_opd \opd
+	.else
+	R32_NUM rdpid_opd \opd
+	.endif
+	.byte 0xf3
+	.if rdpid_opd > 7
+	PFX_REX rdpid_opd 0
+	.endif
+	.byte 0x0f, 0xc7
+	MODRM 0xc0 rdpid_opd 0x7
+.endm
 #endif
 
 #endif
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 07/18] x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (5 preceding siblings ...)
  2020-05-11  4:52 ` [PATCH v12 06/18] x86/entry/64: Introduce the FIND_PERCPU_BASE macro Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 08/18] x86/entry/64: Document GSBASE handling in the paranoid path Sasha Levin
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin, Tom Lendacky, Vegard Nossum

From: "Chang S. Bae" <chang.seok.bae@intel.com>

Without FSGSBASE, user space cannot change GS base other than through a
PRCTL. The kernel enforces that the user space GS base value is positive
as negative values are used for detecting the kernel space GS base value
in the paranoid entry code.

If FSGSBASE is enabled, user space can set arbitrary GS base values without
kernel intervention, including negative ones, which breaks the paranoid
entry assumptions.

To avoid this, paranoid entry needs to unconditionally save the current
GS base value independent of the interrupted context, retrieve and write
the kernel GS base and unconditionally restore the saved value on exit.
The restore happens either in paranoid exit or in the special exit path of
the NMI low level code.

All other entry code paths which use unconditional SWAPGS are not affected
as they do not depend on the actual content.

The new logic for paranoid entry, when FSGSBASE is enabled, removes SWAPGS
and replaces with unconditional WRGSBASE. Hence no fences are needed.

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
---
 arch/x86/entry/calling.h  |  6 +++
 arch/x86/entry/entry_64.S | 78 ++++++++++++++++++++++++++++++++++-----
 2 files changed, 75 insertions(+), 9 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 29982fe140541..6dc2702a939c7 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -342,6 +342,12 @@ For 32-bit we have the following conventions - kernel is built with
 #endif
 .endm
 
+.macro SAVE_AND_SET_GSBASE scratch_reg:req save_reg:req
+	rdgsbase \save_reg
+	GET_PERCPU_BASE \scratch_reg
+	wrgsbase \scratch_reg
+.endm
+
 #endif /* CONFIG_X86_64 */
 
 .macro STACKLEAK_ERASE
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3ac1313724eaa..c2c4e063c406d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -38,6 +38,7 @@
 #include <asm/export.h>
 #include <asm/frame.h>
 #include <asm/nospec-branch.h>
+#include <asm/fsgsbase.h>
 #include <linux/err.h>
 
 #include "calling.h"
@@ -1211,9 +1212,14 @@ idtentry machine_check		do_mce			has_error_code=0	paranoid=1
 #endif
 
 /*
- * Save all registers in pt_regs, and switch gs if needed.
- * Use slow, but surefire "are we in kernel?" check.
- * Return: ebx=0: need swapgs on exit, ebx=1: otherwise
+ * Save all registers in pt_regs. Return GS base related information
+ * in EBX depending on the availability of the FSGSBASE instructions:
+ *
+ * FSGSBASE	R/EBX
+ *     N        0 -> SWAPGS on exit
+ *              1 -> no SWAPGS on exit
+ *
+ *     Y        GS base value at entry, must be restored in paranoid_exit
  */
 SYM_CODE_START_LOCAL(paranoid_entry)
 	UNWIND_HINT_FUNC
@@ -1238,7 +1244,29 @@ SYM_CODE_START_LOCAL(paranoid_entry)
 	 */
 	SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
 
-	/* EBX = 1 -> kernel GSBASE active, no restore required */
+	/*
+	 * Handling GS base depends on the availability of FSGSBASE.
+	 *
+	 * Without FSGSBASE the kernel enforces that negative GS base
+	 * values indicate kernel GS base. With FSGSBASE no assumptions
+	 * can be made about the GS base value when entering from user
+	 * space.
+	*/
+	ALTERNATIVE "jmp .Lparanoid_entry_checkgs", "", X86_FEATURE_FSGSBASE
+
+	/*
+	 * Read the current GS base and store it in %rbx unconditionally,
+	 * retrieve and set the current CPUs kernel GS base. The stored value
+	 * has to be restored in paranoid_exit unconditionally.
+	 *
+	 * This unconditional write of GS base ensures no subsequent load
+	 * based on a mispredicted GS base.
+	 */
+	SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx
+	ret
+
+.Lparanoid_entry_checkgs:
+	/* EBX = 1 -> kernel GS base active, no restore required */
 	movl	$1, %ebx
 	/*
 	 * The kernel-enforced convention is a negative GS base indicates
@@ -1265,10 +1293,17 @@ SYM_CODE_END(paranoid_entry)
  *
  * We may be returning to very strange contexts (e.g. very early
  * in syscall entry), so checking for preemption here would
- * be complicated.  Fortunately, we there's no good reason
- * to try to handle preemption here.
+ * be complicated.  Fortunately, there's no good reason to try
+ * to handle preemption here.
+ *
+ * R/EBX contains the GS base related information depending on the
+ * availability of the FSGSBASE instructions:
+ *
+ * FSGSBASE	R/EBX
+ *     N        0 -> SWAPGS on exit
+ *              1 -> no SWAPGS on exit
  *
- * On entry, ebx is "no swapgs" flag (1: don't need swapgs, 0: need it)
+ *     Y        User space GS base, must be restored unconditionally
  */
 SYM_CODE_START_LOCAL(paranoid_exit)
 	UNWIND_HINT_REGS
@@ -1285,7 +1320,15 @@ SYM_CODE_START_LOCAL(paranoid_exit)
 	TRACE_IRQS_OFF_DEBUG
 	RESTORE_CR3	scratch_reg=%rax save_reg=%r14
 
-	/* If EBX is 0, SWAPGS is required */
+	/* Handle the three GS base cases */
+	ALTERNATIVE "jmp .Lparanoid_exit_checkgs", "", X86_FEATURE_FSGSBASE
+
+	/* With FSGSBASE enabled, unconditionally resotre GS base */
+	wrgsbase	%rbx
+	jmp	restore_regs_and_return_to_kernel
+
+.Lparanoid_exit_checkgs:
+	/* On non-FSGSBASE systems, conditionally do SWAPGS */
 	testl	%ebx, %ebx
 	jnz	restore_regs_and_return_to_kernel
 
@@ -1699,10 +1742,27 @@ end_repeat_nmi:
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
 
-	testl	%ebx, %ebx			/* swapgs needed? */
+	/*
+	 * The above invocation of paranoid_entry stored the GS base
+	 * related information in R/EBX depending on the availability
+	 * of FSGSBASE.
+	 *
+	 * If FSGSBASE is enabled, restore the saved GS base value
+	 * unconditionally, otherwise take the conditional SWAPGS path.
+	 */
+	ALTERNATIVE "jmp nmi_no_fsgsbase", "", X86_FEATURE_FSGSBASE
+
+	wrgsbase	%rbx
+	jmp	nmi_restore
+
+nmi_no_fsgsbase:
+	/* EBX == 0 -> invoke SWAPGS */
+	testl	%ebx, %ebx
 	jnz	nmi_restore
+
 nmi_swapgs:
 	SWAPGS_UNSAFE_STACK
+
 nmi_restore:
 	POP_REGS
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 08/18] x86/entry/64: Document GSBASE handling in the paranoid path
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (6 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 07/18] x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 09/18] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Sasha Levin
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: "Chang S. Bae" <chang.seok.bae@intel.com>

On FSGSBASE systems, the way to handle GS base in the paranoid path is
different from the existing SWAPGS-based entry/exit path handling. Document
the reason and what has to be done for FSGSBASE enabled systems.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 Documentation/x86/entry_64.rst | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/Documentation/x86/entry_64.rst b/Documentation/x86/entry_64.rst
index a48b3f6ebbe87..0499a40723af3 100644
--- a/Documentation/x86/entry_64.rst
+++ b/Documentation/x86/entry_64.rst
@@ -108,3 +108,12 @@ We try to only use IST entries and the paranoid entry code for vectors
 that absolutely need the more expensive check for the GS base - and we
 generate all 'normal' entry points with the regular (faster) paranoid=0
 variant.
+
+On FSGSBASE systems, however, user space can set GS without kernel
+interaction. It means the value of GS base itself does not imply anything,
+whether a kernel value or a user space value. So, there is no longer a safe
+way to check whether the exception is entering from user mode or kernel
+mode in the paranoid entry code path. So the GS base value needs to be read
+out, saved and the kernel GS base value written. On exit, the saved GS base
+value needs to be restored unconditionally. The non-paranoid entry/exit
+code still uses SWAPGS unconditionally as the state is known.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 09/18] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (7 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 08/18] x86/entry/64: Document GSBASE handling in the paranoid path Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions Sasha Levin
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: Andi Kleen <ak@linux.intel.com>

[ luto: Rename the variables from FS and GS to FSBASE and GSBASE and
  make <asm/fsgsbase.h> safe to include on 32-bit kernels. ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/include/asm/fsgsbase.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
index bca4c743de77c..fdd1177499b40 100644
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -19,6 +19,36 @@ extern unsigned long x86_gsbase_read_task(struct task_struct *task);
 extern void x86_fsbase_write_task(struct task_struct *task, unsigned long fsbase);
 extern void x86_gsbase_write_task(struct task_struct *task, unsigned long gsbase);
 
+/* Must be protected by X86_FEATURE_FSGSBASE check. */
+
+static __always_inline unsigned long rdfsbase(void)
+{
+	unsigned long fsbase;
+
+	asm volatile("rdfsbase %0" : "=r" (fsbase) :: "memory");
+
+	return fsbase;
+}
+
+static __always_inline unsigned long rdgsbase(void)
+{
+	unsigned long gsbase;
+
+	asm volatile("rdgsbase %0" : "=r" (gsbase) :: "memory");
+
+	return gsbase;
+}
+
+static __always_inline void wrfsbase(unsigned long fsbase)
+{
+	asm volatile("wrfsbase %0" :: "r" (fsbase) : "memory");
+}
+
+static __always_inline void wrgsbase(unsigned long gsbase)
+{
+	asm volatile("wrgsbase %0" :: "r" (gsbase) : "memory");
+}
+
 /* Helper functions for reading/writing FS/GS base */
 
 static inline unsigned long x86_fsbase_read_cpu(void)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (8 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 09/18] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-18 18:20   ` Thomas Gleixner
  2020-05-11  4:53 ` [PATCH v12 11/18] x86/fsgsbase/64: Use FSGSBASE in switch_to() if available Sasha Levin
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin, Andrew Cooper

From: "Chang S. Bae" <chang.seok.bae@intel.com>

Add CPU feature conditional FS/GS base access to the relevant helper
functions. That allows accelerating certain FS/GS base operations in
subsequent changes.

Note, that while possible, the user space entry/exit GS base operations are
not going to use the new FSGSBASE instructions. The reason is that it would
require additional storage for the user space value which adds more
complexity to the low level code and experiments have shown marginal
benefit. This may be revisited later but for now the SWAPGS based handling
in the entry code is preserved except for the paranoid entry/exit code.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 arch/x86/include/asm/fsgsbase.h | 27 +++++++--------
 arch/x86/kernel/process_64.c    | 58 +++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
index fdd1177499b40..aefd53767a5d4 100644
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -49,35 +49,32 @@ static __always_inline void wrgsbase(unsigned long gsbase)
 	asm volatile("wrgsbase %0" :: "r" (gsbase) : "memory");
 }
 
+#include <asm/cpufeature.h>
+
 /* Helper functions for reading/writing FS/GS base */
 
 static inline unsigned long x86_fsbase_read_cpu(void)
 {
 	unsigned long fsbase;
 
-	rdmsrl(MSR_FS_BASE, fsbase);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE))
+		fsbase = rdfsbase();
+	else
+		rdmsrl(MSR_FS_BASE, fsbase);
 
 	return fsbase;
 }
 
-static inline unsigned long x86_gsbase_read_cpu_inactive(void)
-{
-	unsigned long gsbase;
-
-	rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
-
-	return gsbase;
-}
-
 static inline void x86_fsbase_write_cpu(unsigned long fsbase)
 {
-	wrmsrl(MSR_FS_BASE, fsbase);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE))
+		wrfsbase(fsbase);
+	else
+		wrmsrl(MSR_FS_BASE, fsbase);
 }
 
-static inline void x86_gsbase_write_cpu_inactive(unsigned long gsbase)
-{
-	wrmsrl(MSR_KERNEL_GS_BASE, gsbase);
-}
+extern unsigned long x86_gsbase_read_cpu_inactive(void);
+extern void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
 
 #endif /* CONFIG_X86_64 */
 
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 5ef9d8f25b0e8..aaa65f284b9b9 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -328,6 +328,64 @@ static unsigned long x86_fsgsbase_read_task(struct task_struct *task,
 	return base;
 }
 
+unsigned long x86_gsbase_read_cpu_inactive(void)
+{
+	unsigned long gsbase;
+
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		bool need_restore = false;
+		unsigned long flags;
+
+		/*
+		 * We read the inactive GS base value by swapping
+		 * to make it the active one. But we cannot allow
+		 * an interrupt while we switch to and from.
+		 */
+		if (!irqs_disabled()) {
+			local_irq_save(flags);
+			need_restore = true;
+		}
+
+		native_swapgs();
+		gsbase = rdgsbase();
+		native_swapgs();
+
+		if (need_restore)
+			local_irq_restore(flags);
+	} else {
+		rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
+	}
+
+	return gsbase;
+}
+
+void x86_gsbase_write_cpu_inactive(unsigned long gsbase)
+{
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		bool need_restore = false;
+		unsigned long flags;
+
+		/*
+		 * We write the inactive GS base value by swapping
+		 * to make it the active one. But we cannot allow
+		 * an interrupt while we switch to and from.
+		 */
+		if (!irqs_disabled()) {
+			local_irq_save(flags);
+			need_restore = true;
+		}
+
+		native_swapgs();
+		wrgsbase(gsbase);
+		native_swapgs();
+
+		if (need_restore)
+			local_irq_restore(flags);
+	} else {
+		wrmsrl(MSR_KERNEL_GS_BASE, gsbase);
+	}
+}
+
 unsigned long x86_fsbase_read_task(struct task_struct *task)
 {
 	unsigned long fsbase;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 11/18] x86/fsgsbase/64: Use FSGSBASE in switch_to() if available
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (9 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 12/18] x86/fsgsbase/64: move save_fsgs to header file Sasha Levin
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: Andy Lutomirski <luto@kernel.org>

With the new FSGSBASE instructions, FS/GS base can be efficiently read
and written in __switch_to(). Use that capability to preserve the full
state.

This will enable user code to do whatever it wants with the new
instructions without any kernel-induced gotchas.  (There can still be
architectural gotchas: movl %gs,%eax; movl %eax,%gs may change GS base
if WRGSBASE was used, but users are expected to read the CPU manual
before doing things like that.)

This is a considerable speedup. It seems to save about 100 cycles per
context switch compared to the baseline 4.6-rc1 behavior on a Skylake
laptop.

[ chang: 5~10% performance improvements were seen by a context switch
  benchmark that ran threads with different FS/GS base values (to the
  baseline 4.16). ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/process_64.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index aaa65f284b9b9..e066750be89a0 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -199,8 +199,18 @@ static __always_inline void save_fsgs(struct task_struct *task)
 {
 	savesegment(fs, task->thread.fsindex);
 	savesegment(gs, task->thread.gsindex);
-	save_base_legacy(task, task->thread.fsindex, FS);
-	save_base_legacy(task, task->thread.gsindex, GS);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		/*
+		 * If FSGSBASE is enabled, we can't make any useful guesses
+		 * about the base, and user code expects us to save the current
+		 * value.  Fortunately, reading the base directly is efficient.
+		 */
+		task->thread.fsbase = rdfsbase();
+		task->thread.gsbase = x86_gsbase_read_cpu_inactive();
+	} else {
+		save_base_legacy(task, task->thread.fsindex, FS);
+		save_base_legacy(task, task->thread.gsindex, GS);
+	}
 }
 
 #if IS_ENABLED(CONFIG_KVM)
@@ -279,10 +289,22 @@ static __always_inline void load_seg_legacy(unsigned short prev_index,
 static __always_inline void x86_fsgsbase_load(struct thread_struct *prev,
 					      struct thread_struct *next)
 {
-	load_seg_legacy(prev->fsindex, prev->fsbase,
-			next->fsindex, next->fsbase, FS);
-	load_seg_legacy(prev->gsindex, prev->gsbase,
-			next->gsindex, next->gsbase, GS);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		/* Update the FS and GS selectors if they could have changed. */
+		if (unlikely(prev->fsindex || next->fsindex))
+			loadseg(FS, next->fsindex);
+		if (unlikely(prev->gsindex || next->gsindex))
+			loadseg(GS, next->gsindex);
+
+		/* Update the bases. */
+		wrfsbase(next->fsbase);
+		x86_gsbase_write_cpu_inactive(next->gsbase);
+	} else {
+		load_seg_legacy(prev->fsindex, prev->fsbase,
+				next->fsindex, next->fsbase, FS);
+		load_seg_legacy(prev->gsindex, prev->gsbase,
+				next->gsindex, next->gsbase, GS);
+	}
 }
 
 static unsigned long x86_fsgsbase_read_task(struct task_struct *task,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 12/18] x86/fsgsbase/64: move save_fsgs to header file
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (10 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 11/18] x86/fsgsbase/64: Use FSGSBASE in switch_to() if available Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 13/18] x86/fsgsbase/64: Use FSGSBASE instructions on thread copy and ptrace Sasha Levin
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

Given copy_thread_tls() is now shared between 32 and 64 bit and we need
to use save_fsgs() there, move it to a header file.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kernel/process.h    | 72 ++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/process_64.c | 68 ----------------------------------
 2 files changed, 72 insertions(+), 68 deletions(-)

diff --git a/arch/x86/kernel/process.h b/arch/x86/kernel/process.h
index 1d0797b2338a2..2360d340cbf00 100644
--- a/arch/x86/kernel/process.h
+++ b/arch/x86/kernel/process.h
@@ -37,3 +37,75 @@ static inline void switch_to_extra(struct task_struct *prev,
 		     prev_tif & _TIF_WORK_CTXSW_PREV))
 		__switch_to_xtra(prev, next);
 }
+
+#ifdef CONFIG_X86_64
+
+enum which_selector {
+	FS,
+	GS
+};
+
+/*
+ * Saves the FS or GS base for an outgoing thread if FSGSBASE extensions are
+ * not available.  The goal is to be reasonably fast on non-FSGSBASE systems.
+ * It's forcibly inlined because it'll generate better code and this function
+ * is hot.
+ */
+static __always_inline void save_base_legacy(struct task_struct *prev_p,
+                                             unsigned short selector,
+                                             enum which_selector which)
+{
+	if (likely(selector == 0)) {
+		/*
+		 * On Intel (without X86_BUG_NULL_SEG), the segment base could
+		 * be the pre-existing saved base or it could be zero.  On AMD
+		 * (with X86_BUG_NULL_SEG), the segment base could be almost
+		 * anything.
+		 *
+		 * This branch is very hot (it's hit twice on almost every
+		 * context switch between 64-bit programs), and avoiding
+		 * the RDMSR helps a lot, so we just assume that whatever
+		 * value is already saved is correct.  This matches historical
+		 * Linux behavior, so it won't break existing applications.
+		 *
+		 * To avoid leaking state, on non-X86_BUG_NULL_SEG CPUs, if we
+		 * report that the base is zero, it needs to actually be zero:
+		 * see the corresponding logic in load_seg_legacy.
+		 */
+	} else {
+		/*
+		 * If the selector is 1, 2, or 3, then the base is zero on
+		 * !X86_BUG_NULL_SEG CPUs and could be anything on
+		 * X86_BUG_NULL_SEG CPUs.  In the latter case, Linux
+		 * has never attempted to preserve the base across context
+		 * switches.
+		 *
+		 * If selector > 3, then it refers to a real segment, and
+		 * saving the base isn't necessary.
+		 */
+		if (which == FS)
+			prev_p->thread.fsbase = 0;
+		else
+			prev_p->thread.gsbase = 0;
+	}
+}
+
+static __always_inline void save_fsgs(struct task_struct *task)
+{
+	savesegment(fs, task->thread.fsindex);
+	savesegment(gs, task->thread.gsindex);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		/*
+		 * If FSGSBASE is enabled, we can't make any useful guesses
+		 * about the base, and user code expects us to save the current
+		 * value.  Fortunately, reading the base directly is efficient.
+		 */
+		task->thread.fsbase = rdfsbase();
+		task->thread.gsbase = x86_gsbase_read_cpu_inactive();
+	} else {
+		save_base_legacy(task, task->thread.fsindex, FS);
+		save_base_legacy(task, task->thread.gsindex, GS);
+	}
+}
+
+#endif
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e066750be89a0..4be88124d81ea 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -145,74 +145,6 @@ void release_thread(struct task_struct *dead_task)
 	WARN_ON(dead_task->mm);
 }
 
-enum which_selector {
-	FS,
-	GS
-};
-
-/*
- * Saves the FS or GS base for an outgoing thread if FSGSBASE extensions are
- * not available.  The goal is to be reasonably fast on non-FSGSBASE systems.
- * It's forcibly inlined because it'll generate better code and this function
- * is hot.
- */
-static __always_inline void save_base_legacy(struct task_struct *prev_p,
-					     unsigned short selector,
-					     enum which_selector which)
-{
-	if (likely(selector == 0)) {
-		/*
-		 * On Intel (without X86_BUG_NULL_SEG), the segment base could
-		 * be the pre-existing saved base or it could be zero.  On AMD
-		 * (with X86_BUG_NULL_SEG), the segment base could be almost
-		 * anything.
-		 *
-		 * This branch is very hot (it's hit twice on almost every
-		 * context switch between 64-bit programs), and avoiding
-		 * the RDMSR helps a lot, so we just assume that whatever
-		 * value is already saved is correct.  This matches historical
-		 * Linux behavior, so it won't break existing applications.
-		 *
-		 * To avoid leaking state, on non-X86_BUG_NULL_SEG CPUs, if we
-		 * report that the base is zero, it needs to actually be zero:
-		 * see the corresponding logic in load_seg_legacy.
-		 */
-	} else {
-		/*
-		 * If the selector is 1, 2, or 3, then the base is zero on
-		 * !X86_BUG_NULL_SEG CPUs and could be anything on
-		 * X86_BUG_NULL_SEG CPUs.  In the latter case, Linux
-		 * has never attempted to preserve the base across context
-		 * switches.
-		 *
-		 * If selector > 3, then it refers to a real segment, and
-		 * saving the base isn't necessary.
-		 */
-		if (which == FS)
-			prev_p->thread.fsbase = 0;
-		else
-			prev_p->thread.gsbase = 0;
-	}
-}
-
-static __always_inline void save_fsgs(struct task_struct *task)
-{
-	savesegment(fs, task->thread.fsindex);
-	savesegment(gs, task->thread.gsindex);
-	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
-		/*
-		 * If FSGSBASE is enabled, we can't make any useful guesses
-		 * about the base, and user code expects us to save the current
-		 * value.  Fortunately, reading the base directly is efficient.
-		 */
-		task->thread.fsbase = rdfsbase();
-		task->thread.gsbase = x86_gsbase_read_cpu_inactive();
-	} else {
-		save_base_legacy(task, task->thread.fsindex, FS);
-		save_base_legacy(task, task->thread.gsindex, GS);
-	}
-}
-
 #if IS_ENABLED(CONFIG_KVM)
 /*
  * While a process is running,current->thread.fsbase and current->thread.gsbase
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 13/18] x86/fsgsbase/64: Use FSGSBASE instructions on thread copy and ptrace
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (11 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 12/18] x86/fsgsbase/64: move save_fsgs to header file Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 14/18] x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation Sasha Levin
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: "Chang S. Bae" <chang.seok.bae@intel.com>

When FSGSBASE is enabled, copying threads and reading FS/GS base using
ptrace must read the actual values.

When copying a thread, use fsgs_save() and copy the saved values. For
ptrace, the bases must be read from memory regardless of the selector
if FSGSBASE is enabled.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/process.c    | 9 +++++----
 arch/x86/kernel/process_64.c | 6 ++++--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 9da70b279dad8..31dd24f9c8d8e 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -140,10 +140,11 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
 
 #ifdef CONFIG_X86_64
-	savesegment(gs, p->thread.gsindex);
-	p->thread.gsbase = p->thread.gsindex ? 0 : current->thread.gsbase;
-	savesegment(fs, p->thread.fsindex);
-	p->thread.fsbase = p->thread.fsindex ? 0 : current->thread.fsbase;
+	save_fsgs(current);
+	p->thread.fsindex = current->thread.fsindex;
+	p->thread.fsbase = current->thread.fsbase;
+	p->thread.gsindex = current->thread.gsindex;
+	p->thread.gsbase = current->thread.gsbase;
 	savesegment(es, p->thread.es);
 	savesegment(ds, p->thread.ds);
 #else
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4be88124d81ea..57cdbbb0381ac 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -346,7 +346,8 @@ unsigned long x86_fsbase_read_task(struct task_struct *task)
 
 	if (task == current)
 		fsbase = x86_fsbase_read_cpu();
-	else if (task->thread.fsindex == 0)
+	else if (static_cpu_has(X86_FEATURE_FSGSBASE) ||
+		 (task->thread.fsindex == 0))
 		fsbase = task->thread.fsbase;
 	else
 		fsbase = x86_fsgsbase_read_task(task, task->thread.fsindex);
@@ -360,7 +361,8 @@ unsigned long x86_gsbase_read_task(struct task_struct *task)
 
 	if (task == current)
 		gsbase = x86_gsbase_read_cpu_inactive();
-	else if (task->thread.gsindex == 0)
+	else if (static_cpu_has(X86_FEATURE_FSGSBASE) ||
+		 (task->thread.gsindex == 0))
 		gsbase = task->thread.gsbase;
 	else
 		gsbase = x86_fsgsbase_read_task(task, task->thread.gsindex);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 14/18] x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (12 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 13/18] x86/fsgsbase/64: Use FSGSBASE instructions on thread copy and ptrace Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 15/18] selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE Sasha Levin
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: Tony Luck <tony.luck@intel.com>

Before enabling FSGSBASE the kernel could safely assume that the content
of GS base was a user address. Thus any speculative access as the result
of a mispredicted branch controlling the execution of SWAPGS would be to
a user address. So systems with speculation-proof SMAP did not need to
add additional LFENCE instructions to mitigate.

With FSGSBASE enabled a hostile user can set GS base to a kernel address.
So they can make the kernel speculatively access data they wish to leak
via a side channel. This means that SMAP provides no protection.

Add FSGSBASE as an additional condition to enable the fence-based SWAPGS
mitigation.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index ed54b3b21c396..487603ea51cd1 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -450,14 +450,12 @@ static void __init spectre_v1_select_mitigation(void)
 		 * If FSGSBASE is enabled, the user can put a kernel address in
 		 * GS, in which case SMAP provides no protection.
 		 *
-		 * [ NOTE: Don't check for X86_FEATURE_FSGSBASE until the
-		 *	   FSGSBASE enablement patches have been merged. ]
-		 *
 		 * If FSGSBASE is disabled, the user can only put a user space
 		 * address in GS.  That makes an attack harder, but still
 		 * possible if there's no SMAP protection.
 		 */
-		if (!smap_works_speculatively()) {
+		if (boot_cpu_has(X86_FEATURE_FSGSBASE) ||
+		    !smap_works_speculatively()) {
 			/*
 			 * Mitigation can be provided from SWAPGS itself or
 			 * PTI as the CR3 write in the Meltdown mitigation
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 15/18] selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (13 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 14/18] x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 16/18] x86/fsgsbase/64: Enable FSGSBASE on 64bit by default and add a chicken bit Sasha Levin
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: "Chang S. Bae" <chang.seok.bae@intel.com>

This validates that GS selector and base are independently preserved in
ptrace commands.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 tools/testing/selftests/x86/fsgsbase.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
index 950a48b2e3662..9a4349813a30a 100644
--- a/tools/testing/selftests/x86/fsgsbase.c
+++ b/tools/testing/selftests/x86/fsgsbase.c
@@ -465,7 +465,7 @@ static void test_ptrace_write_gsbase(void)
 	wait(&status);
 
 	if (WSTOPSIG(status) == SIGTRAP) {
-		unsigned long gs;
+		unsigned long gs, base;
 		unsigned long gs_offset = USER_REGS_OFFSET(gs);
 		unsigned long base_offset = USER_REGS_OFFSET(gs_base);
 
@@ -481,6 +481,7 @@ static void test_ptrace_write_gsbase(void)
 			err(1, "PTRACE_POKEUSER");
 
 		gs = ptrace(PTRACE_PEEKUSER, child, gs_offset, NULL);
+		base = ptrace(PTRACE_PEEKUSER, child, base_offset, NULL);
 
 		/*
 		 * In a non-FSGSBASE system, the nonzero selector will load
@@ -501,8 +502,14 @@ static void test_ptrace_write_gsbase(void)
 			 */
 			if (gs == 0)
 				printf("\tNote: this is expected behavior on older kernels.\n");
+		} else if (have_fsgsbase && (base != 0xFF)) {
+			nerrs++;
+			printf("[FAIL]\tGSBASE changed to %lx\n", base);
 		} else {
-			printf("[OK]\tGS remained 0x%hx\n", *shared_scratch);
+			printf("[OK]\tGS remained 0x%hx", *shared_scratch);
+			if (have_fsgsbase)
+				printf(" and GSBASE changed to 0xFF");
+			printf("\n");
 		}
 	}
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 16/18] x86/fsgsbase/64: Enable FSGSBASE on 64bit by default and add a chicken bit
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (14 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 15/18] selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 17/18] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 Sasha Levin
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: Andy Lutomirski <luto@kernel.org>

Now that FSGSBASE is fully supported, remove unsafe_fsgsbase, enable
FSGSBASE by default, and add nofsgsbase to disable it.

While this changes userspace visible ABI, we could not find a project
that would be affected by this. Few projects were contacted for input
and ack:

- 5-level EPT: http://lkml.kernel.org/r/9ddf602b-6c8b-8c1e-ab46-07ed12366593@redhat.com
- rr: https://mail.mozilla.org/pipermail/rr-dev/2018-March/000616.html
- CRIU: https://lists.openvz.org/pipermail/criu/2018-March/040654.html

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 .../admin-guide/kernel-parameters.txt         |  3 +-
 arch/x86/kernel/cpu/common.c                  | 32 ++++++++-----------
 2 files changed, 15 insertions(+), 20 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index af3aaade195b8..1924845c879c2 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3033,8 +3033,7 @@
 	no5lvl		[X86-64] Disable 5-level paging mode. Forces
 			kernel to use 4-level paging instead.
 
-	unsafe_fsgsbase	[X86] Allow FSGSBASE instructions.  This will be
-			replaced with a nofsgsbase flag.
+	nofsgsbase	[X86] Disables FSGSBASE instructions.
 
 	no_console_suspend
 			[HW] Never suspend the console
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4224760c74e27..0d480cbadc7dc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -418,21 +418,21 @@ static void __init setup_cr_pinning(void)
 	static_key_enable(&cr_pinning.key);
 }
 
-/*
- * Temporary hack: FSGSBASE is unsafe until a few kernel code paths are
- * updated. This allows us to get the kernel ready incrementally.
- *
- * Once all the pieces are in place, these will go away and be replaced with
- * a nofsgsbase chicken flag.
- */
-static bool unsafe_fsgsbase;
-
-static __init int setup_unsafe_fsgsbase(char *arg)
+static __init int x86_nofsgsbase_setup(char *arg)
 {
-	unsafe_fsgsbase = true;
+	/* Require an exact match without trailing characters. */
+	if (strlen(arg))
+		return 0;
+
+	/* Do not emit a message if the feature is not present. */
+	if (!boot_cpu_has(X86_FEATURE_FSGSBASE))
+		return 1;
+
+	setup_clear_cpu_cap(X86_FEATURE_FSGSBASE);
+	pr_info("FSGSBASE disabled via kernel command line\n");
 	return 1;
 }
-__setup("unsafe_fsgsbase", setup_unsafe_fsgsbase);
+__setup("nofsgsbase", x86_nofsgsbase_setup);
 
 /*
  * Protection Keys are not available in 32-bit mode.
@@ -1495,12 +1495,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	setup_umip(c);
 
 	/* Enable FSGSBASE instructions if available. */
-	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
-		if (unsafe_fsgsbase)
-			cr4_set_bits(X86_CR4_FSGSBASE);
-		else
-			clear_cpu_cap(c, X86_FEATURE_FSGSBASE);
-	}
+	if (cpu_has(c, X86_FEATURE_FSGSBASE))
+		cr4_set_bits(X86_CR4_FSGSBASE);
 
 	/*
 	 * The vendor-specific functions might have changed features.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 17/18] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (15 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 16/18] x86/fsgsbase/64: Enable FSGSBASE on 64bit by default and add a chicken bit Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-11  4:53 ` [PATCH v12 18/18] Documentation/x86/64: Add documentation for GS/FS addressing mode Sasha Levin
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin

From: Andi Kleen <ak@linux.intel.com>

The kernel needs to explicitly enable FSGSBASE. So, the application needs
to know if it can safely use these instructions. Just looking at the CPUID
bit is not enough because it may be running in a kernel that does not
enable the instructions.

One way for the application would be to just try and catch the SIGILL.
But that is difficult to do in libraries which may not want to overwrite
the signal handlers of the main application.

Enumerate the enabled FSGSBASE capability in bit 1 of AT_HWCAP2 in the ELF
aux vector. AT_HWCAP2 is already used by PPC for similar purposes.

The application can access it open coded or by using the getauxval()
function in newer versions of glibc.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/include/uapi/asm/hwcap2.h | 3 +++
 arch/x86/kernel/cpu/common.c       | 4 +++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/hwcap2.h b/arch/x86/include/uapi/asm/hwcap2.h
index 8b2effe6efb82..5fdfcb47000f9 100644
--- a/arch/x86/include/uapi/asm/hwcap2.h
+++ b/arch/x86/include/uapi/asm/hwcap2.h
@@ -5,4 +5,7 @@
 /* MONITOR/MWAIT enabled in Ring 3 */
 #define HWCAP2_RING3MWAIT		(1 << 0)
 
+/* Kernel allows FSGSBASE instructions available in Ring 3 */
+#define HWCAP2_FSGSBASE			BIT(1)
+
 #endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0d480cbadc7dc..b5a086ea34258 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1495,8 +1495,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	setup_umip(c);
 
 	/* Enable FSGSBASE instructions if available. */
-	if (cpu_has(c, X86_FEATURE_FSGSBASE))
+	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
 		cr4_set_bits(X86_CR4_FSGSBASE);
+		elf_hwcap2 |= HWCAP2_FSGSBASE;
+	}
 
 	/*
 	 * The vendor-specific functions might have changed features.
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH v12 18/18] Documentation/x86/64: Add documentation for GS/FS addressing mode
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (16 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 17/18] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 Sasha Levin
@ 2020-05-11  4:53 ` Sasha Levin
  2020-05-15  9:24 ` [PATCH v12 00/18] Enable FSGSBASE instructions Jarkko Sakkinen
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-11  4:53 UTC (permalink / raw)
  To: linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin, Randy Dunlap, Jonathan Corbet

From: Thomas Gleixner <tglx@linutronix.de>

Explain how the GS/FS based addressing can be utilized in user space
applications along with the differences between the generic prctl() based
GS/FS base control and the FSGSBASE version available on newer CPUs.

Originally-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Jonathan Corbet <corbet@lwn.net>
---
 Documentation/x86/x86_64/fsgs.rst  | 199 +++++++++++++++++++++++++++++
 Documentation/x86/x86_64/index.rst |   1 +
 2 files changed, 200 insertions(+)
 create mode 100644 Documentation/x86/x86_64/fsgs.rst

diff --git a/Documentation/x86/x86_64/fsgs.rst b/Documentation/x86/x86_64/fsgs.rst
new file mode 100644
index 0000000000000..50960e09e1f66
--- /dev/null
+++ b/Documentation/x86/x86_64/fsgs.rst
@@ -0,0 +1,199 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Using FS and GS segments in user space applications
+===================================================
+
+The x86 architecture supports segmentation. Instructions which access
+memory can use segment register based addressing mode. The following
+notation is used to address a byte within a segment:
+
+  Segment-register:Byte-address
+
+The segment base address is added to the Byte-address to compute the
+resulting virtual address which is accessed. This allows to access multiple
+instances of data with the identical Byte-address, i.e. the same code. The
+selection of a particular instance is purely based on the base-address in
+the segment register.
+
+In 32-bit mode the CPU provides 6 segments, which also support segment
+limits. The limits can be used to enforce address space protections.
+
+In 64-bit mode the CS/SS/DS/ES segments are ignored and the base address is
+always 0 to provide a full 64bit address space. The FS and GS segments are
+still functional in 64-bit mode.
+
+Common FS and GS usage
+------------------------------
+
+The FS segment is commonly used to address Thread Local Storage (TLS). FS
+is usually managed by runtime code or a threading library. Variables
+declared with the '__thread' storage class specifier are instantiated per
+thread and the compiler emits the FS: address prefix for accesses to these
+variables. Each thread has its own FS base address so common code can be
+used without complex address offset calculations to access the per thread
+instances. Applications should not use FS for other purposes when they use
+runtimes or threading libraries which manage the per thread FS.
+
+The GS segment has no common use and can be used freely by
+applications. GCC and Clang support GS based addressing via address space
+identifiers.
+
+Reading and writing the FS/GS base address
+------------------------------------------
+
+There exist two mechanisms to read and write the FS/GS base address:
+
+ - the arch_prctl() system call
+
+ - the FSGSBASE instruction family
+
+Accessing FS/GS base with arch_prctl()
+--------------------------------------
+
+ The arch_prctl(2) based mechanism is available on all 64-bit CPUs and all
+ kernel versions.
+
+ Reading the base:
+
+   arch_prctl(ARCH_GET_FS, &fsbase);
+   arch_prctl(ARCH_GET_GS, &gsbase);
+
+ Writing the base:
+
+   arch_prctl(ARCH_SET_FS, fsbase);
+   arch_prctl(ARCH_SET_GS, gsbase);
+
+ The ARCH_SET_GS prctl may be disabled depending on kernel configuration
+ and security settings.
+
+Accessing FS/GS base with the FSGSBASE instructions
+---------------------------------------------------
+
+ With the Ivy Bridge CPU generation Intel introduced a new set of
+ instructions to access the FS and GS base registers directly from user
+ space. These instructions are also supported on AMD Family 17H CPUs. The
+ following instructions are available:
+
+  =============== ===========================
+  RDFSBASE %reg   Read the FS base register
+  RDGSBASE %reg   Read the GS base register
+  WRFSBASE %reg   Write the FS base register
+  WRGSBASE %reg   Write the GS base register
+  =============== ===========================
+
+ The instructions avoid the overhead of the arch_prctl() syscall and allow
+ more flexible usage of the FS/GS addressing modes in user space
+ applications. This does not prevent conflicts between threading libraries
+ and runtimes which utilize FS and applications which want to use it for
+ their own purpose.
+
+FSGSBASE instructions enablement
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ The instructions are enumerated in CPUID leaf 7, bit 0 of EBX. If
+ available /proc/cpuinfo shows 'fsgsbase' in the flag entry of the CPUs.
+
+ The availability of the instructions does not enable them
+ automatically. The kernel has to enable them explicitly in CR4. The
+ reason for this is that older kernels make assumptions about the values in
+ the GS register and enforce them when GS base is set via
+ arch_prctl(). Allowing user space to write arbitrary values to GS base
+ would violate these assumptions and cause malfunction.
+
+ On kernels which do not enable FSGSBASE the execution of the FSGSBASE
+ instructions will fault with a #UD exception.
+
+ The kernel provides reliable information about the enabled state in the
+ ELF AUX vector. If the HWCAP2_FSGSBASE bit is set in the AUX vector, the
+ kernel has FSGSBASE instructions enabled and applications can use them.
+ The following code example shows how this detection works::
+
+   #include <sys/auxv.h>
+   #include <elf.h>
+
+   /* Will be eventually in asm/hwcap.h */
+   #ifndef HWCAP2_FSGSBASE
+   #define HWCAP2_FSGSBASE        (1 << 1)
+   #endif
+
+   ....
+
+   unsigned val = getauxval(AT_HWCAP2);
+
+   if (val & HWCAP2_FSGSBASE)
+        printf("FSGSBASE enabled\n");
+
+FSGSBASE instructions compiler support
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+GCC version 4.6.4 and newer provide instrinsics for the FSGSBASE
+instructions. Clang 5 supports them as well.
+
+  =================== ===========================
+  _readfsbase_u64()   Read the FS base register
+  _readfsbase_u64()   Read the GS base register
+  _writefsbase_u64()  Write the FS base register
+  _writegsbase_u64()  Write the GS base register
+  =================== ===========================
+
+To utilize these instrinsics <immintrin.h> must be included in the source
+code and the compiler option -mfsgsbase has to be added.
+
+Compiler support for FS/GS based addressing
+-------------------------------------------
+
+GCC version 6 and newer provide support for FS/GS based addressing via
+Named Address Spaces. GCC implements the following address space
+identifiers for x86:
+
+  ========= ====================================
+  __seg_fs  Variable is addressed relative to FS
+  __seg_gs  Variable is addressed relative to GS
+  ========= ====================================
+
+The preprocessor symbols __SEG_FS and __SEG_GS are defined when these
+address spaces are supported. Code which implements fallback modes should
+check whether these symbols are defined. Usage example::
+
+  #ifdef __SEG_GS
+
+  long data0 = 0;
+  long data1 = 1;
+
+  long __seg_gs *ptr;
+
+  /* Check whether FSGSBASE is enabled by the kernel (HWCAP2_FSGSBASE) */
+  ....
+
+  /* Set GS base to point to data0 */
+  _writegsbase_u64(&data0);
+
+  /* Access offset 0 of GS */
+  ptr = 0;
+  printf("data0 = %ld\n", *ptr);
+
+  /* Set GS base to point to data1 */
+  _writegsbase_u64(&data1);
+  /* ptr still addresses offset 0! */
+  printf("data1 = %ld\n", *ptr);
+
+
+Clang does not provide the GCC address space identifiers, but it provides
+address spaces via an attribute based mechanism in Clang 2.6 and newer
+versions:
+
+ ==================================== =====================================
+  __attribute__((address_space(256))  Variable is addressed relative to GS
+  __attribute__((address_space(257))  Variable is addressed relative to FS
+ ==================================== =====================================
+
+FS/GS based addressing with inline assembly
+-------------------------------------------
+
+In case the compiler does not support address spaces, inline assembly can
+be used for FS/GS based addressing mode::
+
+	mov %fs:offset, %reg
+	mov %gs:offset, %reg
+
+	mov %reg, %fs:offset
+	mov %reg, %gs:offset
diff --git a/Documentation/x86/x86_64/index.rst b/Documentation/x86/x86_64/index.rst
index d6eaaa5a35fcd..a56070fc8e77a 100644
--- a/Documentation/x86/x86_64/index.rst
+++ b/Documentation/x86/x86_64/index.rst
@@ -14,3 +14,4 @@ x86_64 Support
    fake-numa-for-cpusets
    cpu-hotplug-spec
    machinecheck
+   fsgs
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (17 preceding siblings ...)
  2020-05-11  4:53 ` [PATCH v12 18/18] Documentation/x86/64: Add documentation for GS/FS addressing mode Sasha Levin
@ 2020-05-15  9:24 ` Jarkko Sakkinen
  2020-05-15 16:40   ` Sasha Levin
  2020-05-18  6:18 ` Christoph Hellwig
  2020-05-18 14:53 ` Thomas Gleixner
  20 siblings, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-15  9:24 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, tglx, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae

On Mon, 2020-05-11 at 00:52 -0400, Sasha Levin wrote:
> Benefits:
> Currently a user process that wishes to read or write the FS/GS base must
> make a system call. But recent X86 processors have added new instructions
> for use in 64-bit mode that allow direct access to the FS and GS segment
> base addresses.  The operating system controls whether applications can
> use these instructions with a %cr4 control bit.
> 
> In addition to benefits to applications, performance improvements to the
> OS context switch code are possible by making use of these instructions. A
> third party reported out promising performance numbers out of their
> initial benchmarking of the previous version of this patch series [9].
> 
> Enablement check:
> The kernel provides information about the enabled state of FSGSBASE to
> applications using the ELF_AUX vector. If the HWCAP2_FSGSBASE bit is set in
> the AUX vector, the kernel has FSGSBASE instructions enabled and
> applications can use them.
> 
> Kernel changes:
> Major changes made in the kernel are in context switch, paranoid path, and
> ptrace. In a context switch, a task's FS/GS base will be secured regardless
> of its selector. In the paranoid path, GS base is unconditionally
> overwritten to the kernel GS base on entry and the original GS base is
> restored on exit. Ptrace includes divergence of FS/GS index and base
> values.
> 
> Security:
> For mitigating the Spectre v1 SWAPGS issue, LFENCE instructions were added
> on most kernel entries. Those patches are dependent on previous behaviors
> that users couldn't load a kernel address into the GS base. These patches
> change that assumption since the user can load any address into GS base.
> The changes to the kernel entry path in this patch series take account of
> the SWAPGS issue.
> 
> Changes from v11:
> 
>  - Rebase to v5.7-rc5, fix 32bit compilation error.
> 
> 
> Andi Kleen (2):
>   x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
>   x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
> 
> Andy Lutomirski (4):
>   x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
>   x86/entry/64: Clean up paranoid exit
>   x86/fsgsbase/64: Use FSGSBASE in switch_to() if available
>   x86/fsgsbase/64: Enable FSGSBASE on 64bit by default and add a chicken
>     bit
> 
> Chang S. Bae (9):
>   x86/ptrace: Prevent ptrace from clearing the FS/GS selector
>   selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base
>     write
>   x86/entry/64: Switch CR3 before SWAPGS in paranoid entry
>   x86/entry/64: Introduce the FIND_PERCPU_BASE macro
>   x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit
>   x86/entry/64: Document GSBASE handling in the paranoid path
>   x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
>   x86/fsgsbase/64: Use FSGSBASE instructions on thread copy and ptrace
>   selftests/x86/fsgsbase: Test ptracer-induced GS base write with
>     FSGSBASE
> 
> Sasha Levin (1):
>   x86/fsgsbase/64: move save_fsgs to header file
> 
> Thomas Gleixner (1):
>   Documentation/x86/64: Add documentation for GS/FS addressing mode
> 
> Tony Luck (1):
>   x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation
> 
>  .../admin-guide/kernel-parameters.txt         |   2 +
>  Documentation/x86/entry_64.rst                |   9 +
>  Documentation/x86/x86_64/fsgs.rst             | 199 ++++++++++++++++++
>  Documentation/x86/x86_64/index.rst            |   1 +
>  arch/x86/entry/calling.h                      |  40 ++++
>  arch/x86/entry/entry_64.S                     | 131 +++++++++---
>  arch/x86/include/asm/fsgsbase.h               |  45 +++-
>  arch/x86/include/asm/inst.h                   |  15 ++
>  arch/x86/include/uapi/asm/hwcap2.h            |   3 +
>  arch/x86/kernel/cpu/bugs.c                    |   6 +-
>  arch/x86/kernel/cpu/common.c                  |  22 ++
>  arch/x86/kernel/process.c                     |   9 +-
>  arch/x86/kernel/process.h                     |  72 +++++++
>  arch/x86/kernel/process_64.c                  | 142 +++++++------
>  arch/x86/kernel/ptrace.c                      |  17 +-
>  tools/testing/selftests/x86/fsgsbase.c        |  24 ++-
>  16 files changed, 608 insertions(+), 129 deletions(-)
>  create mode 100644 Documentation/x86/x86_64/fsgs.rst
> 

Can you put me to the CC-loop for this patches. Some SGX-enabled
frameworks such as Graphene use out-of-tree changes to achieve this.
That's where the interest to possibly test this comes from.

Thanks.

[*] https://github.com/oscarlab/graphene

/Jarkko


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-15  9:24 ` [PATCH v12 00/18] Enable FSGSBASE instructions Jarkko Sakkinen
@ 2020-05-15 16:40   ` Sasha Levin
  2020-05-15 17:55     ` Andi Kleen
                       ` (2 more replies)
  0 siblings, 3 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-15 16:40 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-kernel, tglx, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae

On Fri, May 15, 2020 at 12:24:14PM +0300, Jarkko Sakkinen wrote:
>On Mon, 2020-05-11 at 00:52 -0400, Sasha Levin wrote:
>> Benefits:
>> Currently a user process that wishes to read or write the FS/GS base must
>> make a system call. But recent X86 processors have added new instructions
>> for use in 64-bit mode that allow direct access to the FS and GS segment
>> base addresses.  The operating system controls whether applications can
>> use these instructions with a %cr4 control bit.
>>
>> In addition to benefits to applications, performance improvements to the
>> OS context switch code are possible by making use of these instructions. A
>> third party reported out promising performance numbers out of their
>> initial benchmarking of the previous version of this patch series [9].
>>
>> Enablement check:
>> The kernel provides information about the enabled state of FSGSBASE to
>> applications using the ELF_AUX vector. If the HWCAP2_FSGSBASE bit is set in
>> the AUX vector, the kernel has FSGSBASE instructions enabled and
>> applications can use them.
>>
>> Kernel changes:
>> Major changes made in the kernel are in context switch, paranoid path, and
>> ptrace. In a context switch, a task's FS/GS base will be secured regardless
>> of its selector. In the paranoid path, GS base is unconditionally
>> overwritten to the kernel GS base on entry and the original GS base is
>> restored on exit. Ptrace includes divergence of FS/GS index and base
>> values.
>>
>> Security:
>> For mitigating the Spectre v1 SWAPGS issue, LFENCE instructions were added
>> on most kernel entries. Those patches are dependent on previous behaviors
>> that users couldn't load a kernel address into the GS base. These patches
>> change that assumption since the user can load any address into GS base.
>> The changes to the kernel entry path in this patch series take account of
>> the SWAPGS issue.
>>
>> Changes from v11:
>>
>>  - Rebase to v5.7-rc5, fix 32bit compilation error.
>>
>>
>> Andi Kleen (2):
>>   x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
>>   x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
>>
>> Andy Lutomirski (4):
>>   x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
>>   x86/entry/64: Clean up paranoid exit
>>   x86/fsgsbase/64: Use FSGSBASE in switch_to() if available
>>   x86/fsgsbase/64: Enable FSGSBASE on 64bit by default and add a chicken
>>     bit
>>
>> Chang S. Bae (9):
>>   x86/ptrace: Prevent ptrace from clearing the FS/GS selector
>>   selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base
>>     write
>>   x86/entry/64: Switch CR3 before SWAPGS in paranoid entry
>>   x86/entry/64: Introduce the FIND_PERCPU_BASE macro
>>   x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit
>>   x86/entry/64: Document GSBASE handling in the paranoid path
>>   x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
>>   x86/fsgsbase/64: Use FSGSBASE instructions on thread copy and ptrace
>>   selftests/x86/fsgsbase: Test ptracer-induced GS base write with
>>     FSGSBASE
>>
>> Sasha Levin (1):
>>   x86/fsgsbase/64: move save_fsgs to header file
>>
>> Thomas Gleixner (1):
>>   Documentation/x86/64: Add documentation for GS/FS addressing mode
>>
>> Tony Luck (1):
>>   x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation
>>
>>  .../admin-guide/kernel-parameters.txt         |   2 +
>>  Documentation/x86/entry_64.rst                |   9 +
>>  Documentation/x86/x86_64/fsgs.rst             | 199 ++++++++++++++++++
>>  Documentation/x86/x86_64/index.rst            |   1 +
>>  arch/x86/entry/calling.h                      |  40 ++++
>>  arch/x86/entry/entry_64.S                     | 131 +++++++++---
>>  arch/x86/include/asm/fsgsbase.h               |  45 +++-
>>  arch/x86/include/asm/inst.h                   |  15 ++
>>  arch/x86/include/uapi/asm/hwcap2.h            |   3 +
>>  arch/x86/kernel/cpu/bugs.c                    |   6 +-
>>  arch/x86/kernel/cpu/common.c                  |  22 ++
>>  arch/x86/kernel/process.c                     |   9 +-
>>  arch/x86/kernel/process.h                     |  72 +++++++
>>  arch/x86/kernel/process_64.c                  | 142 +++++++------
>>  arch/x86/kernel/ptrace.c                      |  17 +-
>>  tools/testing/selftests/x86/fsgsbase.c        |  24 ++-
>>  16 files changed, 608 insertions(+), 129 deletions(-)
>>  create mode 100644 Documentation/x86/x86_64/fsgs.rst
>>
>
>Can you put me to the CC-loop for this patches. Some SGX-enabled

Sure!

>frameworks such as Graphene use out-of-tree changes to achieve this.
>That's where the interest to possibly test this comes from.

Indeed, we've seen a few hacks that basically just enable FSGSBASE:

 - https://github.com/oscarlab/graphene-sgx-driver
 - https://github.com/occlum/enable_rdfsbase

And would very much like to get rid of them...

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-15 16:40   ` Sasha Levin
@ 2020-05-15 17:55     ` Andi Kleen
  2020-05-15 23:07       ` Sasha Levin
  2020-05-16 12:21       ` Jarkko Sakkinen
  2020-05-16  9:50     ` Jarkko Sakkinen
  2020-05-18  9:51     ` Thomas Gleixner
  2 siblings, 2 replies; 76+ messages in thread
From: Andi Kleen @ 2020-05-15 17:55 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jarkko Sakkinen, linux-kernel, tglx, bp, luto, hpa, dave.hansen,
	tony.luck, ravi.v.shankar, chang.seok.bae

> Indeed, we've seen a few hacks that basically just enable FSGSBASE:
> 
> - https://github.com/oscarlab/graphene-sgx-driver
> - https://github.com/occlum/enable_rdfsbase
> 
> And would very much like to get rid of them...

These are insecure and open root holes without the patches
used here.

-Andi

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-15 17:55     ` Andi Kleen
@ 2020-05-15 23:07       ` Sasha Levin
  2020-05-16 12:21       ` Jarkko Sakkinen
  1 sibling, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-15 23:07 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jarkko Sakkinen, linux-kernel, tglx, bp, luto, hpa, dave.hansen,
	tony.luck, ravi.v.shankar, chang.seok.bae

On Fri, May 15, 2020 at 10:55:50AM -0700, Andi Kleen wrote:
>> Indeed, we've seen a few hacks that basically just enable FSGSBASE:
>>
>> - https://github.com/oscarlab/graphene-sgx-driver
>> - https://github.com/occlum/enable_rdfsbase
>>
>> And would very much like to get rid of them...
>
>These are insecure and open root holes without the patches
>used here.

It's sad that these hacks are being used alongside SGX on "secure"
systems.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-15 16:40   ` Sasha Levin
  2020-05-15 17:55     ` Andi Kleen
@ 2020-05-16  9:50     ` Jarkko Sakkinen
  2020-05-18 15:34       ` Andi Kleen
  2020-05-18  9:51     ` Thomas Gleixner
  2 siblings, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-16  9:50 UTC (permalink / raw)
  To: Sasha Levin
  Cc: linux-kernel, tglx, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae

On Fri, 2020-05-15 at 12:40 -0400, Sasha Levin wrote:
> > Can you put me to the CC-loop for this patches. Some SGX-enabled
> 
> Sure!
> 
> > frameworks such as Graphene use out-of-tree changes to achieve this.
> > That's where the interest to possibly test this comes from.
> 
> Indeed, we've seen a few hacks that basically just enable FSGSBASE:
> 
>  - https://github.com/oscarlab/graphene-sgx-driver
>  - https://github.com/occlum/enable_rdfsbase
> 
> And would very much like to get rid of them...

Yes, for SGX this is functional feature because enclave entry points,
thread control structures (aka TCS's), reset FSBASE and GSBASE registers
to fixed (albeit user defined) values. And syscall's can be done only
outside of enclave.

This is a required feature for fancier runtimes (such as Graphene).

I'll try the next version by patching Graphene to use this instead of the
out-of-tree drive. That should give at least fairly realistic
workload (an arbitrary dynamically linked executable running inside an
enclave) for this patch set.

/Jarkko


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-15 17:55     ` Andi Kleen
  2020-05-15 23:07       ` Sasha Levin
@ 2020-05-16 12:21       ` Jarkko Sakkinen
  1 sibling, 0 replies; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-16 12:21 UTC (permalink / raw)
  To: Andi Kleen, Sasha Levin
  Cc: linux-kernel, tglx, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae

On Fri, 2020-05-15 at 10:55 -0700, Andi Kleen wrote:
> > Indeed, we've seen a few hacks that basically just enable FSGSBASE:
> > 
> > - https://github.com/oscarlab/graphene-sgx-driver
> > - https://github.com/occlum/enable_rdfsbase
> > 
> > And would very much like to get rid of them...
> 
> These are insecure and open root holes without the patches
> used here.
> 
> -Andi

Yup, totally. 

/Jarkko


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (18 preceding siblings ...)
  2020-05-15  9:24 ` [PATCH v12 00/18] Enable FSGSBASE instructions Jarkko Sakkinen
@ 2020-05-18  6:18 ` Christoph Hellwig
  2020-05-18 12:33   ` Sasha Levin
  2020-05-18 14:53 ` Thomas Gleixner
  20 siblings, 1 reply; 76+ messages in thread
From: Christoph Hellwig @ 2020-05-18  6:18 UTC (permalink / raw)
  To: Sasha Levin
  Cc: linux-kernel, tglx, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae

On Mon, May 11, 2020 at 12:52:53AM -0400, Sasha Levin wrote:
> Benefits:
> Currently a user process that wishes to read or write the FS/GS base must
> make a system call. But recent X86 processors have added new instructions
> for use in 64-bit mode that allow direct access to the FS and GS segment
> base addresses.  The operating system controls whether applications can
> use these instructions with a %cr4 control bit.
> 
> In addition to benefits to applications, performance improvements to the
> OS context switch code are possible by making use of these instructions. A
> third party reported out promising performance numbers out of their
> initial benchmarking of the previous version of this patch series [9].

The [9] reference can't be resolved anywhere in this mail.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-15 16:40   ` Sasha Levin
  2020-05-15 17:55     ` Andi Kleen
  2020-05-16  9:50     ` Jarkko Sakkinen
@ 2020-05-18  9:51     ` Thomas Gleixner
  2020-05-18 15:16       ` Sasha Levin
  2020-05-18 19:36       ` Jarkko Sakkinen
  2 siblings, 2 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-18  9:51 UTC (permalink / raw)
  To: Sasha Levin, Jarkko Sakkinen
  Cc: linux-kernel, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae

Sasha Levin <sashal@kernel.org> writes:
> On Fri, May 15, 2020 at 12:24:14PM +0300, Jarkko Sakkinen wrote:
>>
>>Can you put me to the CC-loop for this patches. Some SGX-enabled
>>frameworks such as Graphene use out-of-tree changes to achieve this.
>>That's where the interest to possibly test this comes from.
>
> Indeed, we've seen a few hacks that basically just enable FSGSBASE:
>
>  - https://github.com/oscarlab/graphene-sgx-driver
>  - https://github.com/occlum/enable_rdfsbase

I'm really amazed by all these security experts enabling a full root
hole. It clearly puts the SGX hypocrisy into perspective.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-18  6:18 ` Christoph Hellwig
@ 2020-05-18 12:33   ` Sasha Levin
  0 siblings, 0 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-18 12:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, tglx, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae

On Sun, May 17, 2020 at 11:18:36PM -0700, Christoph Hellwig wrote:
>On Mon, May 11, 2020 at 12:52:53AM -0400, Sasha Levin wrote:
>> Benefits:
>> Currently a user process that wishes to read or write the FS/GS base must
>> make a system call. But recent X86 processors have added new instructions
>> for use in 64-bit mode that allow direct access to the FS and GS segment
>> base addresses.  The operating system controls whether applications can
>> use these instructions with a %cr4 control bit.
>>
>> In addition to benefits to applications, performance improvements to the
>> OS context switch code are possible by making use of these instructions. A
>> third party reported out promising performance numbers out of their
>> initial benchmarking of the previous version of this patch series [9].
>
>The [9] reference can't be resolved anywhere in this mail.

Sorry, I'll fix it up. The reference was supposed to be pointing to:

	https://www.phoronix.com/scan.php?page=article&item=linux-wip-fsgsbase&num=1

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
                   ` (19 preceding siblings ...)
  2020-05-18  6:18 ` Christoph Hellwig
@ 2020-05-18 14:53 ` Thomas Gleixner
  20 siblings, 0 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-18 14:53 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin, x86


Cc: +x86@kernel.org ....

Sasha Levin <sashal@kernel.org> writes:

> Benefits:
> Currently a user process that wishes to read or write the FS/GS base must
> make a system call. But recent X86 processors have added new instructions
> for use in 64-bit mode that allow direct access to the FS and GS segment
> base addresses.  The operating system controls whether applications can
> use these instructions with a %cr4 control bit.
>
> In addition to benefits to applications, performance improvements to the
> OS context switch code are possible by making use of these instructions. A
> third party reported out promising performance numbers out of their
> initial benchmarking of the previous version of this patch series [9].
>
> Enablement check:
> The kernel provides information about the enabled state of FSGSBASE to
> applications using the ELF_AUX vector. If the HWCAP2_FSGSBASE bit is set in
> the AUX vector, the kernel has FSGSBASE instructions enabled and
> applications can use them.
>
> Kernel changes:
> Major changes made in the kernel are in context switch, paranoid path, and
> ptrace. In a context switch, a task's FS/GS base will be secured regardless
> of its selector. In the paranoid path, GS base is unconditionally
> overwritten to the kernel GS base on entry and the original GS base is
> restored on exit. Ptrace includes divergence of FS/GS index and base
> values.
>
> Security:
> For mitigating the Spectre v1 SWAPGS issue, LFENCE instructions were added
> on most kernel entries. Those patches are dependent on previous behaviors
> that users couldn't load a kernel address into the GS base. These patches
> change that assumption since the user can load any address into GS base.
> The changes to the kernel entry path in this patch series take account of
> the SWAPGS issue.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-18  9:51     ` Thomas Gleixner
@ 2020-05-18 15:16       ` Sasha Levin
  2020-05-18 18:28         ` Thomas Gleixner
  2020-05-18 19:36       ` Jarkko Sakkinen
  1 sibling, 1 reply; 76+ messages in thread
From: Sasha Levin @ 2020-05-18 15:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jarkko Sakkinen, linux-kernel, bp, luto, hpa, dave.hansen,
	tony.luck, ak, ravi.v.shankar, chang.seok.bae

On Mon, May 18, 2020 at 11:51:07AM +0200, Thomas Gleixner wrote:
>Sasha Levin <sashal@kernel.org> writes:
>> On Fri, May 15, 2020 at 12:24:14PM +0300, Jarkko Sakkinen wrote:
>>>
>>>Can you put me to the CC-loop for this patches. Some SGX-enabled
>>>frameworks such as Graphene use out-of-tree changes to achieve this.
>>>That's where the interest to possibly test this comes from.
>>
>> Indeed, we've seen a few hacks that basically just enable FSGSBASE:
>>
>>  - https://github.com/oscarlab/graphene-sgx-driver
>>  - https://github.com/occlum/enable_rdfsbase
>
>I'm really amazed by all these security experts enabling a full root
>hole. It clearly puts the SGX hypocrisy into perspective.

We can bash Intel all we want here, but sadly there are users in the
"wild" who just enable these root holes thinking they're secure, and
those users are the ones running very sensitive workloads. Here's an
example from a book called "Responsible Genomic Data Sharing":

	https://books.google.com/books?id=y6zWDwAAQBAJ&pg=PA184#v=onepage&q&f=false

That explains how to use Graphene-SGX which just enables FSGSBASE with
root holes.

Maybe it's just me, but I'd love to have my genomic data stored and
processed on systems that are actually secure :)

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-16  9:50     ` Jarkko Sakkinen
@ 2020-05-18 15:34       ` Andi Kleen
  2020-05-18 20:01         ` Jarkko Sakkinen
  0 siblings, 1 reply; 76+ messages in thread
From: Andi Kleen @ 2020-05-18 15:34 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Sasha Levin, linux-kernel, tglx, bp, luto, hpa, dave.hansen,
	tony.luck, ravi.v.shankar, chang.seok.bae

> Yes, for SGX this is functional feature because enclave entry points,
> thread control structures (aka TCS's), reset FSBASE and GSBASE registers
> to fixed (albeit user defined) values. And syscall's can be done only
> outside of enclave.
> 
> This is a required feature for fancier runtimes (such as Graphene).

Can you please explain a bit more? What do they need GS for?

-Andi

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  2020-05-11  4:53 ` [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions Sasha Levin
@ 2020-05-18 18:20   ` Thomas Gleixner
  2020-05-18 20:24     ` Sasha Levin
  0 siblings, 1 reply; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-18 18:20 UTC (permalink / raw)
  To: Sasha Levin, linux-kernel, bp, luto
  Cc: hpa, dave.hansen, tony.luck, ak, ravi.v.shankar, chang.seok.bae,
	Sasha Levin, Andrew Cooper, x86

Sasha Levin <sashal@kernel.org> writes:
> +unsigned long x86_gsbase_read_cpu_inactive(void)
> +{
> +	unsigned long gsbase;
> +
> +	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
> +		bool need_restore = false;
> +		unsigned long flags;
> +
> +		/*
> +		 * We read the inactive GS base value by swapping
> +		 * to make it the active one. But we cannot allow
> +		 * an interrupt while we switch to and from.
> +		 */
> +		if (!irqs_disabled()) {
> +			local_irq_save(flags);
> +			need_restore = true;
> +		}
> +
> +		native_swapgs();
> +		gsbase = rdgsbase();
> +		native_swapgs();
> +
> +		if (need_restore)
> +			local_irq_restore(flags);

Where does this crap come from?

This conditional irqsave gunk is clearly NOT what was in the tip tree
before it got reverted:

  a86b4625138d ("x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions")

In https://lore.kernel.org/r/87ftcrtckg.fsf@nanos.tec.linutronix.de I
explicitely asked for this:

     - Made sure that the cleanups I did when merging them initially have
       been picked up. I'm not going to waste another couple of days on
       this mess just to revert it because it hadn't seen any serious
       testing in development.

and you confirmed in https://lore.kernel.org/r/20200426025243.GJ13035@sasha-vm

       Based on your revert
       (https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/cpu&id=049331f277fef1c3f2527c2c9afa1d285e9a1247)
       I believe that we have all the relevant patches in the series.

And the above while it might not have exploded yet, is simply broken
because the 'swapgs rd/wr swapgs' sequence is not protected against
kprobes. There is even a big fat comment in that original commit:

 /*
  * Out of line to be protected from kprobes. It is not used on Xen
  * paravirt. When paravirt support is needed, it needs to be renamed
  * with native_ prefix.
  */

Yes, you surely got all patches from the git tree and made sure that the
result reflects that.

I've just extracted the original commits from git and applied them and
fixed the trivial rejects. Then I diffed the result against this lot:

 - That above gunk, which is the worst of all

 - In paranoid_exit()

-	TRACE_IRQS_IRETQ_DEBUG
+	TRACE_IRQS_OFF_DEBUG

 - Dropped comments vs. FENCE_SWAPGS and a gazillion of comment
   changes to make reading the diff harder.

Then I gave up looking at it.

It took me ~ 20 minutes (ignoring selftests and documentation) to fixup
the rejects and create a patch queue which is reflecting the state
before the revert and does not have complete crap in it.

This required to add one preparatory patch dealing with the changes in
copy_thread_tls() and no, not by inlining all of that twice.

It took me another 5 minutes to get rid of the local_irq_save/restore()
in save_fsgs() on top without any conditional crap.

I'm seriously tired of this FSGSBASE mess. Every single version I've
looked at in several years was a trainwreck.

Don't bother to send out a new version of this in a frenzy. For my
mental sake I'm not going to look at yet another cobbled together
trainwreck anytime soon.

If you read the above carefully you might find a recipe of properly
engineering this so it's easy to verify against the old version.

Your's seriously grumpy

       tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-18 15:16       ` Sasha Levin
@ 2020-05-18 18:28         ` Thomas Gleixner
  0 siblings, 0 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-18 18:28 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jarkko Sakkinen, linux-kernel, bp, luto, hpa, dave.hansen,
	tony.luck, ak, ravi.v.shankar, chang.seok.bae

Sasha Levin <sashal@kernel.org> writes:
> On Mon, May 18, 2020 at 11:51:07AM +0200, Thomas Gleixner wrote:
>>Sasha Levin <sashal@kernel.org> writes:
>>> On Fri, May 15, 2020 at 12:24:14PM +0300, Jarkko Sakkinen wrote:
>>>>
>>>>Can you put me to the CC-loop for this patches. Some SGX-enabled
>>>>frameworks such as Graphene use out-of-tree changes to achieve this.
>>>>That's where the interest to possibly test this comes from.
>>>
>>> Indeed, we've seen a few hacks that basically just enable FSGSBASE:
>>>
>>>  - https://github.com/oscarlab/graphene-sgx-driver
>>>  - https://github.com/occlum/enable_rdfsbase
>>
>>I'm really amazed by all these security experts enabling a full root
>>hole. It clearly puts the SGX hypocrisy into perspective.
>
> We can bash Intel all we want here, but sadly there are users in the

This is not about bashing Intel.

> "wild" who just enable these root holes thinking they're secure, and
> those users are the ones running very sensitive workloads. Here's an
> example from a book called "Responsible Genomic Data Sharing":
>
> 	https://books.google.com/books?id=y6zWDwAAQBAJ&pg=PA184#v=onepage&q&f=false
>
> That explains how to use Graphene-SGX which just enables FSGSBASE with
> root holes.

It's about these SGX promoting security experts which try to tell
everyone else that he has no clue about security.

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-18  9:51     ` Thomas Gleixner
  2020-05-18 15:16       ` Sasha Levin
@ 2020-05-18 19:36       ` Jarkko Sakkinen
  1 sibling, 0 replies; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-18 19:36 UTC (permalink / raw)
  To: Thomas Gleixner, Sasha Levin
  Cc: linux-kernel, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae

On Mon, 2020-05-18 at 11:51 +0200, Thomas Gleixner wrote:
> Sasha Levin <sashal@kernel.org> writes:
> > On Fri, May 15, 2020 at 12:24:14PM +0300, Jarkko Sakkinen wrote:
> > > Can you put me to the CC-loop for this patches. Some SGX-enabled
> > > frameworks such as Graphene use out-of-tree changes to achieve this.
> > > That's where the interest to possibly test this comes from.
> > 
> > Indeed, we've seen a few hacks that basically just enable FSGSBASE:
> > 
> >  - https://github.com/oscarlab/graphene-sgx-driver
> >  - https://github.com/occlum/enable_rdfsbase
> 
> I'm really amazed by all these security experts enabling a full root
> hole. It clearly puts the SGX hypocrisy into perspective.
> 
> Thanks,
> 
>         tglx

That's exactly why I'm interested to test this series.

/Jarkko


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-18 15:34       ` Andi Kleen
@ 2020-05-18 20:01         ` Jarkko Sakkinen
  2020-05-18 23:03           ` Thomas Gleixner
  0 siblings, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-18 20:01 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Sasha Levin, linux-kernel, tglx, bp, luto, hpa, dave.hansen,
	tony.luck, ravi.v.shankar, chang.seok.bae

On Mon, 2020-05-18 at 08:34 -0700, Andi Kleen wrote:
> > Yes, for SGX this is functional feature because enclave entry points,
> > thread control structures (aka TCS's), reset FSBASE and GSBASE registers
> > to fixed (albeit user defined) values. And syscall's can be done only
> > outside of enclave.
> > 
> > This is a required feature for fancier runtimes (such as Graphene).
> 
> Can you please explain a bit more? What do they need GS for?

Apparently, uses only wrfsbase:

https://raw.githubusercontent.com/oscarlab/graphene/master/Pal/src/host/Linux-SGX/db_misc.c

I'm not too familiar with the codebase yet but by reading some research
papers in the past the idea is to multiplex one TCS for multiple virtual
threads inside the enclave.

E.g. TCS could represent a vcpu for a libos type of container and on
entry would pick on a thread and set fsbase accordingly for a thread
control block.

/Jarkko


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  2020-05-18 18:20   ` Thomas Gleixner
@ 2020-05-18 20:24     ` Sasha Levin
  2020-05-18 22:59       ` Thomas Gleixner
  2020-05-19 12:20       ` David Laight
  0 siblings, 2 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-18 20:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae, Andrew Cooper, x86

Thank you for taking the time to review this.

On Mon, May 18, 2020 at 08:20:08PM +0200, Thomas Gleixner wrote:
>Sasha Levin <sashal@kernel.org> writes:
>> +unsigned long x86_gsbase_read_cpu_inactive(void)
>> +{
>> +	unsigned long gsbase;
>> +
>> +	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
>> +		bool need_restore = false;
>> +		unsigned long flags;
>> +
>> +		/*
>> +		 * We read the inactive GS base value by swapping
>> +		 * to make it the active one. But we cannot allow
>> +		 * an interrupt while we switch to and from.
>> +		 */
>> +		if (!irqs_disabled()) {
>> +			local_irq_save(flags);
>> +			need_restore = true;
>> +		}
>> +
>> +		native_swapgs();
>> +		gsbase = rdgsbase();
>> +		native_swapgs();
>> +
>> +		if (need_restore)
>> +			local_irq_restore(flags);
>
>Where does this crap come from?
>
>This conditional irqsave gunk is clearly NOT what was in the tip tree
>before it got reverted:
>
>  a86b4625138d ("x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions")

It wasn't in the reverted series, it came in Intel's v9 series, with
these comments in the cover letter:

	Updates from v8 [10]:
	[...]
	* Simplified GS base helper functions (Tony L.)

>In https://lore.kernel.org/r/87ftcrtckg.fsf@nanos.tec.linutronix.de I
>explicitely asked for this:
>
>     - Made sure that the cleanups I did when merging them initially have
>       been picked up. I'm not going to waste another couple of days on
>       this mess just to revert it because it hadn't seen any serious
>       testing in development.
>
>and you confirmed in https://lore.kernel.org/r/20200426025243.GJ13035@sasha-vm
>
>       Based on your revert
>       (https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/cpu&id=049331f277fef1c3f2527c2c9afa1d285e9a1247)
>       I believe that we have all the relevant patches in the series.

And I did, Thomas. While I'm not intimately familiar with the code I
made sure that all the patches that came on top of the merged series
before it got reverted made it into this new series. However, more work
has happened here after the revert and I would expect that the code in
this new series will be different than the code you reverted last year.

>And the above while it might not have exploded yet, is simply broken
>because the 'swapgs rd/wr swapgs' sequence is not protected against
>kprobes. There is even a big fat comment in that original commit:
>
> /*
>  * Out of line to be protected from kprobes. It is not used on Xen
>  * paravirt. When paravirt support is needed, it needs to be renamed
>  * with native_ prefix.
>  */
>
>Yes, you surely got all patches from the git tree and made sure that the
>result reflects that.
>
>I've just extracted the original commits from git and applied them and
>fixed the trivial rejects. Then I diffed the result against this lot:
>
> - That above gunk, which is the worst of all

Changed in v9 of the series.

> - In paranoid_exit()
>
>-	TRACE_IRQS_IRETQ_DEBUG
>+	TRACE_IRQS_OFF_DEBUG

(assuming we're looking at the same thing here, ) Changed in v8 of the
series.

> - Dropped comments vs. FENCE_SWAPGS and a gazillion of comment
>   changes to make reading the diff harder.

Changed in every version after the revert:

  - v7:
    - "Add more comments for entry changes"
  - v8:
    - "Carried on Thomas' edits on multiple changelogs and comments"
  - v9:
    - "Fixed typos (Randy D.) and massaged a few sentences in the
      documentation"

>Then I gave up looking at it.
>
>It took me ~ 20 minutes (ignoring selftests and documentation) to fixup
>the rejects and create a patch queue which is reflecting the state
>before the revert and does not have complete crap in it.
>
>This required to add one preparatory patch dealing with the changes in
>copy_thread_tls() and no, not by inlining all of that twice.
>
>It took me another 5 minutes to get rid of the local_irq_save/restore()
>in save_fsgs() on top without any conditional crap.
>
>I'm seriously tired of this FSGSBASE mess. Every single version I've
>looked at in several years was a trainwreck.
>
>Don't bother to send out a new version of this in a frenzy. For my
>mental sake I'm not going to look at yet another cobbled together
>trainwreck anytime soon.
>
>If you read the above carefully you might find a recipe of properly
>engineering this so it's easy to verify against the old version.

I'm a bit confused about the surprise here that v12 is different than
the reverted patches. There were multiple rounds of review which
resulted in the code being more than just a revert of the revert along
with a small fix.

This very issue was brought up by Andy in v7 of the series:

On Mon, Sep 16, 2019 at 11:38 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 12 Sep 2019, Andy Lutomirski wrote:
> > On 9/12/19 1:06 PM, Chang S. Bae wrote:
> > > Updates from v7 [7]:
> > > (1) Consider FSGSBASE when determining which Spectre SWAPGS mitigations are
> > >      required.
> > > (2) Fixed save_fsgs() to be aware of interrupt conditions
> > > (3) Made selftest changes based on Andy's previous fixes and cleanups
> > > (4) Included Andy's paranoid exit cleanup
> > > (5) Included documentation rewritten by Thomas
> > > (6) Carried on Thomas' edits on multiple changelogs and comments
> > > (7) Used '[FS|GS] base' consistently, except for selftest where GSBASE has
> > >      been already used in its test messages
> > > (8) Dropped the READ_MSR_GSBASE macro
> > >
> >
> > This looks unpleasant to review.  I wonder if it would be better to unrevert
> > the reversion, merge up to Linus' tree or -tip, and then base the changes on
> > top of that.
>
> I don't think that's a good idea. The old code is broken in several ways
> and not bisectable. So we really better start from scratch.

And this is what we have here, a series that has more than trivial
differences from the revert, and is more of a pain to review. Look at
what you did with your 25 minutes: you've reverted the revert and went
on to apply fixes on top of it, exactly the thing you've asked
not to do a few months prior.

No need to worry about me sending a new series, as I can't - I just
don't know what you want to see at this point: on one hand you're saying
"we really better start from scratch" and on the other hand "this
conditional irqsave gunk is clearly NOT what was in the tip tree before
it got reverted", you're making suggestions to change comments only to
later complain that "a gazillion of comment changes make reading the
diff harder". 

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  2020-05-18 20:24     ` Sasha Levin
@ 2020-05-18 22:59       ` Thomas Gleixner
  2020-05-19 12:20       ` David Laight
  1 sibling, 0 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-18 22:59 UTC (permalink / raw)
  To: Sasha Levin
  Cc: linux-kernel, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae, Andrew Cooper, x86

Sasha,

Sasha Levin <sashal@kernel.org> writes:
> Thank you for taking the time to review this.

welcome and sorry for the explosion.

> On Mon, May 18, 2020 at 08:20:08PM +0200, Thomas Gleixner wrote:
>>Sasha Levin <sashal@kernel.org> writes:
>>This conditional irqsave gunk is clearly NOT what was in the tip tree
>>before it got reverted:
>>
>>  a86b4625138d ("x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions")
>
> It wasn't in the reverted series, it came in Intel's v9 series, with
> these comments in the cover letter:
>
> 	Updates from v8 [10]:
> 	[...]
> 	* Simplified GS base helper functions (Tony L.)

Ok. I never looked at that series because that requested confirmation
that nothing will regress due to the ptrace changes was not there. After
a bit of handwaving this dried out. So I completely missed that back
then. And I did not look at any later variant which had 0day complaints.

> And I did, Thomas. While I'm not intimately familiar with the code I
> made sure that all the patches that came on top of the merged series
> before it got reverted made it into this new series. However, more work
> has happened here after the revert and I would expect that the code in
> this new series will be different than the code you reverted last
> year.

It's obvious that it would be different from what was merged simply
because the affected code has changed but not in substantial points like
losing a kprobes protection by "simplifying" something which was
carefully done in the first place.

It's not your fault at all, you just happened to be the messanger. The
people responsible for that mess owe you at least a beer.

>> - In paranoid_exit()
>>
>>-	TRACE_IRQS_IRETQ_DEBUG
>>+	TRACE_IRQS_OFF_DEBUG
>
> (assuming we're looking at the same thing here, ) Changed in v8 of the
> series.

Sigh.

> I'm a bit confused about the surprise here that v12 is different than
> the reverted patches. There were multiple rounds of review which
> resulted in the code being more than just a revert of the revert along
> with a small fix.
>
>> > This looks unpleasant to review.  I wonder if it would be better to unrevert
>> > the reversion, merge up to Linus' tree or -tip, and then base the changes on
>> > top of that.
>>
>> I don't think that's a good idea. The old code is broken in several ways
>> and not bisectable. So we really better start from scratch.
>
> And this is what we have here, a series that has more than trivial
> differences from the revert, and is more of a pain to review. Look at
> what you did with your 25 minutes: you've reverted the revert and went
> on to apply fixes on top of it, exactly the thing you've asked
> not to do a few months prior.

I did that to analyse whether that new series has everything what was
fixed back then and did not introduce new bugs. Mission accomplished.

> No need to worry about me sending a new series, as I can't - I just
> don't know what you want to see at this point: on one hand you're saying
> "we really better start from scratch" and on the other hand "this
> conditional irqsave gunk is clearly NOT what was in the tip tree before
> it got reverted", you're making suggestions to change comments only to
> later complain that "a gazillion of comment changes make reading the
> diff harder". 

Gah. That comment change thing was just an annoyance and I complained
about it because I was already grumpy as hell.

So what I meant is that the blind revert of the revert, i.e. just
reapplying the previous stuff, is horrible. Simply because the reverted
patches were already not bisectable. And then applying random changes on
top does not make it any better.

So yes, I would have done exactly where I started:

   1) Extract the original patches from git

   2) Apply them and fixup the rejects

and on top of that:

   3) Make them bisectable by folding back the fixes to the right place
      and reordering them which creates a result which is equivalent to
      'start from scratch' but without losing context and introducing
      new bugs. Simply because it's trivial to diff against the state
      before the revert.

   4) Do the 'improvements' on top, discuss them and fold them back.

For what you tried to do I would have omitted #4 completely and then
did:

   5) Rebase the latest Intel variant

   6) Diff the results ideally step by step

   7) Analyze the deltas carefully and if unsure about the result
      ask.
      
   That way you really would have noticed that this helper patch is
   substantially different and you would have noticed that the kprobes
   protection is gone. Also that would have clearly shown you the IRQ
   flag wreckage.

So to go forward can you please just do #1 - #3 first?

Vs. the s/GSBASE/GS base/g comment changes: I don't mind them per se,
but they are incomplete because they just change it in the new code
while there are still the original comments using GSBASE. So either we
change it wholesale or not at all. If so, then this wants to be a
separate patch right at the beginning of the new series which changes
the existing comments before introducing a different variant.

That "simplified" handling is going nowhere. That conditional irq
disable and the redundant conditionals and the out of line invocation in
switch_to() are just not going to happen.

So when comparing it to the latest Intel trainwreck ignore that part
completely,

I've uploaded my quick shot with a few cleanups on top (folded back) for
reference:

  https://tglx.de/~tglx/patches-fsgs.tar

Uncompiled and untested. I'm not claiming it's bug free either. If you
find one, please keep it. Hope that helps.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-18 20:01         ` Jarkko Sakkinen
@ 2020-05-18 23:03           ` Thomas Gleixner
  2020-05-19 16:48             ` Jarkko Sakkinen
  0 siblings, 1 reply; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-18 23:03 UTC (permalink / raw)
  To: Jarkko Sakkinen, Andi Kleen
  Cc: Sasha Levin, linux-kernel, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae

Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> writes:
> On Mon, 2020-05-18 at 08:34 -0700, Andi Kleen wrote:
>> > Yes, for SGX this is functional feature because enclave entry points,
>> > thread control structures (aka TCS's), reset FSBASE and GSBASE registers
>> > to fixed (albeit user defined) values. And syscall's can be done only
>> > outside of enclave.
>> > 
>> > This is a required feature for fancier runtimes (such as Graphene).
>> 
>> Can you please explain a bit more? What do they need GS for?
>
> Apparently, uses only wrfsbase:
>
> https://raw.githubusercontent.com/oscarlab/graphene/master/Pal/src/host/Linux-SGX/db_misc.c
>
> I'm not too familiar with the codebase yet but by reading some research
> papers in the past the idea is to multiplex one TCS for multiple virtual
> threads inside the enclave.
>
> E.g. TCS could represent a vcpu for a libos type of container and on
> entry would pick on a thread and set fsbase accordingly for a thread
> control block.

That justifies to write books which recommend to load a kernel module
which creates a full unpriviledged root hole. I bet none of these papers
ever mentioned that.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread

* RE: [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  2020-05-18 20:24     ` Sasha Levin
  2020-05-18 22:59       ` Thomas Gleixner
@ 2020-05-19 12:20       ` David Laight
  2020-05-19 14:48         ` Thomas Gleixner
  1 sibling, 1 reply; 76+ messages in thread
From: David Laight @ 2020-05-19 12:20 UTC (permalink / raw)
  To: 'Sasha Levin', Thomas Gleixner
  Cc: linux-kernel, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae, Andrew Cooper, x86

From: Sasha Levin
> Sent: 18 May 2020 21:25
> Thank you for taking the time to review this.
> 
> On Mon, May 18, 2020 at 08:20:08PM +0200, Thomas Gleixner wrote:
> >Sasha Levin <sashal@kernel.org> writes:
> >> +unsigned long x86_gsbase_read_cpu_inactive(void)
> >> +{
> >> +	unsigned long gsbase;
> >> +
> >> +	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
> >> +		bool need_restore = false;
> >> +		unsigned long flags;
> >> +
> >> +		/*
> >> +		 * We read the inactive GS base value by swapping
> >> +		 * to make it the active one. But we cannot allow
> >> +		 * an interrupt while we switch to and from.
> >> +		 */
> >> +		if (!irqs_disabled()) {
> >> +			local_irq_save(flags);
> >> +			need_restore = true;
> >> +		}
> >> +
> >> +		native_swapgs();
> >> +		gsbase = rdgsbase();
> >> +		native_swapgs();

Does local_irq_save() even do anything useful here.
You need to actually execute CLI, not just set a
flag that indicates interrupts shouldn't happen.
(Which is what I think local_irq_save() might do.)

You also (probably) need to disable NMIs.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 76+ messages in thread

* RE: [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  2020-05-19 12:20       ` David Laight
@ 2020-05-19 14:48         ` Thomas Gleixner
  2020-05-20  9:13           ` David Laight
  0 siblings, 1 reply; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-19 14:48 UTC (permalink / raw)
  To: David Laight, 'Sasha Levin'
  Cc: linux-kernel, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae, Andrew Cooper, x86

David Laight <David.Laight@ACULAB.COM> writes:
> From: Sasha Levin
>> >> +		native_swapgs();
>> >> +		gsbase = rdgsbase();
>> >> +		native_swapgs();
>
> Does local_irq_save() even do anything useful here.
> You need to actually execute CLI, not just set a
> flag that indicates interrupts shouldn't happen.
> (Which is what I think local_irq_save() might do.)

  local_irq_save()
    raw_local_irq_save()
      arch_local_irq_save()
        arch_local_irq_disable()
          native_irq_disable()
            asm("CLI")

> You also (probably) need to disable NMIs.

The NMI entry can deal with that obviously.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-18 23:03           ` Thomas Gleixner
@ 2020-05-19 16:48             ` Jarkko Sakkinen
  2020-05-22 20:14               ` Don Porter
  0 siblings, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-19 16:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On Tue, May 19, 2020 at 01:03:25AM +0200, Thomas Gleixner wrote:
> Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> writes:
> > On Mon, 2020-05-18 at 08:34 -0700, Andi Kleen wrote:
> >> > Yes, for SGX this is functional feature because enclave entry points,
> >> > thread control structures (aka TCS's), reset FSBASE and GSBASE registers
> >> > to fixed (albeit user defined) values. And syscall's can be done only
> >> > outside of enclave.
> >> > 
> >> > This is a required feature for fancier runtimes (such as Graphene).
> >> 
> >> Can you please explain a bit more? What do they need GS for?
> >
> > Apparently, uses only wrfsbase:
> >
> > https://raw.githubusercontent.com/oscarlab/graphene/master/Pal/src/host/Linux-SGX/db_misc.c
> >
> > I'm not too familiar with the codebase yet but by reading some research
> > papers in the past the idea is to multiplex one TCS for multiple virtual
> > threads inside the enclave.
> >
> > E.g. TCS could represent a vcpu for a libos type of container and on
> > entry would pick on a thread and set fsbase accordingly for a thread
> > control block.
> 
> That justifies to write books which recommend to load a kernel module
> which creates a full unpriviledged root hole. I bet none of these papers
> ever mentioned that.

Fully agree that oot lkm for this is a worst idea ever.

That's why I want to help with this.

/Jarkko

^ permalink raw reply	[flat|nested] 76+ messages in thread

* RE: [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions
  2020-05-19 14:48         ` Thomas Gleixner
@ 2020-05-20  9:13           ` David Laight
  0 siblings, 0 replies; 76+ messages in thread
From: David Laight @ 2020-05-20  9:13 UTC (permalink / raw)
  To: 'Thomas Gleixner', 'Sasha Levin'
  Cc: linux-kernel, bp, luto, hpa, dave.hansen, tony.luck, ak,
	ravi.v.shankar, chang.seok.bae, Andrew Cooper, x86

From: Thomas Gleixner
> Sent: 19 May 2020 15:48
> 
> David Laight <David.Laight@ACULAB.COM> writes:
> > From: Sasha Levin
> >> >> +		native_swapgs();
> >> >> +		gsbase = rdgsbase();
> >> >> +		native_swapgs();
> >
> > Does local_irq_save() even do anything useful here.
> > You need to actually execute CLI, not just set a
> > flag that indicates interrupts shouldn't happen.
> > (Which is what I think local_irq_save() might do.)
> 
>   local_irq_save()
>     raw_local_irq_save()
>       arch_local_irq_save()
>         arch_local_irq_disable()
>           native_irq_disable()
>             asm("CLI")

Ah, I was expecting software 'tricks' to avoid the expensive CLI.
But that call chain probably costs more - unless it is all inlined.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-19 16:48             ` Jarkko Sakkinen
@ 2020-05-22 20:14               ` Don Porter
  2020-05-22 20:55                 ` Dave Hansen
                                   ` (3 more replies)
  0 siblings, 4 replies; 76+ messages in thread
From: Don Porter @ 2020-05-22 20:14 UTC (permalink / raw)
  To: Jarkko Sakkinen, Thomas Gleixner
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On 5/19/20 12:48 PM, Jarkko Sakkinen wrote:
> On Tue, May 19, 2020 at 01:03:25AM +0200, Thomas Gleixner wrote:
>> Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> writes:
>>> On Mon, 2020-05-18 at 08:34 -0700, Andi Kleen wrote:
>>>>> Yes, for SGX this is functional feature because enclave entry points,
>>>>> thread control structures (aka TCS's), reset FSBASE and GSBASE registers
>>>>> to fixed (albeit user defined) values. And syscall's can be done only
>>>>> outside of enclave.
>>>>>
>>>>> This is a required feature for fancier runtimes (such as Graphene).
>>>>
>>>> Can you please explain a bit more? What do they need GS for?
>>>
>>> Apparently, uses only wrfsbase:
>>>
>>> https://raw.githubusercontent.com/oscarlab/graphene/master/Pal/src/host/Linux-SGX/db_misc.c
>>>
>>> I'm not too familiar with the codebase yet but by reading some research
>>> papers in the past the idea is to multiplex one TCS for multiple virtual
>>> threads inside the enclave.
>>>
>>> E.g. TCS could represent a vcpu for a libos type of container and on
>>> entry would pick on a thread and set fsbase accordingly for a thread
>>> control block.
>>
>> That justifies to write books which recommend to load a kernel module
>> which creates a full unpriviledged root hole. I bet none of these papers
>> ever mentioned that.
> 
> Fully agree that oot lkm for this is a worst idea ever.
> 
> That's why I want to help with this.
> 
> /Jarkko
> 

 >

Hi all, and apologies for the resend,

I wanted to clarify that we never intended the Graphene kernel module 
you mention for production use, as well as to comment in support of this 
patch.

Setting the fs register in userspace is an essential feature for running 
legacy code in SGX.  We have been following LKML discussions on this 
instruction for years, and hoping this feature would be supported by 
Linux, so that we can retire this module.  To our knowledge, every SGX 
library OS has a similar module, waiting for this or a similar patch to 
be merged into Linux.  This indicates a growing user base that needs 
this instruction.

Just for some history, Graphene was originally a research 
proof-of-concept that started in my lab, and has since received 
substantial contributions as an open source project from companies 
including Intel.  This code base is explicitly not intended or ready for 
production use at this point, as it is still missing essential features.

We wrote the kernel module as a way to get something working quickly, so 
that we could focus on studying more difficult aspects of porting code 
to SGX.  We had always assumed that the Linux community would eventually 
offer a correct and safe mechanism to enable this instruction, but we 
generally err on the side of publishing code we write for research 
studies as open source in the interest of supporting reproducibility and 
further science.

Nonetheless, Graphene is moving towards adoption in production systems, 
and we are actively working to make the code base secure and robust. 
This issue has been on our to-do list before a production release.  It 
would certainly make our lives easier to deprecate our module and just 
use a robust, in-kernel implementation.

All the best,
Don Porter
Graphene Maintainer
https://grapheneproject.io/

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-22 20:14               ` Don Porter
@ 2020-05-22 20:55                 ` Dave Hansen
  2020-05-23  0:45                 ` Thomas Gleixner
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 76+ messages in thread
From: Dave Hansen @ 2020-05-22 20:55 UTC (permalink / raw)
  To: Don Porter, Jarkko Sakkinen, Thomas Gleixner
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, hpa, tony.luck,
	ravi.v.shankar, chang.seok.bae

On 5/22/20 1:14 PM, Don Porter wrote:
> I wanted to clarify that we never intended the Graphene kernel module
> you mention for production use, as well as to comment in support of this
> patch.

Could you also clarify: Did you know that the FSGSBASE kernel module
introduced a root vulnerability?  Where did it come from in the first place?

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-22 20:14               ` Don Porter
  2020-05-22 20:55                 ` Dave Hansen
@ 2020-05-23  0:45                 ` Thomas Gleixner
  2020-05-24 19:45                   ` hpa
  2020-05-26 12:42                   ` Don Porter
  2020-05-23  4:19                 ` Andi Kleen
  2020-05-27  8:20                 ` Jarkko Sakkinen
  3 siblings, 2 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-23  0:45 UTC (permalink / raw)
  To: Don Porter, Jarkko Sakkinen
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

Don,

Don Porter <porter@cs.unc.edu> writes:
> On 5/19/20 12:48 PM, Jarkko Sakkinen wrote:
>> On Tue, May 19, 2020 at 01:03:25AM +0200, Thomas Gleixner wrote:
>>>
>>> That justifies to write books which recommend to load a kernel module
>>> which creates a full unpriviledged root hole. I bet none of these papers
>>> ever mentioned that.
>
> I wanted to clarify that we never intended the Graphene kernel module 
> you mention for production use, as well as to comment in support of this 
> patch.

let me clarify, that despite your intentions:

    - there is not a single word in any paper, slide deck, documentation
      etc. which mentions that loading this module and enabling FSGSBASE
      behind the kernels back is a fully unpriviledged root hole.

    - the module lacks a big fat warning emitted to dmesg, that this
      turns the host kernel into a complete security disaster.

    - the module fails to set the TAINT_CRAP flag when initialized.

This shows a pretty obvious discrepancy between intention and action.

> Setting the fs register in userspace is an essential feature for running 
> legacy code in SGX.  We have been following LKML discussions on this 
> instruction for years, and hoping this feature would be supported by 
> Linux, so that we can retire this module.

The way to get things done in the kernel is to actively work on the
problem. Hoping that someone else will fix that for you is naive at
best. Wilful ignorance might be a less polite but nevertheless accurate
term.

> To our knowledge, every SGX library OS has a similar module, waiting
> for this or a similar patch to be merged into Linux.  This indicates a
> growing user base that needs this instruction.

I'm failing to understand that a whole industry which is so confident
about their ultimate solution to the security problem puts possible
users and customers into the situation to decide between:

 1) Secure host kernel (with known limitations)

 2) SGX enclaves

I would not mind if this would be a choice between fire and frying pan,
but this is a choice between a well understood reality and a very
dangerous illusion.

> Nonetheless, Graphene is moving towards adoption in production systems, 
> and we are actively working to make the code base secure and robust. 
> This issue has been on our to-do list before a production release.  It 
> would certainly make our lives easier to deprecate our module and just 
> use a robust, in-kernel implementation.

Would make your life easier?

Having proper in kernel FSGSBASE support is the only solution to that
problem and this has been true since the whole SGX frenzy started. Intel
failed to upstream FSGSBASE since 2015 (sic!). See

  https://lore.kernel.org/lkml/alpine.DEB.2.21.1903261010380.1789@nanos.tec.linutronix.de/

for a detailed time line. And that mail is more than a year old.

Since then there happened even more trainwrecks including the revert of
already queued patches a few days before the 5.3 merge window opened.

After that we saw yet more broken variants of that patch set including
the fail to provide information which is required to re-merge that.

Instead of providing that information the next version re-introduced the
wreckage which was carefully sorted out during earlier review cycles up
to the revert.

So you (and everybody else who has interrest in SGX) just sat there,
watched and hoped that this will solve itself magically. And with that
"hope" argument you really want to make me believe that all of this was
against your intentions?

It's beyond hillarious that the renewed attempt to get FSGSBASE support
merged does not come from the company which has the main interest to get
this solved, i.e Intel.

Based on your argumentation that all of this is uninteded, I assume that
the pull request on github which removes this security hole from
graphene:

        https://github.com/oscarlab/graphene/pull/1529

is perfectly fine, right?

Quite the contrary, it's completely usesless and at the same time
perfectly fitting into this picture:

  The changelog is SGX marketing compliant: Zero technical content. Not
  a single word about the real implications of that blantant violation
  of any principle of sane (security) engineering.

Not that I'm surprised about this. That change originates from Intel and
the poor sod who had to place the pull request - coincidentally a few
days after this insanity became public - was not allowed to spell out
the real reasons why this removal is necessary.

Please read security relevant changelogs in the kernel git tree and then
explain to me the utter void in this one.

Looking at the advertising which all involved parties including the
Confidential Computing Consortium are conducting, plus the fact that
Intel has major investments in SGX supporting companies and projects,
this is one of the worst marketing scams I've seen in decades.

This all violates the fundamental engineering principle of "correctnes
first" and I'm flabbergasted that academic research has degraded into
the "features first" advocating domain.

What's worse it that public funded research is failing to serve the
public interest and instead is acting as an advertsiing machine for their
corporate sponsors.

Thanks,

        Thomas
 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-22 20:14               ` Don Porter
  2020-05-22 20:55                 ` Dave Hansen
  2020-05-23  0:45                 ` Thomas Gleixner
@ 2020-05-23  4:19                 ` Andi Kleen
  2020-05-28 10:36                   ` Thomas Gleixner
  2020-05-27  8:20                 ` Jarkko Sakkinen
  3 siblings, 1 reply; 76+ messages in thread
From: Andi Kleen @ 2020-05-23  4:19 UTC (permalink / raw)
  To: Don Porter
  Cc: Jarkko Sakkinen, Thomas Gleixner, Sasha Levin, linux-kernel, bp,
	luto, hpa, dave.hansen, tony.luck, ravi.v.shankar,
	chang.seok.bae

> Setting the fs register in userspace is an essential feature for running
> legacy code in SGX.  We have been following LKML discussions on this
> instruction for years, and hoping this feature would be supported by Linux,

If you need a feature you should comment on it. One of the reasons
it took so long is that the users didn't speak up.


-Andi

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-23  0:45                 ` Thomas Gleixner
@ 2020-05-24 19:45                   ` hpa
  2020-05-24 21:19                     ` Sasha Levin
  2020-05-27  8:31                     ` Jarkko Sakkinen
  2020-05-26 12:42                   ` Don Porter
  1 sibling, 2 replies; 76+ messages in thread
From: hpa @ 2020-05-24 19:45 UTC (permalink / raw)
  To: Thomas Gleixner, Don Porter, Jarkko Sakkinen
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, dave.hansen,
	tony.luck, ravi.v.shankar, chang.seok.bae

On May 22, 2020 5:45:39 PM PDT, Thomas Gleixner <tglx@linutronix.de> wrote:
>Don,
>
>Don Porter <porter@cs.unc.edu> writes:
>> On 5/19/20 12:48 PM, Jarkko Sakkinen wrote:
>>> On Tue, May 19, 2020 at 01:03:25AM +0200, Thomas Gleixner wrote:
>>>>
>>>> That justifies to write books which recommend to load a kernel
>module
>>>> which creates a full unpriviledged root hole. I bet none of these
>papers
>>>> ever mentioned that.
>>
>> I wanted to clarify that we never intended the Graphene kernel module
>
>> you mention for production use, as well as to comment in support of
>this 
>> patch.
>
>let me clarify, that despite your intentions:
>
>   - there is not a single word in any paper, slide deck, documentation
>     etc. which mentions that loading this module and enabling FSGSBASE
>      behind the kernels back is a fully unpriviledged root hole.
>
>    - the module lacks a big fat warning emitted to dmesg, that this
>      turns the host kernel into a complete security disaster.
>
>    - the module fails to set the TAINT_CRAP flag when initialized.
>
>This shows a pretty obvious discrepancy between intention and action.
>
>> Setting the fs register in userspace is an essential feature for
>running 
>> legacy code in SGX.  We have been following LKML discussions on this 
>> instruction for years, and hoping this feature would be supported by 
>> Linux, so that we can retire this module.
>
>The way to get things done in the kernel is to actively work on the
>problem. Hoping that someone else will fix that for you is naive at
>best. Wilful ignorance might be a less polite but nevertheless accurate
>term.
>
>> To our knowledge, every SGX library OS has a similar module, waiting
>> for this or a similar patch to be merged into Linux.  This indicates
>a
>> growing user base that needs this instruction.
>
>I'm failing to understand that a whole industry which is so confident
>about their ultimate solution to the security problem puts possible
>users and customers into the situation to decide between:
>
> 1) Secure host kernel (with known limitations)
>
> 2) SGX enclaves
>
>I would not mind if this would be a choice between fire and frying pan,
>but this is a choice between a well understood reality and a very
>dangerous illusion.
>
>> Nonetheless, Graphene is moving towards adoption in production
>systems, 
>> and we are actively working to make the code base secure and robust. 
>> This issue has been on our to-do list before a production release. 
>It 
>> would certainly make our lives easier to deprecate our module and
>just 
>> use a robust, in-kernel implementation.
>
>Would make your life easier?
>
>Having proper in kernel FSGSBASE support is the only solution to that
>problem and this has been true since the whole SGX frenzy started.
>Intel
>failed to upstream FSGSBASE since 2015 (sic!). See
>
>https://lore.kernel.org/lkml/alpine.DEB.2.21.1903261010380.1789@nanos.tec.linutronix.de/
>
>for a detailed time line. And that mail is more than a year old.
>
>Since then there happened even more trainwrecks including the revert of
>already queued patches a few days before the 5.3 merge window opened.
>
>After that we saw yet more broken variants of that patch set including
>the fail to provide information which is required to re-merge that.
>
>Instead of providing that information the next version re-introduced
>the
>wreckage which was carefully sorted out during earlier review cycles up
>to the revert.
>
>So you (and everybody else who has interrest in SGX) just sat there,
>watched and hoped that this will solve itself magically. And with that
>"hope" argument you really want to make me believe that all of this was
>against your intentions?
>
>It's beyond hillarious that the renewed attempt to get FSGSBASE support
>merged does not come from the company which has the main interest to
>get
>this solved, i.e Intel.
>
>Based on your argumentation that all of this is uninteded, I assume
>that
>the pull request on github which removes this security hole from
>graphene:
>
>        https://github.com/oscarlab/graphene/pull/1529
>
>is perfectly fine, right?
>
>Quite the contrary, it's completely usesless and at the same time
>perfectly fitting into this picture:
>
>  The changelog is SGX marketing compliant: Zero technical content. Not
>  a single word about the real implications of that blantant violation
>  of any principle of sane (security) engineering.
>
>Not that I'm surprised about this. That change originates from Intel
>and
>the poor sod who had to place the pull request - coincidentally a few
>days after this insanity became public - was not allowed to spell out
>the real reasons why this removal is necessary.
>
>Please read security relevant changelogs in the kernel git tree and
>then
>explain to me the utter void in this one.
>
>Looking at the advertising which all involved parties including the
>Confidential Computing Consortium are conducting, plus the fact that
>Intel has major investments in SGX supporting companies and projects,
>this is one of the worst marketing scams I've seen in decades.
>
>This all violates the fundamental engineering principle of "correctnes
>first" and I'm flabbergasted that academic research has degraded into
>the "features first" advocating domain.
>
>What's worse it that public funded research is failing to serve the
>public interest and instead is acting as an advertsiing machine for
>their
>corporate sponsors.
>
>Thanks,
>
>        Thomas
> 

On a related topic (needless to say, this should never have happened and is being raised at the highest levels inside Intel):

There are legitimate reasons to write a root-hole module, the main one being able to test security features like SMAP. I have requested before a TAINT flag specifically for this purpose, because TAINT_CRAP is nowhere near explicit enough, and is also used for staging drivers. Call it TAINT_TOXIC or TAINT_ROOTHOLE; it should always be accompanied with a CRIT level alert.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-24 19:45                   ` hpa
@ 2020-05-24 21:19                     ` Sasha Levin
  2020-05-24 23:44                       ` hpa
  2020-05-25  7:54                       ` Richard Weinberger
  2020-05-27  8:31                     ` Jarkko Sakkinen
  1 sibling, 2 replies; 76+ messages in thread
From: Sasha Levin @ 2020-05-24 21:19 UTC (permalink / raw)
  To: hpa
  Cc: Thomas Gleixner, Don Porter, Jarkko Sakkinen, Andi Kleen,
	linux-kernel, bp, luto, dave.hansen, tony.luck, ravi.v.shankar,
	chang.seok.bae

On Sun, May 24, 2020 at 12:45:18PM -0700, hpa@zytor.com wrote:
>There are legitimate reasons to write a root-hole module, the main one being able to test security features like SMAP. I have requested before a TAINT flag specifically for this purpose, because TAINT_CRAP is nowhere near explicit enough, and is also used for staging drivers. Call it TAINT_TOXIC or TAINT_ROOTHOLE; it should always be accompanied with a CRIT level alert.

What I don't like about our current system of TAINT_* flags is that
while we can improve it as much as we want, no one outside of the kernel
tree seems to be using it. While Thomas may have been commenting on
Graphene's behaviour, look at any other code that did the same thing:

- Graphene: https://github.com/oscarlab/graphene-sgx-driver/blob/master/gsgx.c
- Occlum: https://github.com/occlum/enable_rdfsbase/blob/master/enable_rdfsbase.c
- SGX-LKL: https://github.com/lsds/sgx-lkl/blob/master/tools/kmod-set-fsgsbase/mod_set_cr4_fsgsbase.c

None of which set even the CRAP flag.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-24 21:19                     ` Sasha Levin
@ 2020-05-24 23:44                       ` hpa
  2020-05-25  7:54                       ` Richard Weinberger
  1 sibling, 0 replies; 76+ messages in thread
From: hpa @ 2020-05-24 23:44 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Thomas Gleixner, Don Porter, Jarkko Sakkinen, Andi Kleen,
	linux-kernel, bp, luto, dave.hansen, tony.luck, ravi.v.shankar,
	chang.seok.bae

On May 24, 2020 2:19:45 PM PDT, Sasha Levin <sashal@kernel.org> wrote:
>On Sun, May 24, 2020 at 12:45:18PM -0700, hpa@zytor.com wrote:
>>There are legitimate reasons to write a root-hole module, the main one
>being able to test security features like SMAP. I have requested before
>a TAINT flag specifically for this purpose, because TAINT_CRAP is
>nowhere near explicit enough, and is also used for staging drivers.
>Call it TAINT_TOXIC or TAINT_ROOTHOLE; it should always be accompanied
>with a CRIT level alert.
>
>What I don't like about our current system of TAINT_* flags is that
>while we can improve it as much as we want, no one outside of the
>kernel
>tree seems to be using it. While Thomas may have been commenting on
>Graphene's behaviour, look at any other code that did the same thing:
>
>- Graphene:
>https://github.com/oscarlab/graphene-sgx-driver/blob/master/gsgx.c
>- Occlum:
>https://github.com/occlum/enable_rdfsbase/blob/master/enable_rdfsbase.c
>- SGX-LKL:
>https://github.com/lsds/sgx-lkl/blob/master/tools/kmod-set-fsgsbase/mod_set_cr4_fsgsbase.c
>
>None of which set even the CRAP flag.

That's a separate problem, but I would personally want to have it for my own test modules in case one ever escapes.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-24 21:19                     ` Sasha Levin
  2020-05-24 23:44                       ` hpa
@ 2020-05-25  7:54                       ` Richard Weinberger
  2020-05-25 21:56                         ` Tony Luck
  2020-05-26  8:12                         ` David Laight
  1 sibling, 2 replies; 76+ messages in thread
From: Richard Weinberger @ 2020-05-25  7:54 UTC (permalink / raw)
  To: Sasha Levin
  Cc: H. Peter Anvin, Thomas Gleixner, Don Porter, Jarkko Sakkinen,
	Andi Kleen, LKML, Borislav Petkov, Andy Lutomirski, Dave Hansen,
	Tony Luck, Ravi V Shankar, chang.seok.bae

On Sun, May 24, 2020 at 11:20 PM Sasha Levin <sashal@kernel.org> wrote:
>
> On Sun, May 24, 2020 at 12:45:18PM -0700, hpa@zytor.com wrote:
> >There are legitimate reasons to write a root-hole module, the main one being able to test security features like SMAP. I have requested before a TAINT flag specifically for this purpose, because TAINT_CRAP is nowhere near explicit enough, and is also used for staging drivers. Call it TAINT_TOXIC or TAINT_ROOTHOLE; it should always be accompanied with a CRIT level alert.
>
> What I don't like about our current system of TAINT_* flags is that
> while we can improve it as much as we want, no one outside of the kernel
> tree seems to be using it. While Thomas may have been commenting on
> Graphene's behaviour, look at any other code that did the same thing:

Even if these modules would set TAINT_ROOTHOLE/TOXIC, the vast majority of users
have no clue what these flags really mean nor bother to take them seriously.

Almost every customer system I get my hands on has the following flags set:
C: Some driver from staging was "needed", mostly media or wifi stuff.
O: Customer did a custom module.
W: Random warning from vendor kernel at bootup because of a slightly
configured device-tree, etc.
P: Sadly too. Mostly because customer has custom module and forgot to set it GPL

All this trained users to believe that a few taint flags don't hurt
and only in a perfect world
none are set.

What works and raises attention is Steve's trace_printk() warning:

        pr_warn("**********************************************************\n");
        pr_warn("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
        pr_warn("**                                                      **\n");
        pr_warn("** trace_printk() being used. Allocating extra memory.  **\n");
        pr_warn("**                                                      **\n");
        pr_warn("** This means that this is a DEBUG kernel and it is     **\n");
        pr_warn("** unsafe for production use.                           **\n");
        pr_warn("**                                                      **\n");
        pr_warn("** If you see this message and you are not debugging    **\n");
        pr_warn("** the kernel, report this immediately to your vendor!  **\n");
        pr_warn("**                                                      **\n");
        pr_warn("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
        pr_warn("**********************************************************\n");

Maybe we can add something like this for taints too? :-)

-- 
Thanks,
//richard

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-25  7:54                       ` Richard Weinberger
@ 2020-05-25 21:56                         ` Tony Luck
  2020-05-26  8:12                         ` David Laight
  1 sibling, 0 replies; 76+ messages in thread
From: Tony Luck @ 2020-05-25 21:56 UTC (permalink / raw)
  To: Richard Weinberger
  Cc: Sasha Levin, H. Peter Anvin, Thomas Gleixner, Don Porter,
	Jarkko Sakkinen, Andi Kleen, LKML, Borislav Petkov,
	Andy Lutomirski, Hansen, Dave, Luck, Tony, Shankar, Ravi V, Bae,
	Chang Seok

      pr_warn("** If you see this message and you are not debugging    **\n");
>        pr_warn("** the kernel, report this immediately to your vendor!  **\n");
>        pr_warn("**                                                      **\n");
>        pr_warn("**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n");
>        pr_warn("**********************************************************\n");
> 
> Maybe we can add something like this for taints too? :-)
> 

For TAINT_ROOTHOLE the severity should be “pr_emerg”.  The message should repeat every five minutes.

-Tony

^ permalink raw reply	[flat|nested] 76+ messages in thread

* RE: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-25  7:54                       ` Richard Weinberger
  2020-05-25 21:56                         ` Tony Luck
@ 2020-05-26  8:12                         ` David Laight
  2020-05-26  8:23                           ` Richard Weinberger
  1 sibling, 1 reply; 76+ messages in thread
From: David Laight @ 2020-05-26  8:12 UTC (permalink / raw)
  To: 'Richard Weinberger', Sasha Levin
  Cc: H. Peter Anvin, Thomas Gleixner, Don Porter, Jarkko Sakkinen,
	Andi Kleen, LKML, Borislav Petkov, Andy Lutomirski, Dave Hansen,
	Tony Luck, Ravi V Shankar, chang.seok.bae

From: Richard Weinberger
> Sent: 25 May 2020 08:55
...
> P: Sadly too. Mostly because customer has custom module and forgot to set it GPL

You want us to lie that custom modules are GPL?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-26  8:12                         ` David Laight
@ 2020-05-26  8:23                           ` Richard Weinberger
  0 siblings, 0 replies; 76+ messages in thread
From: Richard Weinberger @ 2020-05-26  8:23 UTC (permalink / raw)
  To: David Laight
  Cc: Sasha Levin, hpa, tglx, Don Porter, Jarkko Sakkinen, Andi Kleen,
	linux-kernel, bp, Andy Lutomirski, Dave Hansen, Tony Luck,
	Ravi V Shankar, chang seok bae

----- Ursprüngliche Mail -----
> Von: "David Laight" <David.Laight@ACULAB.COM>
> An: "Richard Weinberger" <richard.weinberger@gmail.com>, "Sasha Levin" <sashal@kernel.org>
> CC: "hpa" <hpa@zytor.com>, "tglx" <tglx@linutronix.de>, "Don Porter" <porter@cs.unc.edu>, "Jarkko Sakkinen"
> <jarkko.sakkinen@linux.intel.com>, "Andi Kleen" <ak@linux.intel.com>, "linux-kernel" <linux-kernel@vger.kernel.org>,
> "bp" <bp@alien8.de>, "Andy Lutomirski" <luto@kernel.org>, "Dave Hansen" <dave.hansen@intel.com>, "Tony Luck"
> <tony.luck@intel.com>, "Ravi V Shankar" <ravi.v.shankar@intel.com>, "chang seok bae" <chang.seok.bae@intel.com>
> Gesendet: Dienstag, 26. Mai 2020 10:12:02
> Betreff: RE: Re: [PATCH v12 00/18] Enable FSGSBASE instructions

> From: Richard Weinberger
>> Sent: 25 May 2020 08:55
> ...
>> P: Sadly too. Mostly because customer has custom module and forgot to set it GPL
> 
> You want us to lie that custom modules are GPL?

No. Of course not.

But after thinking twice most guys realize that the have to use GPL
no matter whether they are using just EXPORT_SYMBOL() stuff or not.

But this is a completely different topic and something for company lawyers.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-23  0:45                 ` Thomas Gleixner
  2020-05-24 19:45                   ` hpa
@ 2020-05-26 12:42                   ` Don Porter
  2020-05-26 20:27                     ` Sasha Levin
  2020-05-28 10:29                     ` Thomas Gleixner
  1 sibling, 2 replies; 76+ messages in thread
From: Don Porter @ 2020-05-26 12:42 UTC (permalink / raw)
  To: Thomas Gleixner, Jarkko Sakkinen
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

Hi Thomas,

On 5/22/20 8:45 PM, Thomas Gleixner wrote:
> Don,
> 
> Don Porter <porter@cs.unc.edu> writes:
>> On 5/19/20 12:48 PM, Jarkko Sakkinen wrote:
>>> On Tue, May 19, 2020 at 01:03:25AM +0200, Thomas Gleixner wrote:
>>>>
>>>> That justifies to write books which recommend to load a kernel module
>>>> which creates a full unpriviledged root hole. I bet none of these papers
>>>> ever mentioned that.
>>
>> I wanted to clarify that we never intended the Graphene kernel module
>> you mention for production use, as well as to comment in support of this
>> patch.
> 
> let me clarify, that despite your intentions:
> 
>      - there is not a single word in any paper, slide deck, documentation
>        etc. which mentions that loading this module and enabling FSGSBASE
>        behind the kernels back is a fully unpriviledged root hole.
> 
>      - the module lacks a big fat warning emitted to dmesg, that this
>        turns the host kernel into a complete security disaster.
> 
>      - the module fails to set the TAINT_CRAP flag when initialized.
> 
> This shows a pretty obvious discrepancy between intention and action.

I think there is a significant misunderstanding here.  This line of 
research assumes the kernel is already compromised and behaving 
adversarially toward a more trusted application.  Thus, the attack 
surface under scrutiny in these projects is between the enclave and the 
rest of the system.  Not that we want kernels to be rooted, or make this 
easier, but exploits happen in practice.

The threat model for Graphene, and most SGX papers, is quite explicit: 
we assume that Intel’s CPU package, the software in the enclave, and 
possibly Intel’s Attestation Service (IAS) are the only trusted 
components.  Any other software should be assumed compromised, and one 
can even assume memory is physically tampered or that one has plugged in 
an adversarial device. It is not a question of the limitations of the 
kernel, the threat model assumes that the kernel is already rooted.

For the community these papers are typically written to, this assumption 
would be well understood.  And thus it is common to see code artifacts 
that might emulate or even undermine security of untrusted components. 
Not appropriate for production use, but for the typical audience, this 
risk would be understood.  And, initially, when people started using 
Graphene, I checked who they were - almost exclusively SGX researchers 
who would have this context.  It has only been recently that the 
interest has grown to a level that these sorts of warnings need to be 
revised for a more general audience.  But the point that we should 
revise our readme and warnings for a more general audience is well taken.

> Having proper in kernel FSGSBASE support is the only solution to that
> problem and this has been true since the whole SGX frenzy started. Intel
> failed to upstream FSGSBASE since 2015 (sic!). See
> 
>    https://lore.kernel.org/lkml/alpine.DEB.2.21.1903261010380.1789@nanos.tec.linutronix.de/
> 
> for a detailed time line. And that mail is more than a year old.
> 
> Since then there happened even more trainwrecks including the revert of
> already queued patches a few days before the 5.3 merge window opened.
> 
> After that we saw yet more broken variants of that patch set including
> the fail to provide information which is required to re-merge that.
> 
> Instead of providing that information the next version re-introduced the
> wreckage which was carefully sorted out during earlier review cycles up
> to the revert.
> 
> So you (and everybody else who has interrest in SGX) just sat there,
> watched and hoped that this will solve itself magically. And with that
> "hope" argument you really want to make me believe that all of this was
> against your intentions?
> 
> It's beyond hillarious that the renewed attempt to get FSGSBASE support
> merged does not come from the company which has the main interest to get
> this solved, i.e Intel.

Yes!  I think we are in agreement that we expected Intel to upstream 
this support - it is their product. I don’t see why I am personally 
responsible to come to the aid of a multi-billion dollar corporation in 
my free time, or that it is wrong to at least let them try first and see 
how far they get.

Until recently, we were doing proof-of-concept research, not product 
development, and there are limited hours in the day.  I also hasten to 
say that the product of research is an article, the software artifact 
serves as documentation of the experiment.  In contrast, the product of 
software development is software.  It takes significant time and effort 
to convert one to the other.  Upstreaming code is of little scientific 
interest.  But things have changed for our project; we had no users in 
2015 and we are now un-cutting corners that are appropriate for research 
but inappropriate for production.  For a research artifact with an 
audience that knew the risks, we shipped a module because it was easier 
to maintain and install than a kernel patch.

Also, there is a chicken-and-egg problem here: AFAIU a kernel patch 
needs a userspace demonstration to motivate merging.  We can’t do a 
userspace demonstration without this feature.  My main interest in 
showing up for this discussion was to try to make the case that, 
compared to 2015, there is a more convincing userspace demonstration and 
larger population of interested users.

> 
> Based on your argumentation that all of this is uninteded, I assume that
> the pull request on github which removes this security hole from
> graphene:
> 
>          https://github.com/oscarlab/graphene/pull/1529
> 
> is perfectly fine, right?

As far as the patch and pull request, I personally think the right thing 
to do is add the warnings you suggest, help test this or another kernel 
patch, and advise users that patching their kernel is more secure than 
this module.  I am not in favor of fully deleting the module, in the 
interest of transparency and reproducibility.

> 
> Looking at the advertising which all involved parties including the
> Confidential Computing Consortium are conducting, plus the fact that
> Intel has major investments in SGX supporting companies and projects,
> this is one of the worst marketing scams I've seen in decades.
> 
> This all violates the fundamental engineering principle of "correctnes
> first" and I'm flabbergasted that academic research has degraded into
> the "features first" advocating domain.
> 
> What's worse it that public funded research is failing to serve the
> public interest and instead is acting as an advertsiing machine for their
> corporate sponsors.

Finally, I must rebut the claim that my research abuses public funds to 
advertise for Intel.  I have been working on this problem since before I 
knew SGX existed, and have been completely transparent regarding 
subsequent collaborations with Intel.  I believe that understanding the 
pros and cons of different techniques to harden an application against a 
compromised kernel is in the public interest, and my research projects 
have been reviewed and overseen according to standard practices at both 
the university and US government funding agencies.  The expectations of 
agencies in the US funding research are the paper, the insights, and 
proof-of-concept software; converting proof-of-concept software into 
production quality is generally considered a “nice to have”.

- Don


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-26 12:42                   ` Don Porter
@ 2020-05-26 20:27                     ` Sasha Levin
  2020-05-26 22:03                       ` Don Porter
  2020-05-28 10:29                     ` Thomas Gleixner
  1 sibling, 1 reply; 76+ messages in thread
From: Sasha Levin @ 2020-05-26 20:27 UTC (permalink / raw)
  To: Don Porter
  Cc: Thomas Gleixner, Jarkko Sakkinen, Andi Kleen, linux-kernel, bp,
	luto, hpa, dave.hansen, tony.luck, ravi.v.shankar,
	chang.seok.bae

On Tue, May 26, 2020 at 08:42:09AM -0400, Don Porter wrote:
>On 5/22/20 8:45 PM, Thomas Gleixner wrote:
>>let me clarify, that despite your intentions:
>>
>>     - there is not a single word in any paper, slide deck, documentation
>>       etc. which mentions that loading this module and enabling FSGSBASE
>>       behind the kernels back is a fully unpriviledged root hole.
>>
>>     - the module lacks a big fat warning emitted to dmesg, that this
>>       turns the host kernel into a complete security disaster.
>>
>>     - the module fails to set the TAINT_CRAP flag when initialized.
>>
>>This shows a pretty obvious discrepancy between intention and action.
>
>I think there is a significant misunderstanding here.  This line of 
>research assumes the kernel is already compromised and behaving 
>adversarially toward a more trusted application.  Thus, the attack 
>surface under scrutiny in these projects is between the enclave and 
>the rest of the system.  Not that we want kernels to be rooted, or 
>make this easier, but exploits happen in practice.
>
>The threat model for Graphene, and most SGX papers, is quite explicit: 
>we assume that Intel’s CPU package, the software in the enclave, and 
>possibly Intel’s Attestation Service (IAS) are the only trusted 
>components.  Any other software should be assumed compromised, and one 
>can even assume memory is physically tampered or that one has plugged 
>in an adversarial device. It is not a question of the limitations of 
>the kernel, the threat model assumes that the kernel is already 
>rooted.

You really have to look beyond just what Graphene guarantees at this
point; it does not live on it's own island and it's success isn't
measured purely based on how well it handles it's threat model.

Yes, the threat model assumes the kernel was rooted, but you don't go
off and set the root password to '12345678' on those machines, right?
Attackers would be more than happy to run botnets, spam mailers, and
host child porn on your servers if you give them the opportunity, let's
not do that.

>For the community these papers are typically written to, this 
>assumption would be well understood.  And thus it is common to see 
>code artifacts that might emulate or even undermine security of 
>untrusted components. Not appropriate for production use, but for the 
>typical audience, this risk would be understood.  And, initially, when 
>people started using Graphene, I checked who they were - almost 
>exclusively SGX researchers who would have this context.  It has only 
>been recently that the interest has grown to a level that these sorts 
>of warnings need to be revised for a more general audience.  But the 
>point that we should revise our readme and warnings for a more general 
>audience is well taken.

I'm really worried about the disconnect between how you view the current
state of Graphene (and the industry) vs Intel and the various cloud
providers.

You keep suggesting that its just past the academic research state,
while Intel and the big cloud providers are already pushing it to
external customers.  Every one of those cloud providers has a preview/GA
secure enclave offering.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-26 20:27                     ` Sasha Levin
@ 2020-05-26 22:03                       ` Don Porter
  2020-05-26 22:51                         ` Sasha Levin
  0 siblings, 1 reply; 76+ messages in thread
From: Don Porter @ 2020-05-26 22:03 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Thomas Gleixner, Jarkko Sakkinen, Andi Kleen, linux-kernel, bp,
	luto, hpa, dave.hansen, tony.luck, ravi.v.shankar,
	chang.seok.bae

On 5/26/20 4:27 PM, Sasha Levin wrote:
> On Tue, May 26, 2020 at 08:42:09AM -0400, Don Porter wrote:
>> On 5/22/20 8:45 PM, Thomas Gleixner wrote:
>>> let me clarify, that despite your intentions:
>>>
>>>     - there is not a single word in any paper, slide deck, documentation
>>>       etc. which mentions that loading this module and enabling FSGSBASE
>>>       behind the kernels back is a fully unpriviledged root hole.
>>>
>>>     - the module lacks a big fat warning emitted to dmesg, that this
>>>       turns the host kernel into a complete security disaster.
>>>
>>>     - the module fails to set the TAINT_CRAP flag when initialized.
>>>
>>> This shows a pretty obvious discrepancy between intention and action.
>>
>> I think there is a significant misunderstanding here.  This line of 
>> research assumes the kernel is already compromised and behaving 
>> adversarially toward a more trusted application.  Thus, the attack 
>> surface under scrutiny in these projects is between the enclave and 
>> the rest of the system.  Not that we want kernels to be rooted, or 
>> make this easier, but exploits happen in practice.
>>
>> The threat model for Graphene, and most SGX papers, is quite explicit: 
>> we assume that Intel’s CPU package, the software in the enclave, and 
>> possibly Intel’s Attestation Service (IAS) are the only trusted 
>> components.  Any other software should be assumed compromised, and one 
>> can even assume memory is physically tampered or that one has plugged 
>> in an adversarial device. It is not a question of the limitations of 
>> the kernel, the threat model assumes that the kernel is already rooted.
> 
> You really have to look beyond just what Graphene guarantees at this
> point; it does not live on it's own island and it's success isn't
> measured purely based on how well it handles it's threat model.
> 
> Yes, the threat model assumes the kernel was rooted, but you don't go
> off and set the root password to '12345678' on those machines, right?
> Attackers would be more than happy to run botnets, spam mailers, and
> host child porn on your servers if you give them the opportunity, let's
> not do that.

I think we are in agreement and have a common interest here.

>> For the community these papers are typically written to, this 
>> assumption would be well understood.  And thus it is common to see 
>> code artifacts that might emulate or even undermine security of 
>> untrusted components. Not appropriate for production use, but for the 
>> typical audience, this risk would be understood.  And, initially, when 
>> people started using Graphene, I checked who they were - almost 
>> exclusively SGX researchers who would have this context.  It has only 
>> been recently that the interest has grown to a level that these sorts 
>> of warnings need to be revised for a more general audience.  But the 
>> point that we should revise our readme and warnings for a more general 
>> audience is well taken.
> 
> I'm really worried about the disconnect between how you view the current
> state of Graphene (and the industry) vs Intel and the various cloud
> providers.
> 
> You keep suggesting that its just past the academic research state,
> while Intel and the big cloud providers are already pushing it to
> external customers.  Every one of those cloud providers has a preview/GA
> secure enclave offering.
> 

I wonder if you are conflating Graphene with SGX?  I understand that 
many cloud vendors are offering SGX in preview/GA, but there are other 
frameworks to build these offerings on, such as Intel's SGX SDK or 
Haven.  It would be news to me if every major cloud vendor were putting 
Graphene in production.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-26 22:03                       ` Don Porter
@ 2020-05-26 22:51                         ` Sasha Levin
  2020-05-28 17:37                           ` Don Porter
  0 siblings, 1 reply; 76+ messages in thread
From: Sasha Levin @ 2020-05-26 22:51 UTC (permalink / raw)
  To: Don Porter
  Cc: Thomas Gleixner, Jarkko Sakkinen, Andi Kleen, linux-kernel, bp,
	luto, hpa, dave.hansen, tony.luck, ravi.v.shankar,
	chang.seok.bae

On Tue, May 26, 2020 at 06:03:35PM -0400, Don Porter wrote:
>On 5/26/20 4:27 PM, Sasha Levin wrote:
>>I'm really worried about the disconnect between how you view the current
>>state of Graphene (and the industry) vs Intel and the various cloud
>>providers.
>>
>>You keep suggesting that its just past the academic research state,
>>while Intel and the big cloud providers are already pushing it to
>>external customers.  Every one of those cloud providers has a preview/GA
>>secure enclave offering.
>>
>
>I wonder if you are conflating Graphene with SGX?  I understand that 
>many cloud vendors are offering SGX in preview/GA, but there are other 
>frameworks to build these offerings on, such as Intel's SGX SDK or 
>Haven.  It would be news to me if every major cloud vendor were 
>putting Graphene in production.

Sorry, I wasn't trying to suggest that all cloud vendors are pushing
Graphene, but rather than SGX enabled platforms became a commodity
product, users will end up using Graphene-like applications.

Let me provide an example:
https://www.alibabacloud.com/blog/protecting-go-language-applications-with-the-graphene-library-os-on-intel%C2%AE-sgx%C2%AE-secured-alibaba-cloud_594889
- a "practical" guide on how to run Graphene in production environment
   on one of the big cloud vendor platforms. 

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-22 20:14               ` Don Porter
                                   ` (2 preceding siblings ...)
  2020-05-23  4:19                 ` Andi Kleen
@ 2020-05-27  8:20                 ` Jarkko Sakkinen
  2020-05-27 12:42                   ` Wojtek Porczyk
  3 siblings, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-27  8:20 UTC (permalink / raw)
  To: Don Porter, Thomas Gleixner
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On Fri, 2020-05-22 at 16:14 -0400, Don Porter wrote:
> legacy code in SGX.  We have been following LKML discussions on this 
> instruction for years, and hoping this feature would be supported by 
> Linux, so that we can retire this module.  To our knowledge, every SGX 

Why have you followed this for years and never tried the patches?

/Jarkko


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-24 19:45                   ` hpa
  2020-05-24 21:19                     ` Sasha Levin
@ 2020-05-27  8:31                     ` Jarkko Sakkinen
  1 sibling, 0 replies; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-27  8:31 UTC (permalink / raw)
  To: hpa, Thomas Gleixner, Don Porter
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, dave.hansen,
	tony.luck, ravi.v.shankar, chang.seok.bae

On Sun, 2020-05-24 at 12:45 -0700, hpa@zytor.com wrote:
> On a related topic (needless to say, this should never have happened
> and is being raised at the highest levels inside Intel):
> 
> There are legitimate reasons to write a root-hole module, the main one
> being able to test security features like SMAP. I have requested
> before a TAINT flag specifically for this purpose, because
> TAINT_CRAP is nowhere near explicit enough, and is also used for
> staging drivers. Call it TAINT_TOXIC or TAINT_ROOTHOLE; it should
> always be accompanied with a CRIT level alert.

Are these flags easy to bump into in the first place for a person with
no prior familarity with the kernel?

/Jarkko


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-27  8:20                 ` Jarkko Sakkinen
@ 2020-05-27 12:42                   ` Wojtek Porczyk
  0 siblings, 0 replies; 76+ messages in thread
From: Wojtek Porczyk @ 2020-05-27 12:42 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Don Porter, Thomas Gleixner, Andi Kleen, Sasha Levin,
	linux-kernel, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae

[-- Attachment #1: Type: text/plain, Size: 1538 bytes --]

On Wed, May 27, 2020 at 11:20:08AM +0300, Jarkko Sakkinen wrote:
> On Fri, 2020-05-22 at 16:14 -0400, Don Porter wrote:
> > legacy code in SGX.  We have been following LKML discussions on this 
> > instruction for years, and hoping this feature would be supported by 
> > Linux, so that we can retire this module.  To our knowledge, every SGX 
> 
> Why have you followed this for years and never tried the patches?

For all the reasons stated before (we preferred a module, not a patchset, and
we didn't really care about implications), and because the general situation
about kernel drivers is a mess: Intel made three, mutually incompatible linux
drivers [1] and we used SDK driver, which incidentally is also how people
learn SGX programming in general.

With three different drivers and unclear future directions, we chose to wait
and see how the situation settles, so we stuck to the driver that was already
working.

Also, we're no kernel developers and there were/still are more urgent things
to fix in the graphene proper. For example we only recently have support for
running non-debug enclaves using DCAP LE [2].

[1] Here are my notes from when I was figuring out:
    https://graphene.rtfd.io/en/latest/sgx-intro.html#linux-kernel-drivers

[2] https://github.com/oscarlab/graphene/issues/881
    https://github.com/oscarlab/graphene/pull/978

-- 
pozdrawiam / best regards
Wojtek Porczyk
Graphene / Invisible Things Lab
 
 I do not fear computers,
 I fear lack of them.
    -- Isaac Asimov

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-26 12:42                   ` Don Porter
  2020-05-26 20:27                     ` Sasha Levin
@ 2020-05-28 10:29                     ` Thomas Gleixner
  2020-05-28 17:40                       ` Don Porter
  1 sibling, 1 reply; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-28 10:29 UTC (permalink / raw)
  To: Don Porter, Jarkko Sakkinen
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

Don,

Don Porter <porter@cs.unc.edu> writes:
> On 5/22/20 8:45 PM, Thomas Gleixner wrote:
> The threat model for Graphene, and most SGX papers, is quite explicit: 
> we assume that Intel’s CPU package, the software in the enclave, and 
> possibly Intel’s Attestation Service (IAS) are the only trusted 
> components.  Any other software should be assumed compromised, and one 
> can even assume memory is physically tampered or that one has plugged in 
> an adversarial device. It is not a question of the limitations of the 
> kernel, the threat model assumes that the kernel is already rooted.

I'm well aware about that model and that the research is focussed on
this.

> For the community these papers are typically written to, this assumption 
> would be well understood.  And thus it is common to see code artifacts 
> that might emulate or even undermine security of untrusted
> components.

No disagreement here.

> Not appropriate for production use, but for the typical audience, this 
> risk would be understood.  And, initially, when people started using 
> Graphene, I checked who they were - almost exclusively SGX researchers 
> who would have this context.  It has only been recently that the 
> interest has grown to a level that these sorts of warnings need to be 
> revised for a more general audience.  But the point that we should 
> revise our readme and warnings for a more general audience is well
> taken.

The problem is that this has spread out. And it's not only Graphene.

As at least two different incarnations originate from Intel written by
two different Intel people, it's really on Intel to get the message out
that enabling FSGSBASE behind the kernels back is a horrible idea.

>> It's beyond hillarious that the renewed attempt to get FSGSBASE support
>> merged does not come from the company which has the main interest to get
>> this solved, i.e Intel.
>
> Yes!  I think we are in agreement that we expected Intel to upstream 
> this support - it is their product. I don’t see why I am personally 
> responsible to come to the aid of a multi-billion dollar corporation in 
> my free time, or that it is wrong to at least let them try first and see 
> how far they get.

You surely are not responsible. It's definitely Intel's fault.

> Until recently, we were doing proof-of-concept research, not product 
> development, and there are limited hours in the day.  I also hasten to 
> say that the product of research is an article, the software artifact 
> serves as documentation of the experiment.  In contrast, the product of 
> software development is software.  It takes significant time and effort 
> to convert one to the other.  Upstreaming code is of little scientific 
> interest.  But things have changed for our project; we had no users in 
> 2015 and we are now un-cutting corners that are appropriate for research 
> but inappropriate for production.  For a research artifact with an 
> audience that knew the risks, we shipped a module because it was easier 
> to maintain and install than a kernel patch.

I understand that and with a big fat warning and documentation from
start I wouldn't have complained so vehemently. 

> Also, there is a chicken-and-egg problem here: AFAIU a kernel patch 
> needs a userspace demonstration to motivate merging.  We can’t do a 
> userspace demonstration without this feature.  My main interest in 
> showing up for this discussion was to try to make the case that, 
> compared to 2015, there is a more convincing userspace demonstration and 
> larger population of interested users.

As one of the X86 maintainers I have to say that we were perfectly
willing to merge FSGSBASE even without the SGX background. There are
perfect other reasons to do so.

> As far as the patch and pull request, I personally think the right thing 
> to do is add the warnings you suggest, help test this or another kernel 
> patch, and advise users that patching their kernel is more secure than 
> this module.  I am not in favor of fully deleting the module, in the 
> interest of transparency and reproducibility.

Fair enough.

> Finally, I must rebut the claim that my research abuses public funds to 
> advertise for Intel.  I have been working on this problem since before I 
> knew SGX existed, and have been completely transparent regarding 
> subsequent collaborations with Intel.  I believe that understanding the 
> pros and cons of different techniques to harden an application against a 
> compromised kernel is in the public interest, and my research projects 
> have been reviewed and overseen according to standard practices at both 
> the university and US government funding agencies.  The expectations of 
> agencies in the US funding research are the paper, the insights, and 
> proof-of-concept software; converting proof-of-concept software into 
> production quality is generally considered a “nice to have”.

Sorry for that innuendo. Now that my anger and general frustration about
this whole disaster have calmed down, I surely would not write that
again.

Thanks,

        Thomas

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-23  4:19                 ` Andi Kleen
@ 2020-05-28 10:36                   ` Thomas Gleixner
  0 siblings, 0 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-05-28 10:36 UTC (permalink / raw)
  To: Andi Kleen, Don Porter
  Cc: Jarkko Sakkinen, Sasha Levin, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

Andi,

Andi Kleen <ak@linux.intel.com> writes:
>> Setting the fs register in userspace is an essential feature for running
>> legacy code in SGX.  We have been following LKML discussions on this
>> instruction for years, and hoping this feature would be supported by Linux,
>
> If you need a feature you should comment on it. One of the reasons
> it took so long is that the users didn't speak up.

nice try to rewrite history.

You know very well that the only reason why FSGSBASE support is not
upstream is Intel.

It has absolutely nothing to do with users not speaking up, unless you
mean the Intel SGX people.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-26 22:51                         ` Sasha Levin
@ 2020-05-28 17:37                           ` Don Porter
  0 siblings, 0 replies; 76+ messages in thread
From: Don Porter @ 2020-05-28 17:37 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Thomas Gleixner, Jarkko Sakkinen, Andi Kleen, linux-kernel, bp,
	luto, hpa, dave.hansen, tony.luck, ravi.v.shankar,
	chang.seok.bae

On 5/26/20 6:51 PM, Sasha Levin wrote:
> On Tue, May 26, 2020 at 06:03:35PM -0400, Don Porter wrote:
>> On 5/26/20 4:27 PM, Sasha Levin wrote:
>>> I'm really worried about the disconnect between how you view the current
>>> state of Graphene (and the industry) vs Intel and the various cloud
>>> providers.
>>>
> 
> Sorry, I wasn't trying to suggest that all cloud vendors are pushing
> Graphene, but rather than SGX enabled platforms became a commodity
> product, users will end up using Graphene-like applications.
> 
> Let me provide an example:
> https://www.alibabacloud.com/blog/protecting-go-language-applications-with-the-graphene-library-os-on-intel%C2%AE-sgx%C2%AE-secured-alibaba-cloud_594889 
> 
> - a "practical" guide on how to run Graphene in production environment
>    on one of the big cloud vendor platforms.

You have convinced me there is a concerning disconnect here, and we need 
to be proactive as a project to correct this.  We are adding warnings to 
the project with all due haste.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-28 10:29                     ` Thomas Gleixner
@ 2020-05-28 17:40                       ` Don Porter
  2020-05-28 18:38                         ` Andy Lutomirski
  2020-05-28 19:19                         ` Jarkko Sakkinen
  0 siblings, 2 replies; 76+ messages in thread
From: Don Porter @ 2020-05-28 17:40 UTC (permalink / raw)
  To: Thomas Gleixner, Jarkko Sakkinen
  Cc: Andi Kleen, Sasha Levin, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

Hi Thomas,

On 5/28/20 6:29 AM, Thomas Gleixner wrote:
>> Until recently, we were doing proof-of-concept research, not product
>> development, and there are limited hours in the day.  I also hasten to
>> say that the product of research is an article, the software artifact
>> serves as documentation of the experiment.  In contrast, the product of
>> software development is software.  It takes significant time and effort
>> to convert one to the other.  Upstreaming code is of little scientific
>> interest.  But things have changed for our project; we had no users in
>> 2015 and we are now un-cutting corners that are appropriate for research
>> but inappropriate for production.  For a research artifact with an
>> audience that knew the risks, we shipped a module because it was easier
>> to maintain and install than a kernel patch.
> 
> I understand that and with a big fat warning and documentation from
> start I wouldn't have complained so vehemently.

This is a fair point.  We will fix this ASAP, and I will be more careful 
about this going forward.

>
> Sorry for that innuendo. Now that my anger and general frustration about
> this whole disaster have calmed down, I surely would not write that
> again.

I appreciate you saying so.  Thank you.

I can also understand how frustrating the history was with this feature, 
and we missed an opportunity to help sooner.  There is a lot I still 
don't understand about the process of merging and testing patches in 
this community, but if it makes sense for us to help now, we would be 
willing.

-Don

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-28 17:40                       ` Don Porter
@ 2020-05-28 18:38                         ` Andy Lutomirski
  2020-05-29 15:27                           ` Wojtek Porczyk
  2020-05-28 19:19                         ` Jarkko Sakkinen
  1 sibling, 1 reply; 76+ messages in thread
From: Andy Lutomirski @ 2020-05-28 18:38 UTC (permalink / raw)
  To: Don Porter
  Cc: Thomas Gleixner, Jarkko Sakkinen, Andi Kleen, Sasha Levin,
	linux-kernel, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae



> On May 28, 2020, at 10:40 AM, Don Porter <porter@cs.unc.edu> wrote:
> 
> Hi Thomas,
> 
> On 5/28/20 6:29 AM, Thomas Gleixner wrote:
>>> Until recently, we were doing proof-of-concept research, not product
>>> development, and there are limited hours in the day.  I also hasten to
>>> say that the product of research is an article, the software artifact
>>> serves as documentation of the experiment.  In contrast, the product of
>>> software development is software.  It takes significant time and effort
>>> to convert one to the other.  Upstreaming code is of little scientific
>>> interest.  But things have changed for our project; we had no users in
>>> 2015 and we are now un-cutting corners that are appropriate for research
>>> but inappropriate for production.  For a research artifact with an
>>> audience that knew the risks, we shipped a module because it was easier
>>> to maintain and install than a kernel patch.
>> I understand that and with a big fat warning and documentation from
>> start I wouldn't have complained so vehemently.
> 
> This is a fair point.  We will fix this ASAP, and I will be more careful about this going forward.
> 
>> 
>> Sorry for that innuendo. Now that my anger and general frustration about
>> this whole disaster have calmed down, I surely would not write that
>> again.
> 
> I appreciate you saying so.  Thank you.
> 
> I can also understand how frustrating the history was with this feature, and we missed an opportunity to help sooner.  There is a lot I still don't understand about the process of merging and testing patches in this community, but if it makes sense for us to help now, we would be willing.
> 
> 

With my x86 hat on, I have no particular expectation that you would be familiar with the particular problems wi TV FSGSBASE. One sequence that will kill the kernel is to use WRGSBASE to load a negative value (e.g. ~0), then set EFLAGS.TF and do SYSENTER. I’m adding a test like this to the x86 selftests.

One useful test for the actual kernel patches would be to run your SGX workload on a loaded core.  That is, do
something like taskset -c 0 graphene_thing and, simultaneously, write a trivial infinite loop program and run that under taskset -c 0 as well. For good measure, you could have perf top or perf record running at the same time.  Look for kernel errors, but also look for any evidence of your workload malfunctioning.

—Andy

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-28 17:40                       ` Don Porter
  2020-05-28 18:38                         ` Andy Lutomirski
@ 2020-05-28 19:19                         ` Jarkko Sakkinen
  2020-05-28 19:41                           ` Sasha Levin
  1 sibling, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-28 19:19 UTC (permalink / raw)
  To: Don Porter
  Cc: Thomas Gleixner, Andi Kleen, Sasha Levin, linux-kernel, bp, luto,
	hpa, dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On Thu, May 28, 2020 at 01:40:16PM -0400, Don Porter wrote:
> Hi Thomas,
> 
> On 5/28/20 6:29 AM, Thomas Gleixner wrote:
> > > Until recently, we were doing proof-of-concept research, not product
> > > development, and there are limited hours in the day.  I also hasten to
> > > say that the product of research is an article, the software artifact
> > > serves as documentation of the experiment.  In contrast, the product of
> > > software development is software.  It takes significant time and effort
> > > to convert one to the other.  Upstreaming code is of little scientific
> > > interest.  But things have changed for our project; we had no users in
> > > 2015 and we are now un-cutting corners that are appropriate for research
> > > but inappropriate for production.  For a research artifact with an
> > > audience that knew the risks, we shipped a module because it was easier
> > > to maintain and install than a kernel patch.
> > 
> > I understand that and with a big fat warning and documentation from
> > start I wouldn't have complained so vehemently.
> 
> This is a fair point.  We will fix this ASAP, and I will be more careful
> about this going forward.

Are you going to experiment with this patch set and Graphene? Just
sanity checking so that I don't unnecessarily do duplicate work.

I ignored most of the discussion since I came here only with the
motivation of testing Graphene together with this patch set. I'm
assuming that motivation is always good no matter which angle you come
from. Thus, I might have missed the part I'm asking.

/Jarkko

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-28 19:19                         ` Jarkko Sakkinen
@ 2020-05-28 19:41                           ` Sasha Levin
  2020-05-29  3:07                             ` Jarkko Sakkinen
  0 siblings, 1 reply; 76+ messages in thread
From: Sasha Levin @ 2020-05-28 19:41 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Don Porter, Thomas Gleixner, Andi Kleen, linux-kernel, bp, luto,
	hpa, dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On Thu, May 28, 2020 at 10:19:10PM +0300, Jarkko Sakkinen wrote:
>On Thu, May 28, 2020 at 01:40:16PM -0400, Don Porter wrote:
>> Hi Thomas,
>>
>> On 5/28/20 6:29 AM, Thomas Gleixner wrote:
>> > > Until recently, we were doing proof-of-concept research, not product
>> > > development, and there are limited hours in the day.  I also hasten to
>> > > say that the product of research is an article, the software artifact
>> > > serves as documentation of the experiment.  In contrast, the product of
>> > > software development is software.  It takes significant time and effort
>> > > to convert one to the other.  Upstreaming code is of little scientific
>> > > interest.  But things have changed for our project; we had no users in
>> > > 2015 and we are now un-cutting corners that are appropriate for research
>> > > but inappropriate for production.  For a research artifact with an
>> > > audience that knew the risks, we shipped a module because it was easier
>> > > to maintain and install than a kernel patch.
>> >
>> > I understand that and with a big fat warning and documentation from
>> > start I wouldn't have complained so vehemently.
>>
>> This is a fair point.  We will fix this ASAP, and I will be more careful
>> about this going forward.
>
>Are you going to experiment with this patch set and Graphene? Just
>sanity checking so that I don't unnecessarily do duplicate work.
>
>I ignored most of the discussion since I came here only with the
>motivation of testing Graphene together with this patch set. I'm
>assuming that motivation is always good no matter which angle you come
>from. Thus, I might have missed the part I'm asking.

This series was heavily tested with Graphene-like workloads.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-28 19:41                           ` Sasha Levin
@ 2020-05-29  3:07                             ` Jarkko Sakkinen
  2020-05-29  3:10                               ` Jarkko Sakkinen
  0 siblings, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-29  3:07 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Don Porter, Thomas Gleixner, Andi Kleen, linux-kernel, bp, luto,
	hpa, dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On Thu, May 28, 2020 at 03:41:57PM -0400, Sasha Levin wrote:
> On Thu, May 28, 2020 at 10:19:10PM +0300, Jarkko Sakkinen wrote:
> > On Thu, May 28, 2020 at 01:40:16PM -0400, Don Porter wrote:
> > > Hi Thomas,
> > > 
> > > On 5/28/20 6:29 AM, Thomas Gleixner wrote:
> > > > > Until recently, we were doing proof-of-concept research, not product
> > > > > development, and there are limited hours in the day.  I also hasten to
> > > > > say that the product of research is an article, the software artifact
> > > > > serves as documentation of the experiment.  In contrast, the product of
> > > > > software development is software.  It takes significant time and effort
> > > > > to convert one to the other.  Upstreaming code is of little scientific
> > > > > interest.  But things have changed for our project; we had no users in
> > > > > 2015 and we are now un-cutting corners that are appropriate for research
> > > > > but inappropriate for production.  For a research artifact with an
> > > > > audience that knew the risks, we shipped a module because it was easier
> > > > > to maintain and install than a kernel patch.
> > > >
> > > > I understand that and with a big fat warning and documentation from
> > > > start I wouldn't have complained so vehemently.
> > > 
> > > This is a fair point.  We will fix this ASAP, and I will be more careful
> > > about this going forward.
> > 
> > Are you going to experiment with this patch set and Graphene? Just
> > sanity checking so that I don't unnecessarily do duplicate work.
> > 
> > I ignored most of the discussion since I came here only with the
> > motivation of testing Graphene together with this patch set. I'm
> > assuming that motivation is always good no matter which angle you come
> > from. Thus, I might have missed the part I'm asking.
> 
> This series was heavily tested with Graphene-like workloads.

Is there something then readily available to test such workload with SGX
enabled? Or should I go patching Graphene? Not sure what I should take
from that comment :-)

For me the main point is that I need a tool to create arbitrary work
loads and run them inside enclave, once the SGX support reaches the
upstream. It's not just about testing this particular series.

The reason why I've been passive with this work so far is that I've been
busy combining updating of SGX series for over two years and maintaining
work. Now is the first time when I have time for this.

Actually I found this by searching lore.kernel.org whether anything has
happend with this. Have had a bullet in my backlog for ages.

/Jarkko

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-29  3:07                             ` Jarkko Sakkinen
@ 2020-05-29  3:10                               ` Jarkko Sakkinen
  2020-06-25 15:30                                 ` Don Porter
  0 siblings, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-05-29  3:10 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Don Porter, Thomas Gleixner, Andi Kleen, linux-kernel, bp, luto,
	hpa, dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On Fri, May 29, 2020 at 06:07:23AM +0300, Jarkko Sakkinen wrote:
> On Thu, May 28, 2020 at 03:41:57PM -0400, Sasha Levin wrote:
> > On Thu, May 28, 2020 at 10:19:10PM +0300, Jarkko Sakkinen wrote:
> > > On Thu, May 28, 2020 at 01:40:16PM -0400, Don Porter wrote:
> > > > Hi Thomas,
> > > > 
> > > > On 5/28/20 6:29 AM, Thomas Gleixner wrote:
> > > > > > Until recently, we were doing proof-of-concept research, not product
> > > > > > development, and there are limited hours in the day.  I also hasten to
> > > > > > say that the product of research is an article, the software artifact
> > > > > > serves as documentation of the experiment.  In contrast, the product of
> > > > > > software development is software.  It takes significant time and effort
> > > > > > to convert one to the other.  Upstreaming code is of little scientific
> > > > > > interest.  But things have changed for our project; we had no users in
> > > > > > 2015 and we are now un-cutting corners that are appropriate for research
> > > > > > but inappropriate for production.  For a research artifact with an
> > > > > > audience that knew the risks, we shipped a module because it was easier
> > > > > > to maintain and install than a kernel patch.
> > > > >
> > > > > I understand that and with a big fat warning and documentation from
> > > > > start I wouldn't have complained so vehemently.
> > > > 
> > > > This is a fair point.  We will fix this ASAP, and I will be more careful
> > > > about this going forward.
> > > 
> > > Are you going to experiment with this patch set and Graphene? Just
> > > sanity checking so that I don't unnecessarily do duplicate work.
> > > 
> > > I ignored most of the discussion since I came here only with the
> > > motivation of testing Graphene together with this patch set. I'm
> > > assuming that motivation is always good no matter which angle you come
> > > from. Thus, I might have missed the part I'm asking.
> > 
> > This series was heavily tested with Graphene-like workloads.
> 
> Is there something then readily available to test such workload with SGX
> enabled? Or should I go patching Graphene? Not sure what I should take
> from that comment :-)
> 
> For me the main point is that I need a tool to create arbitrary work
> loads and run them inside enclave, once the SGX support reaches the
> upstream. It's not just about testing this particular series.
> 
> The reason why I've been passive with this work so far is that I've been
> busy combining updating of SGX series for over two years and maintaining
> work. Now is the first time when I have time for this.
> 
> Actually I found this by searching lore.kernel.org whether anything has
> happend with this. Have had a bullet in my backlog for ages.

Just need the info if anyone else is going to do something to Graphene
or not in near future. If not, I will do it myself.

/Jarkko

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-28 18:38                         ` Andy Lutomirski
@ 2020-05-29 15:27                           ` Wojtek Porczyk
  2020-06-25 15:27                             ` Don Porter
  0 siblings, 1 reply; 76+ messages in thread
From: Wojtek Porczyk @ 2020-05-29 15:27 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Don Porter, Thomas Gleixner, Jarkko Sakkinen, Andi Kleen,
	Sasha Levin, linux-kernel, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae

[-- Attachment #1: Type: text/plain, Size: 1163 bytes --]

On Thu, May 28, 2020 at 11:38:01AM -0700, Andy Lutomirski wrote:
> One useful test for the actual kernel patches would be to run your SGX
> workload on a loaded core.  That is, do something like taskset -c
> 0 graphene_thing and, simultaneously, write a trivial infinite loop program
> and run that under taskset -c 0 as well. For good measure, you could have
> perf top or perf record running at the same time.  Look for kernel errors,
> but also look for any evidence of your workload malfunctioning.

We currently run as part of CI several workloads[1], among them LTP tests[2],
and sometimes it's not pretty, because we encounter stability problems in
Graphene+SGX even without the patchset. We'll pick some stable subset and
will let know. Right now we'll have to retool CI for custom kernels, which
will take some back and forth with uni's admins.

[1] https://github.com/oscarlab/graphene/tree/master/Examples
[2] https://github.com/oscarlab/graphene/tree/master/LibOS/shim/test/ltp

-- 
pozdrawiam / best regards
Wojtek Porczyk
Graphene / Invisible Things Lab
 
 I do not fear computers,
 I fear lack of them.
    -- Isaac Asimov

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-29 15:27                           ` Wojtek Porczyk
@ 2020-06-25 15:27                             ` Don Porter
  2020-06-25 21:37                               ` Jarkko Sakkinen
  0 siblings, 1 reply; 76+ messages in thread
From: Don Porter @ 2020-06-25 15:27 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Jarkko Sakkinen, Andi Kleen,
	Sasha Levin, linux-kernel, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae

On 5/29/20 11:27 AM, Wojtek Porczyk wrote:
> On Thu, May 28, 2020 at 11:38:01AM -0700, Andy Lutomirski wrote:
>> One useful test for the actual kernel patches would be to run your SGX
>> workload on a loaded core.  That is, do something like taskset -c
>> 0 graphene_thing and, simultaneously, write a trivial infinite loop program
>> and run that under taskset -c 0 as well. For good measure, you could have
>> perf top or perf record running at the same time.  Look for kernel errors,
>> but also look for any evidence of your workload malfunctioning.
> 
> We currently run as part of CI several workloads[1], among them LTP tests[2],
> and sometimes it's not pretty, because we encounter stability problems in
> Graphene+SGX even without the patchset. We'll pick some stable subset and
> will let know. Right now we'll have to retool CI for custom kernels, which
> will take some back and forth with uni's admins.
> 
> [1] https://github.com/oscarlab/graphene/tree/master/Examples
> [2] https://github.com/oscarlab/graphene/tree/master/LibOS/shim/test/ltp
> 

Following up: we have been running a patched 5.7 kernel with v12 of this 
series on one of our CI workers.  As Wojtek mentions, infrastructure and 
other orthogonal issues took some time.

We have run our complete SGX testing pipelines successfully several 
times with no issues: no errors in Graphene or suspicious kernel messages.

I also did Andy's suggested test:
* Graphene running nginx pinned to core 0
* infinite loop on core 0
* perf top running
* Exercised with non-SGX apache bench several times (~10 minutes of 
testing time) also from core 0

Again, no apparent issues, nothing in dmesg.  I ran a similar setup with 
our SGX-specific Graphene (PAL) unit tests.  Same story: everything 
looks good.

Let us know if we can be of any more help here.

Thanks,
Don

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-05-29  3:10                               ` Jarkko Sakkinen
@ 2020-06-25 15:30                                 ` Don Porter
  2020-06-25 21:40                                   ` Jarkko Sakkinen
  0 siblings, 1 reply; 76+ messages in thread
From: Don Porter @ 2020-06-25 15:30 UTC (permalink / raw)
  To: Jarkko Sakkinen, Sasha Levin
  Cc: Thomas Gleixner, Andi Kleen, linux-kernel, bp, luto, hpa,
	dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On 5/28/20 11:10 PM, Jarkko Sakkinen wrote:
> On Fri, May 29, 2020 at 06:07:23AM +0300, Jarkko Sakkinen wrote:
>>
>> Is there something then readily available to test such workload with SGX
>> enabled? Or should I go patching Graphene? Not sure what I should take
>> from that comment :-)
>>
>> For me the main point is that I need a tool to create arbitrary work
>> loads and run them inside enclave, once the SGX support reaches the
>> upstream. It's not just about testing this particular series.
>>
>> The reason why I've been passive with this work so far is that I've been
>> busy combining updating of SGX series for over two years and maintaining
>> work. Now is the first time when I have time for this.
>>
>> Actually I found this by searching lore.kernel.org whether anything has
>> happend with this. Have had a bullet in my backlog for ages.
> 
> Just need the info if anyone else is going to do something to Graphene
> or not in near future. If not, I will do it myself.
> 
> /Jarkko
> 

In re-reading, I realized you didn't get a clear answer.

We are merging the changes to Graphene to run a patched 5.7 kernel with 
these patches, so it should work for you (or anyone else) once all of 
the changes are merged.  I'd be happy to talk, perhaps off this thread, 
about how we can help you with any other SGX-related kernel testing in 
the future, or issues with running Graphene.

-Don

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-06-25 15:27                             ` Don Porter
@ 2020-06-25 21:37                               ` Jarkko Sakkinen
  2020-07-18 18:19                                 ` Don Porter
  0 siblings, 1 reply; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-06-25 21:37 UTC (permalink / raw)
  To: Don Porter
  Cc: Andy Lutomirski, Thomas Gleixner, Andi Kleen, Sasha Levin,
	linux-kernel, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae

On Thu, Jun 25, 2020 at 11:27:28AM -0400, Don Porter wrote:
> On 5/29/20 11:27 AM, Wojtek Porczyk wrote:
> > On Thu, May 28, 2020 at 11:38:01AM -0700, Andy Lutomirski wrote:
> > > One useful test for the actual kernel patches would be to run your SGX
> > > workload on a loaded core.  That is, do something like taskset -c
> > > 0 graphene_thing and, simultaneously, write a trivial infinite loop program
> > > and run that under taskset -c 0 as well. For good measure, you could have
> > > perf top or perf record running at the same time.  Look for kernel errors,
> > > but also look for any evidence of your workload malfunctioning.
> > 
> > We currently run as part of CI several workloads[1], among them LTP tests[2],
> > and sometimes it's not pretty, because we encounter stability problems in
> > Graphene+SGX even without the patchset. We'll pick some stable subset and
> > will let know. Right now we'll have to retool CI for custom kernels, which
> > will take some back and forth with uni's admins.
> > 
> > [1] https://github.com/oscarlab/graphene/tree/master/Examples
> > [2] https://github.com/oscarlab/graphene/tree/master/LibOS/shim/test/ltp
> > 
> 
> Following up: we have been running a patched 5.7 kernel with v12 of this
> series on one of our CI workers.  As Wojtek mentions, infrastructure and
> other orthogonal issues took some time.
> 
> We have run our complete SGX testing pipelines successfully several times
> with no issues: no errors in Graphene or suspicious kernel messages.
> 
> I also did Andy's suggested test:
> * Graphene running nginx pinned to core 0
> * infinite loop on core 0
> * perf top running
> * Exercised with non-SGX apache bench several times (~10 minutes of testing
> time) also from core 0
> 
> Again, no apparent issues, nothing in dmesg.  I ran a similar setup with our
> SGX-specific Graphene (PAL) unit tests.  Same story: everything looks good.
> 
> Let us know if we can be of any more help here.
> 
> Thanks,
> Don

Can unmodified Graphene-SGX used with these changes?

/Jarkko

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-06-25 15:30                                 ` Don Porter
@ 2020-06-25 21:40                                   ` Jarkko Sakkinen
  0 siblings, 0 replies; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-06-25 21:40 UTC (permalink / raw)
  To: Don Porter
  Cc: Sasha Levin, Thomas Gleixner, Andi Kleen, linux-kernel, bp, luto,
	hpa, dave.hansen, tony.luck, ravi.v.shankar, chang.seok.bae

On Thu, Jun 25, 2020 at 11:30:54AM -0400, Don Porter wrote:
> We are merging the changes to Graphene to run a patched 5.7 kernel with
> these patches, so it should work for you (or anyone else) once all of the
> changes are merged.  I'd be happy to talk, perhaps off this thread, about
> how we can help you with any other SGX-related kernel testing in the future,
> or issues with running Graphene.
> 
> -Don

This is great to hear. And thanks for the proposal. I'll get back to
you once I have had time to knock the ice myself a bit.

My main interest is not the upstreaming process that we are going
through right now, but more like post-upstreaming when the code needs to
be thrown at with arbitrary workloads. I think this will be a perfect
tool maturizing the code over time.

/Jarkko

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-06-25 21:37                               ` Jarkko Sakkinen
@ 2020-07-18 18:19                                 ` Don Porter
  2020-07-23  3:23                                   ` Jarkko Sakkinen
  0 siblings, 1 reply; 76+ messages in thread
From: Don Porter @ 2020-07-18 18:19 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Andy Lutomirski, Thomas Gleixner, Andi Kleen, Sasha Levin,
	linux-kernel, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae

On 6/25/20 5:37 PM, Jarkko Sakkinen wrote:
> 
> Can unmodified Graphene-SGX used with these changes?
> 

Yes.  I just double-checked that all of the needed changes have made it 
to master branch.

I also re-tested on 5.8-rc1 with v13 of the patch, and it looks good.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH v12 00/18] Enable FSGSBASE instructions
  2020-07-18 18:19                                 ` Don Porter
@ 2020-07-23  3:23                                   ` Jarkko Sakkinen
  0 siblings, 0 replies; 76+ messages in thread
From: Jarkko Sakkinen @ 2020-07-23  3:23 UTC (permalink / raw)
  To: Don Porter
  Cc: Andy Lutomirski, Thomas Gleixner, Andi Kleen, Sasha Levin,
	linux-kernel, bp, luto, hpa, dave.hansen, tony.luck,
	ravi.v.shankar, chang.seok.bae

On Sat, Jul 18, 2020 at 02:19:52PM -0400, Don Porter wrote:
> On 6/25/20 5:37 PM, Jarkko Sakkinen wrote:
> > 
> > Can unmodified Graphene-SGX used with these changes?
> > 
> 
> Yes.  I just double-checked that all of the needed changes have made it to
> master branch.
> 
> I also re-tested on 5.8-rc1 with v13 of the patch, and it looks good.

OK, cool, have to play with this once I'm back from vacation (away
WW31-WW32). Thanks for the info.

/Jarkko

^ permalink raw reply	[flat|nested] 76+ messages in thread

end of thread, other threads:[~2020-07-23  3:23 UTC | newest]

Thread overview: 76+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-11  4:52 [PATCH v12 00/18] Enable FSGSBASE instructions Sasha Levin
2020-05-11  4:52 ` [PATCH v12 01/18] x86/ptrace: Prevent ptrace from clearing the FS/GS selector Sasha Levin
2020-05-11  4:52 ` [PATCH v12 02/18] selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Sasha Levin
2020-05-11  4:52 ` [PATCH v12 03/18] x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE Sasha Levin
2020-05-11  4:52 ` [PATCH v12 04/18] x86/entry/64: Clean up paranoid exit Sasha Levin
2020-05-11  4:52 ` [PATCH v12 05/18] x86/entry/64: Switch CR3 before SWAPGS in paranoid entry Sasha Levin
2020-05-11  4:52 ` [PATCH v12 06/18] x86/entry/64: Introduce the FIND_PERCPU_BASE macro Sasha Levin
2020-05-11  4:53 ` [PATCH v12 07/18] x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit Sasha Levin
2020-05-11  4:53 ` [PATCH v12 08/18] x86/entry/64: Document GSBASE handling in the paranoid path Sasha Levin
2020-05-11  4:53 ` [PATCH v12 09/18] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Sasha Levin
2020-05-11  4:53 ` [PATCH v12 10/18] x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions Sasha Levin
2020-05-18 18:20   ` Thomas Gleixner
2020-05-18 20:24     ` Sasha Levin
2020-05-18 22:59       ` Thomas Gleixner
2020-05-19 12:20       ` David Laight
2020-05-19 14:48         ` Thomas Gleixner
2020-05-20  9:13           ` David Laight
2020-05-11  4:53 ` [PATCH v12 11/18] x86/fsgsbase/64: Use FSGSBASE in switch_to() if available Sasha Levin
2020-05-11  4:53 ` [PATCH v12 12/18] x86/fsgsbase/64: move save_fsgs to header file Sasha Levin
2020-05-11  4:53 ` [PATCH v12 13/18] x86/fsgsbase/64: Use FSGSBASE instructions on thread copy and ptrace Sasha Levin
2020-05-11  4:53 ` [PATCH v12 14/18] x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation Sasha Levin
2020-05-11  4:53 ` [PATCH v12 15/18] selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE Sasha Levin
2020-05-11  4:53 ` [PATCH v12 16/18] x86/fsgsbase/64: Enable FSGSBASE on 64bit by default and add a chicken bit Sasha Levin
2020-05-11  4:53 ` [PATCH v12 17/18] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 Sasha Levin
2020-05-11  4:53 ` [PATCH v12 18/18] Documentation/x86/64: Add documentation for GS/FS addressing mode Sasha Levin
2020-05-15  9:24 ` [PATCH v12 00/18] Enable FSGSBASE instructions Jarkko Sakkinen
2020-05-15 16:40   ` Sasha Levin
2020-05-15 17:55     ` Andi Kleen
2020-05-15 23:07       ` Sasha Levin
2020-05-16 12:21       ` Jarkko Sakkinen
2020-05-16  9:50     ` Jarkko Sakkinen
2020-05-18 15:34       ` Andi Kleen
2020-05-18 20:01         ` Jarkko Sakkinen
2020-05-18 23:03           ` Thomas Gleixner
2020-05-19 16:48             ` Jarkko Sakkinen
2020-05-22 20:14               ` Don Porter
2020-05-22 20:55                 ` Dave Hansen
2020-05-23  0:45                 ` Thomas Gleixner
2020-05-24 19:45                   ` hpa
2020-05-24 21:19                     ` Sasha Levin
2020-05-24 23:44                       ` hpa
2020-05-25  7:54                       ` Richard Weinberger
2020-05-25 21:56                         ` Tony Luck
2020-05-26  8:12                         ` David Laight
2020-05-26  8:23                           ` Richard Weinberger
2020-05-27  8:31                     ` Jarkko Sakkinen
2020-05-26 12:42                   ` Don Porter
2020-05-26 20:27                     ` Sasha Levin
2020-05-26 22:03                       ` Don Porter
2020-05-26 22:51                         ` Sasha Levin
2020-05-28 17:37                           ` Don Porter
2020-05-28 10:29                     ` Thomas Gleixner
2020-05-28 17:40                       ` Don Porter
2020-05-28 18:38                         ` Andy Lutomirski
2020-05-29 15:27                           ` Wojtek Porczyk
2020-06-25 15:27                             ` Don Porter
2020-06-25 21:37                               ` Jarkko Sakkinen
2020-07-18 18:19                                 ` Don Porter
2020-07-23  3:23                                   ` Jarkko Sakkinen
2020-05-28 19:19                         ` Jarkko Sakkinen
2020-05-28 19:41                           ` Sasha Levin
2020-05-29  3:07                             ` Jarkko Sakkinen
2020-05-29  3:10                               ` Jarkko Sakkinen
2020-06-25 15:30                                 ` Don Porter
2020-06-25 21:40                                   ` Jarkko Sakkinen
2020-05-23  4:19                 ` Andi Kleen
2020-05-28 10:36                   ` Thomas Gleixner
2020-05-27  8:20                 ` Jarkko Sakkinen
2020-05-27 12:42                   ` Wojtek Porczyk
2020-05-18  9:51     ` Thomas Gleixner
2020-05-18 15:16       ` Sasha Levin
2020-05-18 18:28         ` Thomas Gleixner
2020-05-18 19:36       ` Jarkko Sakkinen
2020-05-18  6:18 ` Christoph Hellwig
2020-05-18 12:33   ` Sasha Levin
2020-05-18 14:53 ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.