linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 00/13] x86: Enable FSGSBASE instructions
@ 2019-02-01 20:53 Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 01/13] taint: Introduce a new taint flag (insecure) Chang S. Bae
                   ` (15 more replies)
  0 siblings, 16 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

Update from v4 [4]:
* Remove the FSGSBASE assembly macros

Update from v3 [3]:
* Raise minimum binutils requirement to use the new instructions directly
* Optimize FIND_PERCPU_BASE macro
* Rename some helper functions, __{rd,wr}gsbase_inactive()
* Use NOKPROBE_SYMBOL instead of __kprobes
* Rebase on top of the helper function fix [6]

Update from v2 [2]:
* Separate out the preparatory patches [5] (merged as of now)
* Bisect the paranoid_entry update patch
* Edit minor nits

Updates from v1 [1]:
* Update the GSBASE update mechanism on the paranoid entry/exit.
* Exclude ptracer backward compatibility patches.
* Include FSGSBASE documentation and enumerate the capability
  for user space
* Add the TAINT_INSECURE flag.

[1] Version 1: https://lore.kernel.org/patchwork/cover/934843
[2] Version 2: https://lore.kernel.org/patchwork/cover/912063
[3] Version 3: https://lore.kernel.org/patchwork/cover/1002725
[4] Version 4: https://lore.kernel.org/patchwork/cover/1032799
[5] https://lore.kernel.org/patchwork/cover/988180
[6] https://lore.kernel.org/patchwork/patch/1017513

Andi Kleen (3):
  x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
  x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
  x86/fsgsbase/64: Add documentation for FSGSBASE

Andy Lutomirski (4):
  x86/fsgsbase/64: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
  x86/fsgsbase/64: Preserve FS/GS state in __switch_to() if FSGSBASE is
    on
  selftests/x86/fsgsbase: Test WRGSBASE
  x86/fsgsbase/64: Enable FSGSBASE by default and add a chicken bit

Chang S. Bae (6):
  taint: Introduce a new taint flag (insecure)
  kbuild: Raise the minimum required binutils version to 2.21
  x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions
  x86/fsgsbase/64: When copying a thread, use the FSGSBASE instructions
    if available
  x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro
  x86/fsgsbase/64: Use the per-CPU base as GSBASE at the paranoid_entry

 .../admin-guide/kernel-parameters.txt         |   2 +
 Documentation/process/changes.rst             |   6 +-
 Documentation/sysctl/kernel.txt               |   1 +
 Documentation/x86/fsgs.txt                    | 104 +++++++++++++++++
 arch/x86/entry/entry_64.S                     |  71 +++++++++---
 arch/x86/include/asm/fsgsbase.h               | 100 ++++++++++++++--
 arch/x86/include/asm/inst.h                   |  15 +++
 arch/x86/include/uapi/asm/hwcap2.h            |   3 +
 arch/x86/kernel/cpu/common.c                  |  22 ++++
 arch/x86/kernel/process_64.c                  | 108 ++++++++++++++++--
 include/linux/kernel.h                        |   3 +-
 kernel/panic.c                                |   1 +
 tools/testing/selftests/x86/fsgsbase.c        | 102 ++++++++++++++++-
 13 files changed, 497 insertions(+), 41 deletions(-)
 create mode 100644 Documentation/x86/fsgs.txt

--
2.19.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v5 01/13] taint: Introduce a new taint flag (insecure)
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-02  2:42   ` Andy Lutomirski
  2019-02-01 20:53 ` [PATCH v5 02/13] x86/fsgsbase/64: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE Chang S. Bae
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

For testing (or root-only) purposes, the new flag will serve to tag the
kernel taint accurately.

When adding support for a new feature, patches need to be applied and
tested incrementally with temporary parameters. Currently, there is no
taint flag for this usage.
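
As an illustration (not part of this patch), a feature gated behind an
unsafe, testing-only parameter could mark the kernel with the new flag;
the parameter name below is hypothetical, and the pattern mirrors how
patch 02 in this series uses the flag:

	/* Hypothetical testing-only option; taints the kernel when set. */
	static bool unsafe_feature_x;

	static __init int setup_unsafe_feature_x(char *arg)
	{
		unsafe_feature_x = true;
		add_taint(TAINT_INSECURE, LOCKDEP_STILL_OK);
		return 1;
	}
	__setup("unsafe_feature_x", setup_unsafe_feature_x);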

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
---
 Documentation/sysctl/kernel.txt | 1 +
 include/linux/kernel.h          | 3 ++-
 kernel/panic.c                  | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 379063e58326..fb4244515314 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -1064,6 +1064,7 @@ ORed together. The letters are seen in "Tainted" line of Oops reports.
  32768 (K): The kernel has been live patched.
  65536 (X): Auxiliary taint, defined and used by for distros.
 131072 (T): The kernel was built with the struct randomization plugin.
+262144 (Z): The kernel is running in a known insecure configuration.
 
 ==============================================================
 
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 8f0e68e250a7..dc149ff8cc52 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -599,7 +599,8 @@ extern enum system_states {
 #define TAINT_LIVEPATCH			15
 #define TAINT_AUX			16
 #define TAINT_RANDSTRUCT		17
-#define TAINT_FLAGS_COUNT		18
+#define TAINT_INSECURE			18
+#define TAINT_FLAGS_COUNT		19
 
 struct taint_flag {
 	char c_true;	/* character printed when tainted */
diff --git a/kernel/panic.c b/kernel/panic.c
index f121e6ba7e11..cb6b90538375 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -362,6 +362,7 @@ const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
 	[ TAINT_LIVEPATCH ]		= { 'K', ' ', true },
 	[ TAINT_AUX ]			= { 'X', ' ', true },
 	[ TAINT_RANDSTRUCT ]		= { 'T', ' ', true },
+	[ TAINT_INSECURE ]		= { 'Z', ' ', false },
 };
 
 /**
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 02/13] x86/fsgsbase/64: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 01/13] taint: Introduce a new taint flag (insecure) Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 03/13] kbuild: Raise the minimum required binutils version to 2.21 Chang S. Bae
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

From: Andy Lutomirski <luto@kernel.org>

This is temporary.  It will allow the next few patches to be tested
incrementally.

Setting unsafe_fsgsbase is a root hole.  Don't do it.

[ chang: Minor fix. Add the TAINT_INSECURE flag. ]
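
For reference, a small user-space sketch (not part of this patch) that
checks whether the new insecure taint is reported; per the sysctl
documentation added in patch 01, TAINT_INSECURE is bit 18 (value 262144):

	#include <stdio.h>

	int main(void)
	{
		unsigned long tainted = 0;
		FILE *f = fopen("/proc/sys/kernel/tainted", "r");

		if (!f)
			return 1;
		if (fscanf(f, "%lu", &tainted) != 1) {
			fclose(f);
			return 1;
		}
		fclose(f);

		/* Bit 18 is the new TAINT_INSECURE ('Z') flag. */
		printf("insecure taint is %s\n",
		       (tainted & (1UL << 18)) ? "set" : "clear");
		return 0;
	}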

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
---
 .../admin-guide/kernel-parameters.txt         |  3 +++
 arch/x86/kernel/cpu/common.c                  | 27 +++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d59dff450614..871260e3e832 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2760,6 +2760,9 @@
 	no5lvl		[X86-64] Disable 5-level paging mode. Forces
 			kernel to use 4-level paging instead.
 
+	unsafe_fsgsbase	[X86] Allow FSGSBASE instructions.  This will be
+			replaced with a nofsgsbase flag.
+
 	no_console_suspend
 			[HW] Never suspend the console
 			Disable suspending of consoles during suspend and
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cb28e98a0659..6e2cba21328f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -365,6 +365,25 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 	cr4_clear_bits(X86_CR4_UMIP);
 }
 
+/*
+ * Temporary hack: FSGSBASE is unsafe until a few kernel code paths are
+ * updated. This allows us to get the kernel ready incrementally. Setting
+ * unsafe_fsgsbase and TAINT_INSECURE flags will allow the series to be
+ * bisected if necessary.
+ *
+ * Once all the pieces are in place, these will go away and be replaced with
+ * a nofsgsbase chicken flag.
+ */
+static bool unsafe_fsgsbase;
+
+static __init int setup_unsafe_fsgsbase(char *arg)
+{
+	unsafe_fsgsbase = true;
+	add_taint(TAINT_INSECURE, LOCKDEP_STILL_OK);
+	return 1;
+}
+__setup("unsafe_fsgsbase", setup_unsafe_fsgsbase);
+
 /*
  * Protection Keys are not available in 32-bit mode.
  */
@@ -1352,6 +1371,14 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	setup_smap(c);
 	setup_umip(c);
 
+	/* Enable FSGSBASE instructions if available. */
+	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
+		if (unsafe_fsgsbase)
+			cr4_set_bits(X86_CR4_FSGSBASE);
+		else
+			clear_cpu_cap(c, X86_FEATURE_FSGSBASE);
+	}
+
 	/*
 	 * The vendor-specific functions might have changed features.
 	 * Now we do "generic changes."
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 03/13] kbuild: Raise the minimum required binutils version to 2.21
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 01/13] taint: Introduce a new taint flag (insecure) Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 02/13] x86/fsgsbase/64: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-02  2:45   ` Andy Lutomirski
  2019-02-01 20:53 ` [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Chang S. Bae
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML, Linus Torvalds

Raising the minimum binutils version allows new instructions, such as the
FSGSBASE instructions used later in this series, to be emitted directly
from inline assembly instead of hand-coded byte sequences.
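
For example (a sketch, not part of this patch), with binutils 2.21 or
later the FSGSBASE mnemonics added later in this series can be used
directly from inline assembly:

	/* Requires GAS >= 2.21; older assemblers reject the mnemonic. */
	static inline unsigned long example_rdgsbase(void)
	{
		unsigned long gsbase;

		asm volatile("rdgsbase %0" : "=r" (gsbase) :: "memory");
		return gsbase;
	}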

Suggested-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
---
 Documentation/process/changes.rst | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/Documentation/process/changes.rst b/Documentation/process/changes.rst
index 18735dc460a0..0a18075c485e 100644
--- a/Documentation/process/changes.rst
+++ b/Documentation/process/changes.rst
@@ -31,7 +31,7 @@ you probably needn't concern yourself with isdn4k-utils.
 ====================== ===============  ========================================
 GNU C                  4.6              gcc --version
 GNU make               3.81             make --version
-binutils               2.20             ld -v
+binutils               2.21             ld -v
 flex                   2.5.35           flex --version
 bison                  2.0              bison --version
 util-linux             2.10o            fdformat --version
@@ -77,9 +77,7 @@ You will need GNU make 3.81 or later to build the kernel.
 Binutils
 --------
 
-The build system has, as of 4.13, switched to using thin archives (`ar T`)
-rather than incremental linking (`ld -r`) for built-in.a intermediate steps.
-This requires binutils 2.20 or newer.
+Binutils 2.21 or newer is needed to build the kernel.
 
 pkg-config
 ----------
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (2 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 03/13] kbuild: Raise the minimum required binutils version to 2.21 Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-02  2:52   ` Andy Lutomirski
  2019-02-01 20:53 ` [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics/macros " Chang S. Bae
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

From: Andi Kleen <ak@linux.intel.com>

Add C intrinsics and assembler macros for the new FSBASE and GSBASE
instructions.

Very straightforward. Used in follow-on patches.

[ luto: Rename the variables from FS and GS to FSBASE and GSBASE and
  make <asm/fsgsbase.h> safe to include on 32-bit kernels. ]

v2: Use __always_inline

[ chang: Revise the changelog. Place them in <asm/fsgsbase.h>. Replace
  the macros with GAS-compatible ones. ]

If GCC supports it, we can add -mfsgsbase to CFLAGS and use the builtins
here for extra performance.

[ chang: Use FSGSBASE instructions directly. Removed GAS-compatible
  macros as the minimum required binutils (v2.21) supports the FSGSBASE
  instructions. ]
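
As a usage sketch (mirroring what the next patch in this series does),
callers are expected to guard the intrinsics with the CPU feature check:

	unsigned long fsbase;

	if (static_cpu_has(X86_FEATURE_FSGSBASE))
		fsbase = rdfsbase();		/* fast FSGSBASE path */
	else
		rdmsrl(MSR_FS_BASE, fsbase);	/* legacy MSR fallback */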

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fsgsbase.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
index bca4c743de77..fdd1177499b4 100644
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -19,6 +19,36 @@ extern unsigned long x86_gsbase_read_task(struct task_struct *task);
 extern void x86_fsbase_write_task(struct task_struct *task, unsigned long fsbase);
 extern void x86_gsbase_write_task(struct task_struct *task, unsigned long gsbase);
 
+/* Must be protected by X86_FEATURE_FSGSBASE check. */
+
+static __always_inline unsigned long rdfsbase(void)
+{
+	unsigned long fsbase;
+
+	asm volatile("rdfsbase %0" : "=r" (fsbase) :: "memory");
+
+	return fsbase;
+}
+
+static __always_inline unsigned long rdgsbase(void)
+{
+	unsigned long gsbase;
+
+	asm volatile("rdgsbase %0" : "=r" (gsbase) :: "memory");
+
+	return gsbase;
+}
+
+static __always_inline void wrfsbase(unsigned long fsbase)
+{
+	asm volatile("wrfsbase %0" :: "r" (fsbase) : "memory");
+}
+
+static __always_inline void wrgsbase(unsigned long gsbase)
+{
+	asm volatile("wrgsbase %0" :: "r" (gsbase) : "memory");
+}
+
 /* Helper functions for reading/writing FS/GS base */
 
 static inline unsigned long x86_fsbase_read_cpu(void)
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics/macros for FSGSBASE instructions
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (3 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 05/13] x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions Chang S. Bae
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

From: Andi Kleen <ak@linux.intel.com>

Add C intrinsics and assembler macros for the new FSBASE and GSBASE
instructions.

Very straightforward. Used in follow-on patches.

[ luto: Rename the variables from FS and GS to FSBASE and GSBASE and
  make <asm/fsgsbase.h> safe to include on 32-bit kernels. ]

v2: Use __always_inline

[ chang: Revise the changelog. Place them in <asm/fsgsbase.h>. Replace
  the macros with GAS-compatible ones. ]

If GCC supports it, we can add -mfsgsbase to CFLAGS and use the builtins
here for extra performance.

[ chang: Use FSGSBASE instructions directly ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fsgsbase.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
index bca4c743de77..fdd1177499b4 100644
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -19,6 +19,36 @@ extern unsigned long x86_gsbase_read_task(struct task_struct *task);
 extern void x86_fsbase_write_task(struct task_struct *task, unsigned long fsbase);
 extern void x86_gsbase_write_task(struct task_struct *task, unsigned long gsbase);
 
+/* Must be protected by X86_FEATURE_FSGSBASE check. */
+
+static __always_inline unsigned long rdfsbase(void)
+{
+	unsigned long fsbase;
+
+	asm volatile("rdfsbase %0" : "=r" (fsbase) :: "memory");
+
+	return fsbase;
+}
+
+static __always_inline unsigned long rdgsbase(void)
+{
+	unsigned long gsbase;
+
+	asm volatile("rdgsbase %0" : "=r" (gsbase) :: "memory");
+
+	return gsbase;
+}
+
+static __always_inline void wrfsbase(unsigned long fsbase)
+{
+	asm volatile("wrfsbase %0" :: "r" (fsbase) : "memory");
+}
+
+static __always_inline void wrgsbase(unsigned long gsbase)
+{
+	asm volatile("wrgsbase %0" :: "r" (gsbase) : "memory");
+}
+
 /* Helper functions for reading/writing FS/GS base */
 
 static inline unsigned long x86_fsbase_read_cpu(void)
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 05/13] x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (4 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics/macros " Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-02  2:57   ` Andy Lutomirski
  2019-02-01 20:53 ` [PATCH v5 06/13] x86/fsgsbase/64: Preserve FS/GS state in __switch_to() if FSGSBASE is on Chang S. Bae
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML, Andrew Cooper

The helper functions will switch on faster accesses to FSBASE and GSBASE
when the FSGSBASE feature is enabled.

Accessing the user GSBASE requires a pair of SWAPGS operations. This could
be avoided by saving the user GSBASE at kernel entry, updating it as it
changes, and restoring it at kernel exit. However, that approach spends
more cycles on the saves and restores, and experiments measured little or
no benefit from it.

Also, introduce __{rd,wr}gsbase_inactive() as helpers to access the user
GSBASE with SWAPGS. Note that paravirt hooks can be added for Xen PV
later, since it may allow a very efficient but different implementation.

[ Use NOKPROBE_SYMBOL instead of __kprobes ]

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 arch/x86/include/asm/fsgsbase.h | 27 +++++++-------
 arch/x86/kernel/process_64.c    | 62 +++++++++++++++++++++++++++++++--
 2 files changed, 72 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
index fdd1177499b4..aefd53767a5d 100644
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -49,35 +49,32 @@ static __always_inline void wrgsbase(unsigned long gsbase)
 	asm volatile("wrgsbase %0" :: "r" (gsbase) : "memory");
 }
 
+#include <asm/cpufeature.h>
+
 /* Helper functions for reading/writing FS/GS base */
 
 static inline unsigned long x86_fsbase_read_cpu(void)
 {
 	unsigned long fsbase;
 
-	rdmsrl(MSR_FS_BASE, fsbase);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE))
+		fsbase = rdfsbase();
+	else
+		rdmsrl(MSR_FS_BASE, fsbase);
 
 	return fsbase;
 }
 
-static inline unsigned long x86_gsbase_read_cpu_inactive(void)
-{
-	unsigned long gsbase;
-
-	rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
-
-	return gsbase;
-}
-
 static inline void x86_fsbase_write_cpu(unsigned long fsbase)
 {
-	wrmsrl(MSR_FS_BASE, fsbase);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE))
+		wrfsbase(fsbase);
+	else
+		wrmsrl(MSR_FS_BASE, fsbase);
 }
 
-static inline void x86_gsbase_write_cpu_inactive(unsigned long gsbase)
-{
-	wrmsrl(MSR_KERNEL_GS_BASE, gsbase);
-}
+extern unsigned long x86_gsbase_read_cpu_inactive(void);
+extern void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
 
 #endif /* CONFIG_X86_64 */
 
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 6a62f4af9fcf..ebc55ed31fe7 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -160,6 +160,42 @@ enum which_selector {
 	GS
 };
 
+/*
+ * Interrupts are disabled here. Out of line to be protected
+ * from kprobes. It is not used on Xen paravirt. When paravirt
+ * support is needed, it needs to be renamed with native_ prefix.
+ */
+static noinline unsigned long __rdgsbase_inactive(void)
+{
+	unsigned long gsbase, flags;
+
+	local_irq_save(flags);
+	native_swapgs();
+	gsbase = rdgsbase();
+	native_swapgs();
+	local_irq_restore(flags);
+
+	return gsbase;
+}
+NOKPROBE_SYMBOL(__rdgsbase_inactive);
+
+/*
+ * Interrupts are disabled here. Out of line to be protected
+ * from kprobes. It is not used on Xen paravirt. When paravirt
+ * support is needed, it needs to be renamed with native_ prefix.
+ */
+static noinline void __wrgsbase_inactive(unsigned long gsbase)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	native_swapgs();
+	wrgsbase(gsbase);
+	native_swapgs();
+	local_irq_restore(flags);
+}
+NOKPROBE_SYMBOL(__wrgsbase_inactive);
+
 /*
  * Saves the FS or GS base for an outgoing thread if FSGSBASE extensions are
  * not available.  The goal is to be reasonably fast on non-FSGSBASE systems.
@@ -338,13 +374,34 @@ static unsigned long x86_fsgsbase_read_task(struct task_struct *task,
 	return base;
 }
 
+unsigned long x86_gsbase_read_cpu_inactive(void)
+{
+	unsigned long gsbase;
+
+	if (static_cpu_has(X86_FEATURE_FSGSBASE))
+		gsbase = __rdgsbase_inactive();
+	else
+		rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
+
+	return gsbase;
+}
+
+void x86_gsbase_write_cpu_inactive(unsigned long gsbase)
+{
+	if (static_cpu_has(X86_FEATURE_FSGSBASE))
+		__wrgsbase_inactive(gsbase);
+	else
+		wrmsrl(MSR_KERNEL_GS_BASE, gsbase);
+}
+
 unsigned long x86_fsbase_read_task(struct task_struct *task)
 {
 	unsigned long fsbase;
 
 	if (task == current)
 		fsbase = x86_fsbase_read_cpu();
-	else if (task->thread.fsindex == 0)
+	else if (static_cpu_has(X86_FEATURE_FSGSBASE) ||
+		 (task->thread.fsindex == 0))
 		fsbase = task->thread.fsbase;
 	else
 		fsbase = x86_fsgsbase_read_task(task, task->thread.fsindex);
@@ -358,7 +415,8 @@ unsigned long x86_gsbase_read_task(struct task_struct *task)
 
 	if (task == current)
 		gsbase = x86_gsbase_read_cpu_inactive();
-	else if (task->thread.gsindex == 0)
+	else if (static_cpu_has(X86_FEATURE_FSGSBASE) ||
+		 (task->thread.gsindex == 0))
 		gsbase = task->thread.gsbase;
 	else
 		gsbase = x86_fsgsbase_read_task(task, task->thread.gsindex);
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 06/13] x86/fsgsbase/64: Preserve FS/GS state in __switch_to() if FSGSBASE is on
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (5 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 05/13] x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 07/13] x86/fsgsbase/64: When copying a thread, use the FSGSBASE instructions if available Chang S. Bae
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

From: Andy Lutomirski <luto@kernel.org>

With the new FSGSBASE instructions, we can efficiently read and write
the FSBASE and GSBASE in __switch_to().  Use that capability to preserve
the full state.

This will enable user code to do whatever it wants with the new
instructions without any kernel-induced gotchas.  (There can still be
architectural gotchas: movl %gs,%eax; movl %eax,%gs may change GSBASE
if WRGSBASE was used, but users are expected to read the CPU manual
before doing things like that.)
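
A user-space sketch of that gotcha (illustration only; rdgsbase() and
wrgsbase() stand for wrappers around the new instructions, as in the
selftest later in this series):

	wrgsbase(0x1234000);			/* set GSBASE directly */

	/* Reload the GS selector, as described above. */
	asm volatile("mov %%gs, %%eax\n\t"
		     "mov %%eax, %%gs" ::: "eax", "memory");

	/*
	 * rdgsbase() may now return 0 instead of 0x1234000: reloading
	 * the selector can rewrite the base from the descriptor (or
	 * clear it for a null selector).
	 */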

This is a considerable speedup.  It seems to save about 100 cycles
per context switch compared to the baseline 4.6-rc1 behavior on my
Skylake laptop.

[ chang: 5~10% performance improvements were seen with a context switch
  benchmark that ran threads with different FS/GSBASE values (relative to
  the 4.16 baseline). Minor edit on the changelog. ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/process_64.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ebc55ed31fe7..d8ade9530fdb 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -245,8 +245,18 @@ static __always_inline void save_fsgs(struct task_struct *task)
 {
 	savesegment(fs, task->thread.fsindex);
 	savesegment(gs, task->thread.gsindex);
-	save_base_legacy(task, task->thread.fsindex, FS);
-	save_base_legacy(task, task->thread.gsindex, GS);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		/*
+		 * If FSGSBASE is enabled, we can't make any useful guesses
+		 * about the base, and user code expects us to save the current
+		 * value.  Fortunately, reading the base directly is efficient.
+		 */
+		task->thread.fsbase = rdfsbase();
+		task->thread.gsbase = __rdgsbase_inactive();
+	} else {
+		save_base_legacy(task, task->thread.fsindex, FS);
+		save_base_legacy(task, task->thread.gsindex, GS);
+	}
 }
 
 #if IS_ENABLED(CONFIG_KVM)
@@ -325,10 +335,22 @@ static __always_inline void load_seg_legacy(unsigned short prev_index,
 static __always_inline void x86_fsgsbase_load(struct thread_struct *prev,
 					      struct thread_struct *next)
 {
-	load_seg_legacy(prev->fsindex, prev->fsbase,
-			next->fsindex, next->fsbase, FS);
-	load_seg_legacy(prev->gsindex, prev->gsbase,
-			next->gsindex, next->gsbase, GS);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		/* Update the FS and GS selectors if they could have changed. */
+		if (unlikely(prev->fsindex || next->fsindex))
+			loadseg(FS, next->fsindex);
+		if (unlikely(prev->gsindex || next->gsindex))
+			loadseg(GS, next->gsindex);
+
+		/* Update the bases. */
+		wrfsbase(next->fsbase);
+		__wrgsbase_inactive(next->gsbase);
+	} else {
+		load_seg_legacy(prev->fsindex, prev->fsbase,
+				next->fsindex, next->fsbase, FS);
+		load_seg_legacy(prev->gsindex, prev->gsbase,
+				next->gsindex, next->gsbase, GS);
+	}
 }
 
 static unsigned long x86_fsgsbase_read_task(struct task_struct *task,
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 07/13] x86/fsgsbase/64: When copying a thread, use the FSGSBASE instructions if available
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (6 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 06/13] x86/fsgsbase/64: Preserve FS/GS state in __switch_to() if FSGSBASE is on Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-02 17:28   ` Andy Lutomirski
  2019-02-01 20:53 ` [PATCH v5 08/13] x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro Chang S. Bae
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

Copy the real FS/GSBASE values instead of approximations when FSGSBASE is
enabled.

Factoring this code out into save_fsgs() would not result in the same
behavior, because save_base_legacy() does not copy FS/GSBASE when the
index is zero.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/process_64.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d8ade9530fdb..648e43b58c69 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -477,10 +477,16 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp = (unsigned long) fork_frame;
 	p->thread.io_bitmap_ptr = NULL;
 
-	savesegment(gs, p->thread.gsindex);
-	p->thread.gsbase = p->thread.gsindex ? 0 : me->thread.gsbase;
 	savesegment(fs, p->thread.fsindex);
-	p->thread.fsbase = p->thread.fsindex ? 0 : me->thread.fsbase;
+	savesegment(gs, p->thread.gsindex);
+	if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+		p->thread.fsbase = rdfsbase();
+		p->thread.gsbase = __rdgsbase_inactive();
+	} else {
+		/* save_base_legacy() does not set base when index is zero. */
+		p->thread.fsbase = p->thread.fsindex ? 0 : me->thread.fsbase;
+		p->thread.gsbase = p->thread.gsindex ? 0 : me->thread.gsbase;
+	}
 	savesegment(es, p->thread.es);
 	savesegment(ds, p->thread.ds);
 	memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 08/13] x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (7 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 07/13] x86/fsgsbase/64: When copying a thread, use the FSGSBASE instructions if available Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-02 17:17   ` Andy Lutomirski
  2019-02-01 20:53 ` [PATCH v5 09/13] x86/fsgsbase/64: Use the per-CPU base as GSBASE at the paranoid_entry Chang S. Bae
                   ` (6 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML, Dave Hansen

GSBASE is used to find per-CPU data in the kernel. But when GSBASE is
unknown, the per-CPU base can be found from the per_cpu_offset table,
indexed by the CPU NR. The CPU NR is extracted from the limit field of the
CPUNODE entry in the GDT, or via the RDPID instruction.

Also, add a GAS-compatible RDPID macro.

The new macro will be used in a following patch.
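
In C terms, the macro computes roughly the following (a sketch for
illustration only; read_cpunode_nr() is a hypothetical stand-in for the
LSL/RDPID step):

	unsigned long cpu_and_node, cpu, gsbase;

	cpu_and_node = read_cpunode_nr();	/* LSL on __CPUNODE_SEG, or RDPID */
	cpu = cpu_and_node & VDSO_CPUNODE_MASK;	/* strip the node bits */
	gsbase = __per_cpu_offset[cpu];		/* per-CPU base for this CPU */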

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/include/asm/fsgsbase.h | 46 +++++++++++++++++++++++++++++++++
 arch/x86/include/asm/inst.h     | 15 +++++++++++
 2 files changed, 61 insertions(+)

diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
index aefd53767a5d..eecca2250748 100644
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -78,6 +78,52 @@ extern void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
 
 #endif /* CONFIG_X86_64 */
 
+#else /* __ASSEMBLY__ */
+
+#ifdef CONFIG_X86_64
+
+#include <asm/inst.h>
+
+#ifdef CONFIG_SMP
+
+/*
+ * CPU/node NR is loaded from the limit (size) field of a special segment
+ * descriptor entry in GDT.
+ */
+.macro LOAD_CPU_AND_NODE_SEG_LIMIT reg:req
+	movq	$__CPUNODE_SEG, \reg
+	lsl	\reg, \reg
+.endm
+
+/*
+ * Fetch the per-CPU GSBASE value for this processor and put it in @reg.
+ * We normally use %gs for accessing per-CPU data, but we are setting up
+ * %gs here and obviously can not use %gs itself to access per-CPU data.
+ */
+.macro FIND_PERCPU_BASE reg:req
+	/*
+	 * The CPU/node NR is initialized earlier, directly in cpu_init().
+	 * The CPU NR is extracted from it.
+	 */
+	ALTERNATIVE \
+		"LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
+		"RDPID	\reg", \
+		X86_FEATURE_RDPID
+	andq	$VDSO_CPUNODE_MASK, \reg
+	movq	__per_cpu_offset(, \reg, 8), \reg
+.endm
+
+#else
+
+.macro FIND_PERCPU_BASE reg:req
+	/* Tracking the base offset value */
+	movq	pcpu_unit_offsets(%rip), \reg
+.endm
+
+#endif /* CONFIG_SMP */
+
+#endif /* CONFIG_X86_64 */
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_FSGSBASE_H */
diff --git a/arch/x86/include/asm/inst.h b/arch/x86/include/asm/inst.h
index f5a796da07f8..d063841a17e3 100644
--- a/arch/x86/include/asm/inst.h
+++ b/arch/x86/include/asm/inst.h
@@ -306,6 +306,21 @@
 	.endif
 	MODRM 0xc0 movq_r64_xmm_opd1 movq_r64_xmm_opd2
 	.endm
+
+.macro RDPID opd
+	REG_TYPE rdpid_opd_type \opd
+	.if rdpid_opd_type == REG_TYPE_R64
+	R64_NUM rdpid_opd \opd
+	.else
+	R32_NUM rdpid_opd \opd
+	.endif
+	.byte 0xf3
+	.if rdpid_opd > 7
+	PFX_REX rdpid_opd 0
+	.endif
+	.byte 0x0f, 0xc7
+	MODRM 0xc0 rdpid_opd 0x7
+.endm
 #endif
 
 #endif
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 09/13] x86/fsgsbase/64: Use the per-CPU base as GSBASE at the paranoid_entry
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (8 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 08/13] x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 10/13] selftests/x86/fsgsbase: Test WRGSBASE Chang S. Bae
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML, Dave Hansen

The FSGSBASE instructions allow fast access to GSBASE.  With them, the
per-CPU base value can always be copied to GSBASE at paranoid_entry, and
the original GSBASE value restored at the exit.

So far, direct GSBASE modification has not been allowed from user space,
so GSBASE has been swapped conditionally, based on the kernel-enforced
convention that a negative GSBASE indicates a kernel value. But once
FSGSBASE is enabled, user space can put an arbitrary value in GSBASE. This
change ensures a correct GSBASE value even then.

Also, factor out the RDMSR-based GSBASE read into a new macro,
READ_MSR_GSBASE.
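
In pseudo-C, the new handling looks roughly like this (illustration only;
the helpers named here are hypothetical stand-ins for the assembly below):

	/* paranoid_entry */
	if (fsgsbase_enabled) {
		saved_gsbase = rdgsbase();	/* may be an arbitrary user value */
		wrgsbase(find_percpu_base());	/* i.e. FIND_PERCPU_BASE */
	} else if ((long)gsbase_msr >= 0) {	/* positive => user value */
		swapgs();			/* SWAPGS again on exit */
	}

	/* paranoid_exit */
	if (fsgsbase_enabled)
		wrgsbase(saved_gsbase);		/* restore the original GSBASE */
	else if (swapgs_was_done)
		swapgs();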

Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/entry/entry_64.S       | 71 +++++++++++++++++++++++++++------
 arch/x86/include/asm/fsgsbase.h |  9 +++++
 2 files changed, 67 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 1f0efdb7b629..9df528565e40 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -38,6 +38,7 @@
 #include <asm/export.h>
 #include <asm/frame.h>
 #include <asm/nospec-branch.h>
+#include <asm/fsgsbase.h>
 #include <linux/err.h>
 
 #include "calling.h"
@@ -934,10 +935,14 @@ ENTRY(\sym)
 	addq	$EXCEPTION_STKSZ, CPU_TSS_IST(\shift_ist)
 	.endif
 
-	/* these procedures expect "no swapgs" flag in ebx */
 	.if \paranoid
+	/*
+	 * With FSGSBASE, original GSBASE is stored in %rbx
+	 * Without FSGSBASE, expect "no swapgs" flag in %ebx
+	 */
 	jmp	paranoid_exit
 	.else
+	/* Expect "no swapgs" flag in %ebx */
 	jmp	error_exit
 	.endif
 
@@ -1151,22 +1156,24 @@ idtentry machine_check		do_mce			has_error_code=0	paranoid=1
 #endif
 
 /*
- * Save all registers in pt_regs, and switch gs if needed.
- * Use slow, but surefire "are we in kernel?" check.
- * Return: ebx=0: need swapgs on exit, ebx=1: otherwise
+ * Save all registers in pt_regs.
+ *
+ * When FSGSBASE enabled, current GSBASE is always copied to %rbx.
+ *
+ * Without FSGSBASE, SWAPGS is needed when entering from userspace.
+ * A positive GSBASE means it is a user value and a negative GSBASE
+ * means it is a kernel value.
+ *
+ * Return:
+ * 	With FSGSBASE, %rbx has current GSBASE.
+ * 	Without that,
+ *		%ebx=0: need SWAPGS on exit, %ebx=1: otherwise
  */
 ENTRY(paranoid_entry)
 	UNWIND_HINT_FUNC
 	cld
 	PUSH_AND_CLEAR_REGS save_ret=1
 	ENCODE_FRAME_POINTER 8
-	movl	$1, %ebx
-	movl	$MSR_GS_BASE, %ecx
-	rdmsr
-	testl	%edx, %edx
-	js	1f				/* negative -> in kernel */
-	SWAPGS
-	xorl	%ebx, %ebx
 
 1:
 	/*
@@ -1178,9 +1185,38 @@ ENTRY(paranoid_entry)
 	 * This is also why CS (stashed in the "iret frame" by the
 	 * hardware at entry) can not be used: this may be a return
 	 * to kernel code, but with a user CR3 value.
+	 *
+	 * As long as this PTI macro doesn't depend on kernel GSBASE,
+	 * we can do it early. This is because FIND_PERCPU_BASE
+	 * references data in kernel space.
 	 */
 	SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
 
+	/*
+	 * Read GSBASE by RDGSBASE. Kernel GSBASE is found
+	 * from the per-CPU offset table with a CPU NR.
+	 */
+	ALTERNATIVE "jmp .Lparanoid_entry_no_fsgsbase",	"",\
+		X86_FEATURE_FSGSBASE
+	rdgsbase	%rbx
+	FIND_PERCPU_BASE	%rax
+	wrgsbase	%rax
+	ret
+
+.Lparanoid_entry_no_fsgsbase:
+	movl	$1, %ebx
+	/*
+	 * FSGSBASE is not in use, so depend on the kernel-enforced
+	 * convention that a negative GSBASE indicates a kernel value.
+	 */
+	READ_MSR_GSBASE save_reg=%edx
+	testl	%edx, %edx	/* Negative -> in kernel */
+	jns	.Lparanoid_entry_swapgs
+	ret
+
+.Lparanoid_entry_swapgs:
+	SWAPGS
+	xorl	%ebx, %ebx
 	ret
 END(paranoid_entry)
 
@@ -1194,12 +1230,21 @@ END(paranoid_entry)
  * be complicated.  Fortunately, we there's no good reason
  * to try to handle preemption here.
  *
- * On entry, ebx is "no swapgs" flag (1: don't need swapgs, 0: need it)
+ * On entry,
+ *	With FSGSBASE,
+ *		%rbx is original GSBASE that needs to be restored on the exit
+ *	Without that,
+ * 		%ebx is "no swapgs" flag (1: don't need swapgs, 0: need it)
  */
 ENTRY(paranoid_exit)
 	UNWIND_HINT_REGS
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF_DEBUG
+	ALTERNATIVE "jmp .Lparanoid_exit_no_fsgsbase",	"nop",\
+		X86_FEATURE_FSGSBASE
+	wrgsbase	%rbx
+	jmp	.Lparanoid_exit_no_swapgs;
+.Lparanoid_exit_no_fsgsbase:
 	testl	%ebx, %ebx			/* swapgs needed? */
 	jnz	.Lparanoid_exit_no_swapgs
 	TRACE_IRQS_IRETQ
@@ -1212,7 +1257,7 @@ ENTRY(paranoid_exit)
 	/* Always restore stashed CR3 value (see paranoid_entry) */
 	RESTORE_CR3	scratch_reg=%rbx save_reg=%r14
 .Lparanoid_exit_restore:
-	jmp restore_regs_and_return_to_kernel
+	jmp	restore_regs_and_return_to_kernel
 END(paranoid_exit)
 
 /*
diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
index eecca2250748..1cb7b03c107a 100644
--- a/arch/x86/include/asm/fsgsbase.h
+++ b/arch/x86/include/asm/fsgsbase.h
@@ -122,6 +122,15 @@ extern void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
 
 #endif /* CONFIG_SMP */
 
+.macro READ_MSR_GSBASE save_reg:req
+	movl	$MSR_GS_BASE, %ecx
+	/* Read MSR specified by %ecx into %edx:%eax */
+	rdmsr
+	.ifnc \save_reg, %edx
+	movl	%edx, \save_reg
+	.endif
+.endm
+
 #endif /* CONFIG_X86_64 */
 
 #endif /* __ASSEMBLY__ */
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 10/13] selftests/x86/fsgsbase: Test WRGSBASE
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (9 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 09/13] x86/fsgsbase/64: Use the per-CPU base as GSBASE at the paranoid_entry Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 11/13] x86/fsgsbase/64: Enable FSGSBASE by default and add a chicken bit Chang S. Bae
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

From: Andy Lutomirski <luto@kernel.org>

This validates that GS and GSBASE are independently preserved across
context switches.

[ chang: Use FSGSBASE instructions directly instead of .byte ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
---
 tools/testing/selftests/x86/fsgsbase.c | 102 ++++++++++++++++++++++++-
 1 file changed, 99 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/x86/fsgsbase.c b/tools/testing/selftests/x86/fsgsbase.c
index f249e042b3b5..5956475972f1 100644
--- a/tools/testing/selftests/x86/fsgsbase.c
+++ b/tools/testing/selftests/x86/fsgsbase.c
@@ -23,6 +23,7 @@
 #include <pthread.h>
 #include <asm/ldt.h>
 #include <sys/mman.h>
+#include <setjmp.h>
 
 #ifndef __x86_64__
 # error This test is 64-bit only
@@ -71,6 +72,43 @@ static void sigsegv(int sig, siginfo_t *si, void *ctx_void)
 
 }
 
+static jmp_buf jmpbuf;
+
+static void sigill(int sig, siginfo_t *si, void *ctx_void)
+{
+	siglongjmp(jmpbuf, 1);
+}
+
+static bool have_fsgsbase;
+
+static inline unsigned long rdgsbase(void)
+{
+	unsigned long gsbase;
+
+	asm volatile("rdgsbase %0" : "=r" (gsbase) :: "memory");
+
+	return gsbase;
+}
+
+static inline unsigned long rdfsbase(void)
+{
+	unsigned long fsbase;
+
+	asm volatile("rdfsbase %0" : "=r" (fsbase) :: "memory");
+
+	return fsbase;
+}
+
+static inline void wrgsbase(unsigned long gsbase)
+{
+	asm volatile("wrgsbase %0" :: "r" (gsbase) : "memory");
+}
+
+static inline void wrfsbase(unsigned long fsbase)
+{
+	asm volatile("wrfsbase %0" :: "r" (fsbase) : "memory");
+}
+
 enum which_base { FS, GS };
 
 static unsigned long read_base(enum which_base which)
@@ -199,14 +237,16 @@ static void do_remote_base()
 	       to_set, hard_zero ? " and clear gs" : "", sel);
 }
 
-void do_unexpected_base(void)
+static __thread int set_thread_area_entry_number = -1;
+
+static void do_unexpected_base(void)
 {
 	/*
 	 * The goal here is to try to arrange for GS == 0, GSBASE !=
 	 * 0, and for the the kernel the think that GSBASE == 0.
 	 *
 	 * To make the test as reliable as possible, this uses
-	 * explicit descriptorss.  (This is not the only way.  This
+	 * explicit descriptors.  (This is not the only way.  This
 	 * could use ARCH_SET_GS with a low, nonzero base, but the
 	 * relevant side effect of ARCH_SET_GS could change.)
 	 */
@@ -239,7 +279,7 @@ void do_unexpected_base(void)
 			MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
 		memcpy(low_desc, &desc, sizeof(desc));
 
-		low_desc->entry_number = -1;
+		low_desc->entry_number = set_thread_area_entry_number;
 
 		/* 32-bit set_thread_area */
 		long ret;
@@ -254,6 +294,8 @@ void do_unexpected_base(void)
 			return;
 		}
 		printf("\tother thread: using GDT slot %d\n", desc.entry_number);
+		set_thread_area_entry_number = desc.entry_number;
+
 		asm volatile ("mov %0, %%gs" : : "rm" ((unsigned short)((desc.entry_number << 3) | 0x3)));
 	}
 
@@ -265,6 +307,34 @@ void do_unexpected_base(void)
 	asm volatile ("mov %0, %%gs" : : "rm" ((unsigned short)0));
 }
 
+void test_wrbase(unsigned short index, unsigned long base)
+{
+	unsigned short newindex;
+	unsigned long newbase;
+
+	printf("[RUN]\tGS = 0x%hx, GSBASE = 0x%lx\n", index, base);
+
+	asm volatile ("mov %0, %%gs" : : "rm" (index));
+	wrgsbase(base);
+
+	remote_base = 0;
+	ftx = 1;
+	syscall(SYS_futex, &ftx, FUTEX_WAKE, 0, NULL, NULL, 0);
+	while (ftx != 0)
+		syscall(SYS_futex, &ftx, FUTEX_WAIT, 1, NULL, NULL, 0);
+
+	asm volatile ("mov %%gs, %0" : "=rm" (newindex));
+	newbase = rdgsbase();
+
+	if (newindex == index && newbase == base) {
+		printf("[OK]\tIndex and base were preserved\n");
+	} else {
+		printf("[FAIL]\tAfter switch, GS = 0x%hx and GSBASE = 0x%lx\n",
+		       newindex, newbase);
+		nerrs++;
+	}
+}
+
 static void *threadproc(void *ctx)
 {
 	while (1) {
@@ -371,6 +441,17 @@ int main()
 {
 	pthread_t thread;
 
+	/* Probe FSGSBASE */
+	sethandler(SIGILL, sigill, 0);
+	if (sigsetjmp(jmpbuf, 1) == 0) {
+		rdfsbase();
+		have_fsgsbase = true;
+		printf("\tFSGSBASE instructions are enabled\n");
+	} else {
+		printf("\tFSGSBASE instructions are disabled\n");
+	}
+	clearhandler(SIGILL);
+
 	sethandler(SIGSEGV, sigsegv, 0);
 
 	check_gs_value(0);
@@ -417,6 +498,21 @@ int main()
 
 	test_unexpected_base();
 
+	if (have_fsgsbase) {
+		unsigned short ss;
+
+		asm volatile ("mov %%ss, %0" : "=rm" (ss));
+
+		test_wrbase(0, 0);
+		test_wrbase(0, 1);
+		test_wrbase(0, 0x200000000);
+		test_wrbase(0, 0xffffffffffffffff);
+		test_wrbase(ss, 0);
+		test_wrbase(ss, 1);
+		test_wrbase(ss, 0x200000000);
+		test_wrbase(ss, 0xffffffffffffffff);
+	}
+
 	ftx = 3;  /* Kill the thread. */
 	syscall(SYS_futex, &ftx, FUTEX_WAKE, 0, NULL, NULL, 0);
 
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 11/13] x86/fsgsbase/64: Enable FSGSBASE by default and add a chicken bit
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (10 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 10/13] selftests/x86/fsgsbase: Test WRGSBASE Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 12/13] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 Chang S. Bae
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

From: Andy Lutomirski <luto@kernel.org>

Now that FSGSBASE is fully supported, remove unsafe_fsgsbase, enable
FSGSBASE by default, and add nofsgsbase to disable it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
---
 .../admin-guide/kernel-parameters.txt         |  3 +-
 arch/x86/kernel/cpu/common.c                  | 35 ++++++++-----------
 2 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 871260e3e832..20ab1ba22a3e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2760,8 +2760,7 @@
 	no5lvl		[X86-64] Disable 5-level paging mode. Forces
 			kernel to use 4-level paging instead.
 
-	unsafe_fsgsbase	[X86] Allow FSGSBASE instructions.  This will be
-			replaced with a nofsgsbase flag.
+	nofsgsbase	[X86] Disables FSGSBASE instructions.
 
 	no_console_suspend
 			[HW] Never suspend the console
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 6e2cba21328f..3d7d4ca1a29e 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -365,24 +365,21 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 	cr4_clear_bits(X86_CR4_UMIP);
 }
 
-/*
- * Temporary hack: FSGSBASE is unsafe until a few kernel code paths are
- * updated. This allows us to get the kernel ready incrementally. Setting
- * unsafe_fsgsbase and TAINT_INSECURE flags will allow the series to be
- * bisected if necessary.
- *
- * Once all the pieces are in place, these will go away and be replaced with
- * a nofsgsbase chicken flag.
- */
-static bool unsafe_fsgsbase;
-
-static __init int setup_unsafe_fsgsbase(char *arg)
+static __init int x86_nofsgsbase_setup(char *arg)
 {
-	unsafe_fsgsbase = true;
-	add_taint(TAINT_INSECURE, LOCKDEP_STILL_OK);
+	/* Require an exact match without trailing characters. */
+	if (strlen(arg))
+		return 0;
+
+	/* Do not emit a message if the feature is not present. */
+	if (!boot_cpu_has(X86_FEATURE_FSGSBASE))
+		return 1;
+
+	setup_clear_cpu_cap(X86_FEATURE_FSGSBASE);
+	pr_info("nofsgsbase: FSGSBASE disabled\n");
 	return 1;
 }
-__setup("unsafe_fsgsbase", setup_unsafe_fsgsbase);
+__setup("nofsgsbase", x86_nofsgsbase_setup);
 
 /*
  * Protection Keys are not available in 32-bit mode.
@@ -1372,12 +1369,8 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	setup_umip(c);
 
 	/* Enable FSGSBASE instructions if available. */
-	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
-		if (unsafe_fsgsbase)
-			cr4_set_bits(X86_CR4_FSGSBASE);
-		else
-			clear_cpu_cap(c, X86_FEATURE_FSGSBASE);
-	}
+	if (cpu_has(c, X86_FEATURE_FSGSBASE))
+		cr4_set_bits(X86_CR4_FSGSBASE);
 
 	/*
 	 * The vendor-specific functions might have changed features.
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 12/13] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (11 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 11/13] x86/fsgsbase/64: Enable FSGSBASE by default and add a chicken bit Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-01 20:53 ` [PATCH v5 13/13] x86/fsgsbase/64: Add documentation for FSGSBASE Chang S. Bae
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

From: Andi Kleen <ak@linux.intel.com>

The kernel needs to explicitly enable FSGSBASE. So, the application needs
to know if it can safely use these instructions. Just looking at the CPUID
bit is not enough because it may be running in a kernel that does not
enable the instructions.

One way for the application would be to just try and catch the SIGILL.
But that is difficult to do in libraries which may not want to overwrite
the signal handlers of the main application.
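
For reference, such a SIGILL probe looks roughly like the one used by the
selftest in patch 10 (a sketch; installing the handler for SIGILL is not
shown):

	#include <setjmp.h>
	#include <signal.h>
	#include <stdbool.h>

	static sigjmp_buf jmpbuf;

	static void sigill_handler(int sig, siginfo_t *si, void *ctx)
	{
		siglongjmp(jmpbuf, 1);
	}

	/* Returns true if RDFSBASE executes without raising #UD. */
	static bool probe_fsgsbase(void)
	{
		bool ok = false;

		if (sigsetjmp(jmpbuf, 1) == 0) {
			unsigned long base;

			asm volatile("rdfsbase %0" : "=r" (base) :: "memory");
			ok = true;
		}
		return ok;
	}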

So we need to provide a way for the application to discover the kernel
capability.

I used AT_HWCAP2 in the ELF aux vector, which is already used by PPC for
similar things. We define a new Linux-defined bitmap returned in AT_HWCAP2.
Next to MONITOR/MWAIT (bit 0), bit 1 is reserved for FSGSBASE capability
checks.

The application can then access it manually or using the getauxval()
function in newer glibc.
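
For example (a user-space sketch; the documentation added in patch 13
shows the same approach):

	#include <sys/auxv.h>
	#include <elf.h>

	#ifndef HWCAP2_FSGSBASE
	#define HWCAP2_FSGSBASE	(1 << 1)
	#endif

	int kernel_enables_fsgsbase(void)
	{
		return !!(getauxval(AT_HWCAP2) & HWCAP2_FSGSBASE);
	}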

[ chang: Rebase and edit the changelog accordingly. ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/uapi/asm/hwcap2.h | 3 +++
 arch/x86/kernel/cpu/common.c       | 4 +++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/uapi/asm/hwcap2.h b/arch/x86/include/uapi/asm/hwcap2.h
index 6ebaae90e207..c5ce54e749f6 100644
--- a/arch/x86/include/uapi/asm/hwcap2.h
+++ b/arch/x86/include/uapi/asm/hwcap2.h
@@ -5,4 +5,7 @@
 /* MONITOR/MWAIT enabled in Ring 3 */
 #define HWCAP2_RING3MWAIT		(1 << 0)
 
+/* Kernel allows FSGSBASE instructions available in Ring 3 */
+#define HWCAP2_FSGSBASE			(1 << 1)
+
 #endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 3d7d4ca1a29e..3bdac91316c9 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1369,8 +1369,10 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	setup_umip(c);
 
 	/* Enable FSGSBASE instructions if available. */
-	if (cpu_has(c, X86_FEATURE_FSGSBASE))
+	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
 		cr4_set_bits(X86_CR4_FSGSBASE);
+		elf_hwcap2 |= HWCAP2_FSGSBASE;
+	}
 
 	/*
 	 * The vendor-specific functions might have changed features.
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v5 13/13] x86/fsgsbase/64: Add documentation for FSGSBASE
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (12 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 12/13] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 Chang S. Bae
@ 2019-02-01 20:53 ` Chang S. Bae
  2019-02-01 23:02 ` [PATCH v5 00/13] x86: Enable FSGSBASE instructions Andi Kleen
  2019-02-02  2:43 ` Andy Lutomirski
  15 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2019-02-01 20:53 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen
  Cc: Markus T Metzger, Ravi Shankar, Chang S . Bae, LKML

From: Andi Kleen <ak@linux.intel.com>

v2: Minor updates to documentation requested in review.
v3: Update for new gcc and various improvements.

[ chang: Fix some typos. Fix the example code. ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
---
 Documentation/x86/fsgs.txt | 104 +++++++++++++++++++++++++++++++++++++
 1 file changed, 104 insertions(+)
 create mode 100644 Documentation/x86/fsgs.txt

diff --git a/Documentation/x86/fsgs.txt b/Documentation/x86/fsgs.txt
new file mode 100644
index 000000000000..7a973a5c1767
--- /dev/null
+++ b/Documentation/x86/fsgs.txt
@@ -0,0 +1,104 @@
+
+Using FS and GS prefixes on 64bit x86 Linux
+
+The x86 architecture supports segment prefixes per instruction to add an
+offset to an address.  On 64bit x86, these are mostly nops, except for FS
+and GS.
+
+This offers an efficient way to reference a global pointer.
+
+The compiler has to generate special code to use these base registers,
+or they can be accessed with inline assembler.
+
+	mov %gs:offset,%reg
+	mov %fs:offset,%reg
+
+In 64bit code, FS is used to address thread-local storage (TLS), declared using
+__thread.  The compiler then automatically generates the correct prefixes and
+relocations to access these values.
+
+FS is normally managed by the runtime code or the threading library.
+Overwriting it can break a lot of things (including syscalls and gdb),
+but it can make sense to save/restore it for threading purposes.
+
+GS is freely available, but may need special (compiler or inline assembler)
+code to use.
+
+Traditionally 64bit FS and GS could be set by the arch_prctl system call
+
+	arch_prctl(ARCH_SET_GS, value)
+	arch_prctl(ARCH_SET_FS, value)
+
+[There was also an older method using modify_ldt(), inherited from 32bit,
+but this is not discussed here.]
+
+However, using a syscall is problematic for user space threading libraries
+that want to context switch in user space; the whole point of them
+is avoiding the overhead of a syscall. It is also cleaner for compilers
+wanting to use the extra register to have instructions to write it,
+or to read it directly to compute addresses and offsets.
+
+Newer Intel CPUs (Ivy Bridge and later) added new instructions to directly
+access these registers quickly from user context
+
+	RDFSBASE %reg	read the FS base	(or _readfsbase_u64)
+	RDGSBASE %reg	read the GS base	(or _readgsbase_u64)
+
+	WRFSBASE %reg	write the FS base	(or _writefsbase_u64)
+	WRGSBASE %reg	write the GS base	(or _writegsbase_u64)
+
+To use the intrinsics, include <immintrin.h> and build with the -mfsgsbase option.
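+
+A minimal sketch using the intrinsics (this assumes a CPU and kernel with
+FSGSBASE enabled; see the detection code further below):
+
+	#include <immintrin.h>
+	#include <stdint.h>
+
+	/* Swap in a new GS base and return the previous one. */
+	unsigned long long set_gsbase(void *ptr)
+	{
+		unsigned long long old = _readgsbase_u64();
+
+		_writegsbase_u64((uintptr_t)ptr);
+		return old;
+	}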
+
+The instructions are supported by the CPU when the "fsgsbase" string is shown
+in /proc/cpuinfo (or can be detected directly through the CPUID instruction:
+leaf 7, subleaf 0, EBX bit 0; kernel feature word 9, bit 0).
+
+The instructions are only available to 64bit binaries.
+
+In addition, the kernel needs to explicitly enable these instructions, as it
+may otherwise not context switch their state correctly. Newer Linux kernels
+enable this. When the kernel does not enable the instructions, they will
+fault with a #UD exception.
+
+An FSGSBASE-enabled kernel can be detected by checking the AT_HWCAP2
+bitmask in the aux vector. When the HWCAP2_FSGSBASE bit is set, the
+kernel supports FSGSBASE.
+
+	#include <sys/auxv.h>
+	#include <elf.h>
+
+	/* Will eventually be in asm/hwcap.h */
+	#define HWCAP2_FSGSBASE        (1 << 1)
+
+	unsigned val = getauxval(AT_HWCAP2);
+	if (val & HWCAP2_FSGSBASE) {
+		asm("wrgsbase %0" :: "r" (ptr));
+	}
+
+No extra CPUID check is needed, as the kernel will not set this bit if the
+CPU does not support the instructions.
+
+gcc 6 and newer have special support to directly access data relative
+to fs/gs using the __seg_fs and __seg_gs address space pointer
+modifiers.
+
+#ifndef __SEG_GS
+#error "Need gcc 6 or later"
+#endif
+
+struct gsdata {
+	int a;
+	int b;
+} gsdata = { 1, 2 };
+
+int __seg_gs *valp = 0;		/* offset relative to GS */
+
+	/* Check if kernel supports FSGSBASE as above */
+
+	/* Set up new GS */
+	asm("wrgsbase %0" :: "r" (&gsdata));
+
+	/* Now the global pointer can be used normally */
+	printf("gsdata.a = %d\n", *valp);
+
+Andi Kleen
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 00/13] x86: Enable FSGSBASE instructions
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (13 preceding siblings ...)
  2019-02-01 20:53 ` [PATCH v5 13/13] x86/fsgsbase/64: Add documentation for FSGSBASE Chang S. Bae
@ 2019-02-01 23:02 ` Andi Kleen
  2019-02-02  2:43 ` Andy Lutomirski
  15 siblings, 0 replies; 29+ messages in thread
From: Andi Kleen @ 2019-02-01 23:02 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Markus T Metzger, Ravi Shankar, LKML


Patches all look good to me.

Reviewed-by: Andi Kleen <ak@linux.intel.com>

-Andi

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 01/13] taint: Introduce a new taint flag (insecure)
  2019-02-01 20:53 ` [PATCH v5 01/13] taint: Introduce a new taint flag (insecure) Chang S. Bae
@ 2019-02-02  2:42   ` Andy Lutomirski
  2019-02-05 21:21     ` Andrew Morton
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2019-02-02  2:42 UTC (permalink / raw)
  To: Chang S. Bae, Andrew Morton
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML

On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> For testing (or root-only) purposes, the new flag will serve to tag the
> kernel taint accurately.
>
> When adding a new feature support, patches need to be incrementally
> applied and tested with temporal parameters. Currently, there is no flag
> for this usage.

I think this should be reviewed by someone like akpm.  akpm, for
background, this is part of an x86 patch series.  If only part of the
series is applied, the kernel will be blatantly insecure (but still
functional and useful for testing and bisection), and this taint flag
will be set if this kernel is booted.  With the whole series applied,
there are no users of the taint flag in the kernel.

Do you think this is a good idea?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 00/13] x86: Enable FSGSBASE instructions
  2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
                   ` (14 preceding siblings ...)
  2019-02-01 23:02 ` [PATCH v5 00/13] x86: Enable FSGSBASE instructions Andi Kleen
@ 2019-02-02  2:43 ` Andy Lutomirski
  2019-02-05  6:26   ` hpa
  15 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2019-02-02  2:43 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML

Hi hpa-

A while back, you were working on some patches to make modify_ldt()
play better with this series.  What happened to them?

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 03/13] kbuild: Raise the minimum required binutils version to 2.21
  2019-02-01 20:53 ` [PATCH v5 03/13] kbuild: Raise the minimum required binutils version to 2.21 Chang S. Bae
@ 2019-02-02  2:45   ` Andy Lutomirski
  2019-02-05  0:08     ` Andrew Morton
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2019-02-02  2:45 UTC (permalink / raw)
  To: Chang S. Bae, Andrew Morton
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML, Linus Torvalds

On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> It helps to use some new instructions directly in inline assembly.

akpm, can you ack this patch?  AFAIK you are the only, or at least
most vocal, user of ancient userspace to build new kernels.  Are you
okay with this?

>
> Suggested-by: Andi Kleen <ak@linux.intel.com>
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: H. Peter Anvin <hpa@zytor.com>
> ---
>  Documentation/process/changes.rst | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/process/changes.rst b/Documentation/process/changes.rst
> index 18735dc460a0..0a18075c485e 100644
> --- a/Documentation/process/changes.rst
> +++ b/Documentation/process/changes.rst
> @@ -31,7 +31,7 @@ you probably needn't concern yourself with isdn4k-utils.
>  ====================== ===============  ========================================
>  GNU C                  4.6              gcc --version
>  GNU make               3.81             make --version
> -binutils               2.20             ld -v
> +binutils               2.21             ld -v
>  flex                   2.5.35           flex --version
>  bison                  2.0              bison --version
>  util-linux             2.10o            fdformat --version
> @@ -77,9 +77,7 @@ You will need GNU make 3.81 or later to build the kernel.
>  Binutils
>  --------
>
> -The build system has, as of 4.13, switched to using thin archives (`ar T`)
> -rather than incremental linking (`ld -r`) for built-in.a intermediate steps.
> -This requires binutils 2.20 or newer.
> +Binutils 2.21 or newer is needed to build the kernel.
>
>  pkg-config
>  ----------
> --
> 2.19.1
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions
  2019-02-01 20:53 ` [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Chang S. Bae
@ 2019-02-02  2:52   ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2019-02-02  2:52 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML

On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> From: Andi Kleen <ak@linux.intel.com>
>
> Add C intrinsics and assembler macros for the new FSBASE and GSBASE
> instructions.
>
> Very straightforward. Used in follow-on patches.
>
> [ luto: Rename the variables from FS and GS to FSBASE and GSBASE and
>   make <asm/fsgsbase.h> safe to include on 32-bit kernels. ]
>
> v2: Use __always_inline
>
> [ chang: Revise the changelog. Place them in <asm/fsgsbase.h>. Replace
>   the macros with GAS-compatible ones. ]
>
> If GCC supports it, we can add -mfsgsbase to CFLAGS and use the builtins
> here for extra performance

Does it really get better performance?  If so, let's do it.  If not,
let's remove the comment.  And, whatever you do, please put this above
the [luto] and [chang] parts.
>
> [ chang: Use FSGSBASE instructions directly. Removed GAS-compatible
>   macros as the minimum required binutils (v2.21) supports the FSGSBASE
>   instructions. ]

Can you stick the "v2" revision notes below the --- or even just
remove them?  It makes the changelog a lot harder to review and it's
not really useful in the git tree.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 05/13] x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions
  2019-02-01 20:53 ` [PATCH v5 05/13] x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions Chang S. Bae
@ 2019-02-02  2:57   ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2019-02-02  2:57 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML, Andrew Cooper

On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> The helper functions will switch on faster accesses to FSBASE and GSBASE
> when the FSGSBASE feature is enabled.
>
> Accessing user GSBASE needs a couple of SWAPGS operations. This is avoidable
> if the user GSBASE is saved at kernel entry, updated as it changes, and
> restored back at kernel exit. However, this seems to spend more cycles on
> saving and restoring it, and little or no benefit was measured in
> experiments.
>
> Also, introduce __{rd,wr}gsbase_inactive() as helpers to access user GSBASE
> with SWAPGS. Note, for Xen PV, paravirt hooks can be added, since it may
> allow a very efficient but different implementation.
>
> [ Use NOKPROBE_SYMBOL instead of __kprobes ]

^^^ This line looks like it should be deleted.

>
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  arch/x86/include/asm/fsgsbase.h | 27 +++++++-------
>  arch/x86/kernel/process_64.c    | 62 +++++++++++++++++++++++++++++++--
>  2 files changed, 72 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
> index fdd1177499b4..aefd53767a5d 100644
> --- a/arch/x86/include/asm/fsgsbase.h
> +++ b/arch/x86/include/asm/fsgsbase.h
> @@ -49,35 +49,32 @@ static __always_inline void wrgsbase(unsigned long gsbase)
>         asm volatile("wrgsbase %0" :: "r" (gsbase) : "memory");
>  }
>
> +#include <asm/cpufeature.h>
> +
>  /* Helper functions for reading/writing FS/GS base */
>
>  static inline unsigned long x86_fsbase_read_cpu(void)
>  {
>         unsigned long fsbase;
>
> -       rdmsrl(MSR_FS_BASE, fsbase);
> +       if (static_cpu_has(X86_FEATURE_FSGSBASE))
> +               fsbase = rdfsbase();
> +       else
> +               rdmsrl(MSR_FS_BASE, fsbase);
>
>         return fsbase;
>  }
>
> -static inline unsigned long x86_gsbase_read_cpu_inactive(void)
> -{
> -       unsigned long gsbase;
> -
> -       rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
> -
> -       return gsbase;
> -}
> -
>  static inline void x86_fsbase_write_cpu(unsigned long fsbase)
>  {
> -       wrmsrl(MSR_FS_BASE, fsbase);
> +       if (static_cpu_has(X86_FEATURE_FSGSBASE))
> +               wrfsbase(fsbase);
> +       else
> +               wrmsrl(MSR_FS_BASE, fsbase);
>  }
>
> -static inline void x86_gsbase_write_cpu_inactive(unsigned long gsbase)
> -{
> -       wrmsrl(MSR_KERNEL_GS_BASE, gsbase);
> -}
> +extern unsigned long x86_gsbase_read_cpu_inactive(void);
> +extern void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
>
>  #endif /* CONFIG_X86_64 */
>
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index 6a62f4af9fcf..ebc55ed31fe7 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -160,6 +160,42 @@ enum which_selector {
>         GS
>  };
>
> +/*
> + * Interrupts are disabled here. Out of line to be protected
> + * from kprobes. It is not used on Xen paravirt. When paravirt
> + * support is needed, it needs to be renamed with native_ prefix.
> + */
> +static noinline unsigned long __rdgsbase_inactive(void)
> +{
> +       unsigned long gsbase, flags;
> +
> +       local_irq_save(flags);
> +       native_swapgs();
> +       gsbase = rdgsbase();
> +       native_swapgs();
> +       local_irq_restore(flags);
> +
> +       return gsbase;
> +}
> +NOKPROBE_SYMBOL(__rdgsbase_inactive);
> +
> +/*
> + * Interrupts are disabled here. Out of line to be protected
> + * from kprobes. It is not used on Xen paravirt. When paravirt
> + * support is needed, it needs to be renamed with native_ prefix.
> + */
> +static noinline void __wrgsbase_inactive(unsigned long gsbase)
> +{
> +       unsigned long flags;
> +
> +       local_irq_save(flags);
> +       native_swapgs();
> +       wrgsbase(gsbase);
> +       native_swapgs();
> +       local_irq_restore(flags);
> +}
> +NOKPROBE_SYMBOL(__wrgsbase_inactive);
> +
>  /*
>   * Saves the FS or GS base for an outgoing thread if FSGSBASE extensions are
>   * not available.  The goal is to be reasonably fast on non-FSGSBASE systems.
> @@ -338,13 +374,34 @@ static unsigned long x86_fsgsbase_read_task(struct task_struct *task,
>         return base;
>  }
>
> +unsigned long x86_gsbase_read_cpu_inactive(void)
> +{
> +       unsigned long gsbase;
> +
> +       if (static_cpu_has(X86_FEATURE_FSGSBASE))
> +               gsbase = __rdgsbase_inactive();
> +       else
> +               rdmsrl(MSR_KERNEL_GS_BASE, gsbase);
> +
> +       return gsbase;
> +}
> +
> +void x86_gsbase_write_cpu_inactive(unsigned long gsbase)
> +{
> +       if (static_cpu_has(X86_FEATURE_FSGSBASE))
> +               __wrgsbase_inactive(gsbase);
> +       else
> +               wrmsrl(MSR_KERNEL_GS_BASE, gsbase);
> +}
> +
>  unsigned long x86_fsbase_read_task(struct task_struct *task)
>  {
>         unsigned long fsbase;
>
>         if (task == current)
>                 fsbase = x86_fsbase_read_cpu();
> -       else if (task->thread.fsindex == 0)
> +       else if (static_cpu_has(X86_FEATURE_FSGSBASE) ||
> +                (task->thread.fsindex == 0))
>                 fsbase = task->thread.fsbase;
>         else
>                 fsbase = x86_fsgsbase_read_task(task, task->thread.fsindex);
> @@ -358,7 +415,8 @@ unsigned long x86_gsbase_read_task(struct task_struct *task)
>
>         if (task == current)
>                 gsbase = x86_gsbase_read_cpu_inactive();
> -       else if (task->thread.gsindex == 0)
> +       else if (static_cpu_has(X86_FEATURE_FSGSBASE) ||
> +                (task->thread.gsindex == 0))
>                 gsbase = task->thread.gsbase;
>         else
>                 gsbase = x86_fsgsbase_read_task(task, task->thread.gsindex);

These last two hunks changes do not belong in this patch.  Presumably
they belong in patch 6.

--Andy



> --
> 2.19.1
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 08/13] x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro
  2019-02-01 20:53 ` [PATCH v5 08/13] x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro Chang S. Bae
@ 2019-02-02 17:17   ` Andy Lutomirski
  2019-02-13 18:46     ` Bae, Chang Seok
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2019-02-02 17:17 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML, Dave Hansen

On Fri, Feb 1, 2019 at 12:55 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> GSBASE is used to find per-CPU data in the kernel. But when it is unknown,
> the per-CPU base can be found from the per_cpu_offset table with a CPU NR.
> The CPU NR is extracted from the limit field of the CPUNODE entry in GDT,
> or by the RDPID instruction.
>
> Also, add the GAS-compatible RDPID macro.
>
> The new macro will be used in a following patch.
>
> Suggested-by: H. Peter Anvin <hpa@zytor.com>
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> ---
>  arch/x86/include/asm/fsgsbase.h | 46 +++++++++++++++++++++++++++++++++
>  arch/x86/include/asm/inst.h     | 15 +++++++++++
>  2 files changed, 61 insertions(+)
>
> diff --git a/arch/x86/include/asm/fsgsbase.h b/arch/x86/include/asm/fsgsbase.h
> index aefd53767a5d..eecca2250748 100644
> --- a/arch/x86/include/asm/fsgsbase.h
> +++ b/arch/x86/include/asm/fsgsbase.h
> @@ -78,6 +78,52 @@ extern void x86_gsbase_write_cpu_inactive(unsigned long gsbase);
>
>  #endif /* CONFIG_X86_64 */
>
> +#else /* __ASSEMBLY__ */
> +
> +#ifdef CONFIG_X86_64
> +
> +#include <asm/inst.h>
> +
> +#if CONFIG_SMP

ifdef?

> +
> +/*
> + * CPU/node NR is loaded from the limit (size) field of a special segment
> + * descriptor entry in GDT.
> + */
> +.macro LOAD_CPU_AND_NODE_SEG_LIMIT reg:req
> +       movq    $__CPUNODE_SEG, \reg
> +       lsl     \reg, \reg
> +.endm
> +

Please put the alternative in here instead of in FIND_PERCPU_BASE.

> +/*
> + * Fetch the per-CPU GSBASE value for this processor and put it in @reg.
> + * We normally use %gs for accessing per-CPU data, but we are setting up
> + * %gs here and obviously can not use %gs itself to access per-CPU data.
> + */
> +.macro FIND_PERCPU_BASE reg:req
> +       /*
> +        * The CPU/node NR is initialized earlier, directly in cpu_init().
> +        * The CPU NR is extracted from it.
> +        */

This comment is unnecessary.

> +       ALTERNATIVE \
> +               "LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
> +               "RDPID  \reg", \
> +               X86_FEATURE_RDPID
> +       andq    $VDSO_CPUNODE_MASK, \reg
> +       movq    __per_cpu_offset(, \reg, 8), \reg
> +.endm
> +
> +#else
> +
> +.macro FIND_PERCPU_BASE reg:req
> +       /* Tracking the base offset value */

I don't understand this comment at all.  Please just remove it.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 07/13] x86/fsgsbase/64: When copying a thread, use the FSGSBASE instructions if available
  2019-02-01 20:53 ` [PATCH v5 07/13] x86/fsgsbase/64: When copying a thread, use the FSGSBASE instructions if available Chang S. Bae
@ 2019-02-02 17:28   ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2019-02-02 17:28 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML

On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> Copy real FS/GSBASE values instead of approximation when FSGSBASE is
> enabled.
>
> Factoring out to save_fsgs() does not result in the same behavior because
> save_base_legacy() does not copy FS/GSBASE when the index is zero.
>
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Andi Kleen <ak@linux.intel.com>
> ---
>  arch/x86/kernel/process_64.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index d8ade9530fdb..648e43b58c69 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -477,10 +477,16 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
>         p->thread.sp = (unsigned long) fork_frame;
>         p->thread.io_bitmap_ptr = NULL;
>
> -       savesegment(gs, p->thread.gsindex);
> -       p->thread.gsbase = p->thread.gsindex ? 0 : me->thread.gsbase;
>         savesegment(fs, p->thread.fsindex);
> -       p->thread.fsbase = p->thread.fsindex ? 0 : me->thread.fsbase;
> +       savesegment(gs, p->thread.gsindex);
> +       if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
> +               p->thread.fsbase = rdfsbase();
> +               p->thread.gsbase = __rdgsbase_inactive();
> +       } else {
> +               /* save_base_legacy() does not set base when index is zero. */

After looking at this a bit, I propose that we just clean this up all
the way.  Can't this whole mess be changed to:

save_fsgs(me);
p->thread.fsindex = me->thread.fsindex;
p->thread.fsbase = me->thread.fsbase;
p->thread.gsindex = me->thread.gsindex;
p->thread.gsbase = me->thread.gsbase;

This will avoid all of the horrible tracing through the logic to
figure out why the code is correct.

Sure, it'll be a few cycles slower with FSGSBASE, but this isn't
really a fast path, and if we ever really care, we can optimize it
later.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 03/13] kbuild: Raise the minimum required binutils version to 2.21
  2019-02-02  2:45   ` Andy Lutomirski
@ 2019-02-05  0:08     ` Andrew Morton
  0 siblings, 0 replies; 29+ messages in thread
From: Andrew Morton @ 2019-02-05  0:08 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML, Linus Torvalds

On Fri, 1 Feb 2019 18:45:44 -0800 Andy Lutomirski <luto@kernel.org> wrote:

> On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
> >
> > It helps to use some new instructions directly in inline assembly.
> 
> akpm, can you ack this patch?  AFAIK you are the only, or at least
> most vocal, user of ancient userspace to build new kernels.  Are you
> okay with this?

Acked-by: Andrew Morton <akpm@linux-foundation.org>

That little initiative got beaten into submission :(

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 00/13] x86: Enable FSGSBASE instructions
  2019-02-02  2:43 ` Andy Lutomirski
@ 2019-02-05  6:26   ` hpa
  0 siblings, 0 replies; 29+ messages in thread
From: hpa @ 2019-02-05  6:26 UTC (permalink / raw)
  To: Andy Lutomirski, Chang S. Bae
  Cc: Thomas Gleixner, Ingo Molnar, Andi Kleen, Markus T Metzger,
	Ravi Shankar, LKML

On February 1, 2019 6:43:25 PM PST, Andy Lutomirski <luto@kernel.org> wrote:
>Hi hpa-
>
>A while back, you were working on some patches to make modify_ldt()
>play better with this series.  What happened to them?
>
>--Andy

Looks like I need to dig them out...
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 01/13] taint: Introduce a new taint flag (insecure)
  2019-02-02  2:42   ` Andy Lutomirski
@ 2019-02-05 21:21     ` Andrew Morton
  2019-02-05 22:46       ` Randy Dunlap
  0 siblings, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2019-02-05 21:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML

On Fri, 1 Feb 2019 18:42:29 -0800 Andy Lutomirski <luto@kernel.org> wrote:

> On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
> >
> > For testing (or root-only) purposes, the new flag will serve to tag the
> > kernel taint accurately.
> >
> > When adding a new feature support, patches need to be incrementally
> > applied and tested with temporal parameters. Currently, there is no flag
> > for this usage.
> 
> I think this should be reviewed by someone like akpm.  akpm, for
> background, this is part of an x86 patch series.  If only part of the
> series is applied, the kernel will be blatantly insecure (but still
> functional and useful for testing and bisection), and this taint flag
> will be set if this kernel is booted.  With the whole series applied,
> there are no users of the taint flag in the kernel.
> 
> Do you think this is a good idea?

What does "temporal parameters" mean?  A complete description of this
testing process would help.

It sounds a bit strange.  You mean it assumes that people will partially
apply the series to test its functionality?  That would be inconvenient.

- Can the new and now-unused taint flag be removed again at
  end-of-series?

- It would be a lot more convenient if we had some means of testing
  after the whole series is applied, on a permanent basis - some
  debugfs flag, perhaps?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 01/13] taint: Introduce a new taint flag (insecure)
  2019-02-05 21:21     ` Andrew Morton
@ 2019-02-05 22:46       ` Randy Dunlap
  2019-02-05 23:07         ` hpa
  0 siblings, 1 reply; 29+ messages in thread
From: Randy Dunlap @ 2019-02-05 22:46 UTC (permalink / raw)
  To: Andrew Morton, Andy Lutomirski
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, H . Peter Anvin,
	Andi Kleen, Markus T Metzger, Ravi Shankar, LKML

On 2/5/19 1:21 PM, Andrew Morton wrote:
> On Fri, 1 Feb 2019 18:42:29 -0800 Andy Lutomirski <luto@kernel.org> wrote:
> 
>> On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>>>
>>> For testing (or root-only) purposes, the new flag will serve to tag the
>>> kernel taint accurately.
>>>
>>> When adding a new feature support, patches need to be incrementally
>>> applied and tested with temporal parameters. Currently, there is no flag
>>> for this usage.
>>
>> I think this should be reviewed by someone like akpm.  akpm, for
>> background, this is part of an x86 patch series.  If only part of the
>> series is applied, the kernel will be blatantly insecure (but still
>> functional and useful for testing and bisection), and this taint flag
>> will be set if this kernel is booted.  With the whole series applied,
>> there are no users of the taint flag in the kernel.
>>
>> Do you think this is a good idea?
> 
> What does "temporal parameters" mean?  A complete description of this
> testing process would help.
> 
> It sounds a bit strange.  You mean it assumes that people will partially
> apply the series to test its functionality?  That would be inconvenient.

Ack.  I don't think we need to (or should) worry about that kind of
muckup.

> - Can the new and now-unused taint flag be removed again at
>   end-of-series?
> 
> - It would be a lot more convenient if we had some means of testing
>   after the whole series is applied, on a permanent basis - some
>   debugfs flag, perhaps?
> 


-- 
~Randy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 01/13] taint: Introduce a new taint flag (insecure)
  2019-02-05 22:46       ` Randy Dunlap
@ 2019-02-05 23:07         ` hpa
  0 siblings, 0 replies; 29+ messages in thread
From: hpa @ 2019-02-05 23:07 UTC (permalink / raw)
  To: Randy Dunlap, Andrew Morton, Andy Lutomirski
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Andi Kleen,
	Markus T Metzger, Ravi Shankar, LKML

On February 5, 2019 2:46:11 PM PST, Randy Dunlap <rdunlap@infradead.org> wrote:
>On 2/5/19 1:21 PM, Andrew Morton wrote:
>> On Fri, 1 Feb 2019 18:42:29 -0800 Andy Lutomirski <luto@kernel.org>
>wrote:
>> 
>>> On Fri, Feb 1, 2019 at 12:54 PM Chang S. Bae
><chang.seok.bae@intel.com> wrote:
>>>>
>>>> For testing (or root-only) purposes, the new flag will serve to tag
>the
>>>> kernel taint accurately.
>>>>
>>>> When adding a new feature support, patches need to be incrementally
>>>> applied and tested with temporal parameters. Currently, there is no
>flag
>>>> for this usage.
>>>
>>> I think this should be reviewed by someone like akpm.  akpm, for
>>> background, this is part of an x86 patch series.  If only part of
>the
>>> series is applied, the kernel will be blatantly insecure (but still
>>> functional and useful for testing and bisection), and this taint
>flag
>>> will be set if this kernel is booted.  With the whole series
>applied,
>>> there are no users of the taint flag in the kernel.
>>>
>>> Do you think this is a good idea?
>> 
>> What does "temporal parameters" mean?  A complete description of this
>> testing process would help.
>> 
>> It sounds a bit strange.  You mean it assumes that people will
>partially
>> apply the series to test its functionality?  That would be
>inconvenient.
>
>Ack.  I don't think we need to (or should) worry about that kind of
>muckup.
>
>> - Can the new and now-unused taint flag be removed again at
>>   end-of-series?
>> 
>> - It would be a lot more convenient if we had some means of testing
>>   after the whole series is applied, on a permanent basis - some
>>   debugfs flag, perhaps?
>> 

I would like to see this taint flag, though, because sometimes it is useful to write test modules (e.g. when I was testing SMAP) which are dangerous even if out of tree.

In case an escape or pilot error gets it into the wrong kernel, it is a very good thing to have the kernel flagged.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v5 08/13] x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro
  2019-02-02 17:17   ` Andy Lutomirski
@ 2019-02-13 18:46     ` Bae, Chang Seok
  0 siblings, 0 replies; 29+ messages in thread
From: Bae, Chang Seok @ 2019-02-13 18:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, Andi Kleen,
	Metzger, Markus T, Shankar, Ravi V, LKML, Dave Hansen


> On Feb 2, 2019, at 09:17, Andy Lutomirski <luto@kernel.org> wrote:
> 
> On Fri, Feb 1, 2019 at 12:55 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>> 
>> +
>> +/*
>> + * CPU/node NR is loaded from the limit (size) field of a special segment
>> + * descriptor entry in GDT.
>> + */
>> +.macro LOAD_CPU_AND_NODE_SEG_LIMIT reg:req
>> +       movq    $__CPUNODE_SEG, \reg
>> +       lsl     \reg, \reg
>> +.endm
>> +
> 
> Please put the alternative in here instead of in FIND_PERCPU_BASE.

I would like to apply your comment, though build errors will come up due to
__CPUNODE_SEG. So I'm still hoping the ALTERNATIVE line below is straightforward
enough. Will send out a revision that reflects all other comments shortly.

Chang

> 
>> +/*
>> + * Fetch the per-CPU GSBASE value for this processor and put it in @reg.
>> + * We normally use %gs for accessing per-CPU data, but we are setting up
>> + * %gs here and obviously can not use %gs itself to access per-CPU data.
>> + */
>> +.macro FIND_PERCPU_BASE reg:req
>> 
>> +       ALTERNATIVE \
>> +               "LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
>> +               "RDPID  \reg", \
>> +               X86_FEATURE_RDPID
>> +       andq    $VDSO_CPUNODE_MASK, \reg
>> +       movq    __per_cpu_offset(, \reg, 8), \reg
>> +.endm

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2019-02-13 18:47 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-01 20:53 [PATCH v5 00/13] x86: Enable FSGSBASE instructions Chang S. Bae
2019-02-01 20:53 ` [PATCH v5 01/13] taint: Introduce a new taint flag (insecure) Chang S. Bae
2019-02-02  2:42   ` Andy Lutomirski
2019-02-05 21:21     ` Andrew Morton
2019-02-05 22:46       ` Randy Dunlap
2019-02-05 23:07         ` hpa
2019-02-01 20:53 ` [PATCH v5 02/13] x86/fsgsbase/64: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE Chang S. Bae
2019-02-01 20:53 ` [PATCH v5 03/13] kbuild: Raise the minimum required binutils version to 2.21 Chang S. Bae
2019-02-02  2:45   ` Andy Lutomirski
2019-02-05  0:08     ` Andrew Morton
2019-02-01 20:53 ` [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions Chang S. Bae
2019-02-02  2:52   ` Andy Lutomirski
2019-02-01 20:53 ` [PATCH v5 04/13] x86/fsgsbase/64: Add intrinsics/macros " Chang S. Bae
2019-02-01 20:53 ` [PATCH v5 05/13] x86/fsgsbase/64: Enable FSGSBASE instructions in the helper functions Chang S. Bae
2019-02-02  2:57   ` Andy Lutomirski
2019-02-01 20:53 ` [PATCH v5 06/13] x86/fsgsbase/64: Preserve FS/GS state in __switch_to() if FSGSBASE is on Chang S. Bae
2019-02-01 20:53 ` [PATCH v5 07/13] x86/fsgsbase/64: When copying a thread, use the FSGSBASE instructions if available Chang S. Bae
2019-02-02 17:28   ` Andy Lutomirski
2019-02-01 20:53 ` [PATCH v5 08/13] x86/fsgsbase/64: Introduce the FIND_PERCPU_BASE macro Chang S. Bae
2019-02-02 17:17   ` Andy Lutomirski
2019-02-13 18:46     ` Bae, Chang Seok
2019-02-01 20:53 ` [PATCH v5 09/13] x86/fsgsbase/64: Use the per-CPU base as GSBASE at the paranoid_entry Chang S. Bae
2019-02-01 20:53 ` [PATCH v5 10/13] selftests/x86/fsgsbase: Test WRGSBASE Chang S. Bae
2019-02-01 20:53 ` [PATCH v5 11/13] x86/fsgsbase/64: Enable FSGSBASE by default and add a chicken bit Chang S. Bae
2019-02-01 20:53 ` [PATCH v5 12/13] x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 Chang S. Bae
2019-02-01 20:53 ` [PATCH v5 13/13] x86/fsgsbase/64: Add documentation for FSGSBASE Chang S. Bae
2019-02-01 23:02 ` [PATCH v5 00/13] x86: Enable FSGSBASE instructions Andi Kleen
2019-02-02  2:43 ` Andy Lutomirski
2019-02-05  6:26   ` hpa
