All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing
@ 2021-06-23 12:01 Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 01/65] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
                   ` (65 more replies)
  0 siblings, 66 replies; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The main parts of this series are:

  - Simplification and removal/replacement of redundant and/or
    overengineered code.

  - Name space cleanup as the existing names were just a permanent source
    of confusion.

  - Clear seperation of user ABI and kernel internal state handling.

  - Removal of PKRU from being XSTATE managed in the kernel because PKRU
    has to be eagerly restored on context switch and keeping it in sync
    in the xstate buffer is just pointless overhead and fragile.

    The kernel still XSAVEs PKRU on context switch but the value in the
    buffer is not longer used and never restored from the buffer.

    This still needs to be cleaned up, but the series is already 40+
    patches large and the cleanup of this is not a functional problem.

    The functional issues of PKRU management are fully addressed with the
    series as is.

  - Cleanup of fpu signal restore

    - Make the fast path self contained. Handle #PF directly and skip
      the slow path on any other exception as that will just end up
      with the same result that the frame is invalid. This allows
      the compiler to optimize the slow path out for 64bit kernels
      w/o ia32 emulation.

    - Reduce code duplication and unnecessary operations

It applies on top of

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master

and is also available via git:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu

This is a follow up to V3 which can be found here:

     https://lore.kernel.org/r/20210618141823.161158090@linutronix.de

Changes vs. V3:

  - Dropped the two bugfixes which are applied already and rebased on top

  - Addressed review comments (Andy, Boris)

    Patches: 13, 35, 36, 37, 46, 58, 62, 63

  - Fixed the math-emu fallout which I had stashed safely on the 32bit
    testbox (Boris)

    Patch: 28

  - Picked up tags

Thanks to everyone for review, feedback and testing (various teams
@Intel).

Note: I've not picked up any tested-by tags. It would be nice to have
them on this hopefully final version.

Thanks,

	tglx
---
 arch/x86/events/intel/lbr.c          |    6 
 arch/x86/include/asm/fpu/internal.h  |  202 ++++------
 arch/x86/include/asm/fpu/xstate.h    |   78 +++-
 arch/x86/include/asm/pgtable.h       |   57 ---
 arch/x86/include/asm/pkeys.h         |    9 
 arch/x86/include/asm/pkru.h          |   62 +++
 arch/x86/include/asm/processor.h     |    9 
 arch/x86/include/asm/special_insns.h |   14 
 arch/x86/kernel/cpu/common.c         |   34 -
 arch/x86/kernel/fpu/core.c           |  282 +++++++--------
 arch/x86/kernel/fpu/init.c           |   15 
 arch/x86/kernel/fpu/regset.c         |  223 ++++++------
 arch/x86/kernel/fpu/signal.c         |  419 ++++++++++------------
 arch/x86/kernel/fpu/xstate.c         |  645 +++++++++++++----------------------
 arch/x86/kernel/process.c            |   22 +
 arch/x86/kernel/process_64.c         |   28 +
 arch/x86/kernel/traps.c              |    5 
 arch/x86/kvm/svm/sev.c               |    1 
 arch/x86/kvm/x86.c                   |   56 +--
 arch/x86/math-emu/fpu_proto.h        |    2 
 arch/x86/math-emu/load_store.c       |    2 
 arch/x86/math-emu/reg_ld_str.c       |    2 
 arch/x86/mm/extable.c                |    2 
 arch/x86/mm/fault.c                  |    2 
 arch/x86/mm/pkeys.c                  |   22 -
 include/linux/pkeys.h                |    4 
 26 files changed, 1037 insertions(+), 1166 deletions(-)



^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 01/65] x86/fpu: Fix copy_xstate_to_kernel() gap handling
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 02/65] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
                   ` (64 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The gap handling in copy_xstate_to_kernel() is wrong when XSAVES is in use.

Using init_fpstate for copying the init state of features which are
not set in the xstate header is only correct for the legacy area, but
not for the extended features area because when XSAVES is in use then
init_fpstate is in compacted form which means the xstate offsets which
are used to copy from init_fpstate are not valid.

Fortunately this is not a real problem today because all extended
features in use have an all zeros init state, but it is wrong
nevertheless and with a potentially dynamically sized init_fpstate
this would result in access outside of the init_fpstate.

Fix this by keeping track of the last copied state in the target buffer and
explicitly zero it when there is a feature or alignment gap.

Use the compacted offset when accessing the extended feature space in
init_fpstate.

As this is not a functional issue on older kernels this is intentionally
not tagged for stable.

Fixes: b8be15d58806 ("x86/fpu/xstate: Re-enable XSAVES")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Remove the AVX/SEE thinko
    Fix comments (Boris)
V2: New patch
---
 arch/x86/kernel/fpu/xstate.c |  105 ++++++++++++++++++++++++-------------------
 1 file changed, 61 insertions(+), 44 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1084,20 +1084,10 @@ static inline bool xfeatures_mxcsr_quirk
 	return true;
 }
 
-static void fill_gap(struct membuf *to, unsigned *last, unsigned offset)
+static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
+			 void *init_xstate, unsigned int size)
 {
-	if (*last >= offset)
-		return;
-	membuf_write(to, (void *)&init_fpstate.xsave + *last, offset - *last);
-	*last = offset;
-}
-
-static void copy_part(struct membuf *to, unsigned *last, unsigned offset,
-		      unsigned size, void *from)
-{
-	fill_gap(to, last, offset);
-	membuf_write(to, from, size);
-	*last = offset + size;
+	membuf_write(to, from_xstate ? xstate : init_xstate, size);
 }
 
 /*
@@ -1109,10 +1099,10 @@ static void copy_part(struct membuf *to,
  */
 void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 {
+	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
+	struct xregs_state *xinit = &init_fpstate.xsave;
 	struct xstate_header header;
-	const unsigned off_mxcsr = offsetof(struct fxregs_state, mxcsr);
-	unsigned size = to.left;
-	unsigned last = 0;
+	unsigned int zerofrom;
 	int i;
 
 	/*
@@ -1122,41 +1112,68 @@ void copy_xstate_to_kernel(struct membuf
 	header.xfeatures = xsave->header.xfeatures;
 	header.xfeatures &= xfeatures_mask_user();
 
-	if (header.xfeatures & XFEATURE_MASK_FP)
-		copy_part(&to, &last, 0, off_mxcsr, &xsave->i387);
-	if (header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM))
-		copy_part(&to, &last, off_mxcsr,
-			  MXCSR_AND_FLAGS_SIZE, &xsave->i387.mxcsr);
-	if (header.xfeatures & XFEATURE_MASK_FP)
-		copy_part(&to, &last, offsetof(struct fxregs_state, st_space),
-			  128, &xsave->i387.st_space);
-	if (header.xfeatures & XFEATURE_MASK_SSE)
-		copy_part(&to, &last, xstate_offsets[XFEATURE_SSE],
-			  256, &xsave->i387.xmm_space);
-	/*
-	 * Fill xsave->i387.sw_reserved value for ptrace frame:
-	 */
-	copy_part(&to, &last, offsetof(struct fxregs_state, sw_reserved),
-		  48, xstate_fx_sw_bytes);
-	/*
-	 * Copy xregs_state->header:
-	 */
-	copy_part(&to, &last, offsetof(struct xregs_state, header),
-		  sizeof(header), &header);
+	/* Copy FP state up to MXCSR */
+	copy_feature(header.xfeatures & XFEATURE_MASK_FP, &to, &xsave->i387,
+		     &xinit->i387, off_mxcsr);
+
+	/* Copy MXCSR when SSE or YMM are set in the feature mask */
+	copy_feature(header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM),
+		     &to, &xsave->i387.mxcsr, &xinit->i387.mxcsr,
+		     MXCSR_AND_FLAGS_SIZE);
+
+	/* Copy the remaining FP state */
+	copy_feature(header.xfeatures & XFEATURE_MASK_FP,
+		     &to, &xsave->i387.st_space, &xinit->i387.st_space,
+		     sizeof(xsave->i387.st_space));
+
+	/* Copy the SSE state - shared with YMM, but independently managed */
+	copy_feature(header.xfeatures & XFEATURE_MASK_SSE,
+		     &to, &xsave->i387.xmm_space, &xinit->i387.xmm_space,
+		     sizeof(xsave->i387.xmm_space));
+
+	/* Zero the padding area */
+	membuf_zero(&to, sizeof(xsave->i387.padding));
+
+	/* Copy xsave->i387.sw_reserved */
+	membuf_write(&to, xstate_fx_sw_bytes, sizeof(xsave->i387.sw_reserved));
+
+	/* Copy the user space relevant state of @xsave->header */
+	membuf_write(&to, &header, sizeof(header));
+
+	zerofrom = offsetof(struct xregs_state, extended_state_area);
 
 	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
 		/*
-		 * Copy only in-use xstates:
+		 * The ptrace buffer is in non-compacted XSAVE format.
+		 * In non-compacted format disabled features still occupy
+		 * state space, but there is no state to copy from in the
+		 * compacted init_fpstate. The gap tracking will zero this
+		 * later.
+		 */
+		if (!(xfeatures_mask_user() & BIT_ULL(i)))
+			continue;
+
+		/*
+		 * If there was a feature or alignment gap, zero the space
+		 * in the destination buffer.
 		 */
-		if ((header.xfeatures >> i) & 1) {
-			void *src = __raw_xsave_addr(xsave, i);
+		if (zerofrom < xstate_offsets[i])
+			membuf_zero(&to, xstate_offsets[i] - zerofrom);
 
-			copy_part(&to, &last, xstate_offsets[i],
-				  xstate_sizes[i], src);
-		}
+		copy_feature(header.xfeatures & BIT_ULL(i), &to,
+			     __raw_xsave_addr(xsave, i),
+			     __raw_xsave_addr(xinit, i),
+			     xstate_sizes[i]);
 
+		/*
+		 * Keep track of the last copied state in the non-compacted
+		 * target buffer for gap zeroing.
+		 */
+		zerofrom = xstate_offsets[i] + xstate_sizes[i];
 	}
-	fill_gap(&to, &last, size);
+
+	if (to.left)
+		membuf_zero(&to, to.left);
 }
 
 /*


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 02/65] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 01/65] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 03/65] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
                   ` (63 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

This cannot work and it's unclear how that ever made a difference.

init_fpstate.xsave.header.xfeatures is always 0 so get_xsave_addr() will
always return a NULL pointer, which will prevent storing the default PKRU
value in initfp_state.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V2: Fix subject
---
 arch/x86/kernel/cpu/common.c |    5 -----
 arch/x86/mm/pkeys.c          |    6 ------
 2 files changed, 11 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,8 +466,6 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
-	struct pkru_state *pk;
-
 	/* check the boot processor, plus compile options for PKU: */
 	if (!cpu_feature_enabled(X86_FEATURE_PKU))
 		return;
@@ -478,9 +476,6 @@ static __always_inline void setup_pku(st
 		return;
 
 	cr4_set_bits(X86_CR4_PKE);
-	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
-	if (pk)
-		pk->pkru = init_pkru_value;
 	/*
 	 * Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
 	 * cpuid bit to be set.  We need to ensure that we
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,7 +10,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/fpu/internal.h>		/* init_fpstate			*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -154,7 +153,6 @@ static ssize_t init_pkru_read_file(struc
 static ssize_t init_pkru_write_file(struct file *file,
 		 const char __user *user_buf, size_t count, loff_t *ppos)
 {
-	struct pkru_state *pk;
 	char buf[32];
 	ssize_t len;
 	u32 new_init_pkru;
@@ -177,10 +175,6 @@ static ssize_t init_pkru_write_file(stru
 		return -EINVAL;
 
 	WRITE_ONCE(init_pkru_value, new_init_pkru);
-	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
-	if (!pk)
-		return -EINVAL;
-	pk->pkru = new_init_pkru;
 	return count;
 }
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 03/65] x86/fpu: Mark various FPU states __ro_after_init
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 01/65] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 02/65] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] x86/fpu: Mark various FPU state variables __ro_after_init tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 04/65] x86/fpu: Make xfeatures_mask_all __ro_after_init Thomas Gleixner
                   ` (62 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Nothing modifies these after booting.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
---
V3: Remove xfeatures_mask_all as that needs more cleanups
---
 arch/x86/kernel/fpu/init.c   |    4 ++--
 arch/x86/kernel/fpu/xstate.c |   14 +++++++++-----
 2 files changed, 11 insertions(+), 7 deletions(-)

--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -89,7 +89,7 @@ static void fpu__init_system_early_gener
 /*
  * Boot time FPU feature detection code:
  */
-unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
+unsigned int mxcsr_feature_mask __ro_after_init = 0xffffffffu;
 EXPORT_SYMBOL_GPL(mxcsr_feature_mask);
 
 static void __init fpu__init_system_mxcsr(void)
@@ -135,7 +135,7 @@ static void __init fpu__init_system_gene
  * This is inherent to the XSAVE architecture which puts all state
  * components into a single, continuous memory block:
  */
-unsigned int fpu_kernel_xstate_size;
+unsigned int fpu_kernel_xstate_size __ro_after_init;
 EXPORT_SYMBOL_GPL(fpu_kernel_xstate_size);
 
 /* Get alignment of the TYPE. */
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -61,17 +61,21 @@ static short xsave_cpuid_features[] __in
  */
 u64 xfeatures_mask_all __read_mostly;
 
-static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 
 /*
  * The XSAVE area of kernel can be in standard or compacted format;
  * it is always in standard format for user mode. This is the user
  * mode standard format size used for signal and ptrace frames.
  */
-unsigned int fpu_user_xstate_size;
+unsigned int fpu_user_xstate_size __ro_after_init;
 
 /*
  * Return whether the system supports a given xfeature.


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 04/65] x86/fpu: Make xfeatures_mask_all __ro_after_init
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (2 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 03/65] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 05/65] x86/fpu: Get rid of fpu__get_supported_xfeatures_mask() Thomas Gleixner
                   ` (61 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Nothing has to modify this after init.

But of course there is code which unconditionallqy masks xfeatures_mask_all
on CPU hotplug. This goes unnoticed during boot hotplug because at that
point the variable is still RW mapped.

This is broken in several ways:

  1) Masking this in post init CPU hotplug means that any
     modification of this state goes unnoticed until actual hotplug
     happens.

  2) If that ever happens then these bogus feature bits are already
     populated all over the place and the system is in inconsistent state
     vs. the compacted XSTATE offsets. If at all then this has to panic the
     machine because the inconsistency cannot be undone anymore.

Make this a one time paranoia check in xstate init code and disable xsave
when this happens.

Reported-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: New patch. Addresses the hotplug crash reported by Kan
---
 arch/x86/kernel/fpu/xstate.c |   28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -59,7 +59,7 @@ static short xsave_cpuid_features[] __in
  * This represents the full set of bits that should ever be set in a kernel
  * XSAVE buffer, both supervisor and user xstates.
  */
-u64 xfeatures_mask_all __read_mostly;
+u64 xfeatures_mask_all __ro_after_init;
 
 static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
@@ -213,19 +213,8 @@ void fpstate_sanitize_xstate(struct fpu
  */
 void fpu__init_cpu_xstate(void)
 {
-	u64 unsup_bits;
-
 	if (!boot_cpu_has(X86_FEATURE_XSAVE) || !xfeatures_mask_all)
 		return;
-	/*
-	 * Unsupported supervisor xstates should not be found in
-	 * the xfeatures mask.
-	 */
-	unsup_bits = xfeatures_mask_all & XFEATURE_MASK_SUPERVISOR_UNSUPPORTED;
-	WARN_ONCE(unsup_bits, "x86/fpu: Found unsupported supervisor xstates: 0x%llx\n",
-		  unsup_bits);
-
-	xfeatures_mask_all &= ~XFEATURE_MASK_SUPERVISOR_UNSUPPORTED;
 
 	cr4_set_bits(X86_CR4_OSXSAVE);
 
@@ -825,6 +814,7 @@ void __init fpu__init_system_xstate(void
 {
 	unsigned int eax, ebx, ecx, edx;
 	static int on_boot_cpu __initdata = 1;
+	u64 xfeatures;
 	int err;
 	int i;
 
@@ -879,6 +869,8 @@ void __init fpu__init_system_xstate(void
 	}
 
 	xfeatures_mask_all &= fpu__get_supported_xfeatures_mask();
+	/* Store it for paranoia check at the end */
+	xfeatures = xfeatures_mask_all;
 
 	/* Enable xstate instructions to be able to continue with initialization: */
 	fpu__init_cpu_xstate();
@@ -896,8 +888,18 @@ void __init fpu__init_system_xstate(void
 	setup_init_fpu_buf();
 	setup_xstate_comp_offsets();
 	setup_supervisor_only_offsets();
-	print_xstate_offset_size();
 
+	/*
+	 * Paranoia check whether something in the setup modified the
+	 * xfeatures mask.
+	 */
+	if (xfeatures != xfeatures_mask_all) {
+		pr_err("x86/fpu: xfeatures modified from 0x%016llx to 0x%016llx during init, disabling XSAVE\n",
+		       xfeatures, xfeatures_mask_all);
+		goto out_disable;
+	}
+
+	print_xstate_offset_size();
 	pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n",
 		xfeatures_mask_all,
 		fpu_kernel_xstate_size,


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 05/65] x86/fpu: Get rid of fpu__get_supported_xfeatures_mask()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (3 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 04/65] x86/fpu: Make xfeatures_mask_all __ro_after_init Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 06/65] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
                   ` (60 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

This function is really not doing what the comment advertises:

 "Find supported xfeatures based on cpu features and command-line input.
  This must be called after fpu__init_parse_early_param() is called and
  xfeatures_mask is enumerated."

fpu__init_parse_early_param() does not exist anymore and the function just
returns a constant.

Remove it and fix the caller and get rid of further references to
fpu__init_parse_early_param().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: New patch. Noticed when staring at the hotplug trainwreck.
---
 arch/x86/include/asm/fpu/internal.h |    1 -
 arch/x86/kernel/cpu/common.c        |    5 ++---
 arch/x86/kernel/fpu/init.c          |   11 -----------
 arch/x86/kernel/fpu/xstate.c        |    4 +++-
 4 files changed, 5 insertions(+), 16 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -45,7 +45,6 @@ extern void fpu__init_cpu_xstate(void);
 extern void fpu__init_system(struct cpuinfo_x86 *c);
 extern void fpu__init_check_bugs(void);
 extern void fpu__resume_cpu(void);
-extern u64 fpu__get_supported_xfeatures_mask(void);
 
 /*
  * Debugging facility:
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1715,9 +1715,8 @@ void print_cpu_info(struct cpuinfo_x86 *
 }
 
 /*
- * clearcpuid= was already parsed in fpu__init_parse_early_param.
- * But we need to keep a dummy __setup around otherwise it would
- * show up as an environment variable for init.
+ * clearcpuid= was already parsed in cpu_parse_early_param().  This dummy
+ * function prevents it from becoming an environment variable for init.
  */
 static __init int setup_clearcpuid(char *arg)
 {
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -216,17 +216,6 @@ static void __init fpu__init_system_xsta
 	fpu_user_xstate_size = fpu_kernel_xstate_size;
 }
 
-/*
- * Find supported xfeatures based on cpu features and command-line input.
- * This must be called after fpu__init_parse_early_param() is called and
- * xfeatures_mask is enumerated.
- */
-u64 __init fpu__get_supported_xfeatures_mask(void)
-{
-	return XFEATURE_MASK_USER_SUPPORTED |
-	       XFEATURE_MASK_SUPERVISOR_SUPPORTED;
-}
-
 /* Legacy code to initialize eager fpu mode. */
 static void __init fpu__init_system_ctx_switch(void)
 {
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -868,7 +868,9 @@ void __init fpu__init_system_xstate(void
 			xfeatures_mask_all &= ~BIT_ULL(i);
 	}
 
-	xfeatures_mask_all &= fpu__get_supported_xfeatures_mask();
+	xfeatures_mask_all &= XFEATURE_MASK_USER_SUPPORTED |
+			      XFEATURE_MASK_SUPERVISOR_SUPPORTED;
+
 	/* Store it for paranoia check at the end */
 	xfeatures = xfeatures_mask_all;
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 06/65] x86/fpu: Remove unused get_xsave_field_ptr()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (4 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 05/65] x86/fpu: Get rid of fpu__get_supported_xfeatures_mask() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 07/65] x86/fpu: Move inlines where they belong Thomas Gleixner
                   ` (59 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/xstate.h |    1 -
 arch/x86/kernel/fpu/xstate.c      |   30 ------------------------------
 2 files changed, 31 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -101,7 +101,6 @@ extern void __init update_regset_xstate_
 					     u64 xstate_mask);
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
-const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 struct membuf;
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -998,36 +998,6 @@ void *get_xsave_addr(struct xregs_state
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
 
-/*
- * This wraps up the common operations that need to occur when retrieving
- * data from xsave state.  It first ensures that the current task was
- * using the FPU and retrieves the data in to a buffer.  It then calculates
- * the offset of the requested field in the buffer.
- *
- * This function is safe to call whether the FPU is in use or not.
- *
- * Note that this only works on the current task.
- *
- * Inputs:
- *	@xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
- *	XFEATURE_SSE, etc...)
- * Output:
- *	address of the state in the xsave area or NULL if the state
- *	is not present or is in its 'init state'.
- */
-const void *get_xsave_field_ptr(int xfeature_nr)
-{
-	struct fpu *fpu = &current->thread.fpu;
-
-	/*
-	 * fpu__save() takes the CPU's xstate registers
-	 * and saves them off to the 'fpu memory buffer.
-	 */
-	fpu__save(fpu);
-
-	return get_xsave_addr(&fpu->state.xsave, xfeature_nr);
-}
-
 #ifdef CONFIG_ARCH_HAS_PKEYS
 
 /*


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 07/65] x86/fpu: Move inlines where they belong
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (5 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 06/65] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 18:43   ` Bae, Chang Seok
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 08/65] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
                   ` (58 subsequent siblings)
  65 siblings, 2 replies; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

They are only used in fpstate_init() and there is no point to have them in
a header just to make reading the code harder.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |   14 --------------
 arch/x86/kernel/fpu/core.c          |   15 +++++++++++++++
 2 files changed, 15 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -86,20 +86,6 @@ extern void fpstate_init_soft(struct swr
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-static inline void fpstate_init_xstate(struct xregs_state *xsave)
-{
-	/*
-	 * XRSTORS requires these bits set in xcomp_bv, or it will
-	 * trigger #GP:
-	 */
-	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
-}
-
-static inline void fpstate_init_fxstate(struct fxregs_state *fx)
-{
-	fx->cwd = 0x37f;
-	fx->mxcsr = MXCSR_DEFAULT;
-}
 extern void fpstate_sanitize_xstate(struct fpu *fpu);
 
 #define user_insn(insn, output, input...)				\
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -181,6 +181,21 @@ void fpu__save(struct fpu *fpu)
 	fpregs_unlock();
 }
 
+static inline void fpstate_init_xstate(struct xregs_state *xsave)
+{
+	/*
+	 * XRSTORS requires these bits set in xcomp_bv, or it will
+	 * trigger #GP:
+	 */
+	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
+}
+
+static inline void fpstate_init_fxstate(struct fxregs_state *fx)
+{
+	fx->cwd = 0x37f;
+	fx->mxcsr = MXCSR_DEFAULT;
+}
+
 /*
  * Legacy x87 fpstate state init:
  */


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 08/65] x86/fpu: Limit xstate copy size in xstateregs_set()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (6 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 07/65] x86/fpu: Move inlines where they belong Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 09/65] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
                   ` (57 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

If the count argument is larger than the xstate size, this will happily
copy beyond the end of xstate.

Fixes: 91c3dba7dbc1 ("x86/fpu/xstate: Fix PTRACE frames for XSAVES")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/regset.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -117,7 +117,7 @@ int xstateregs_set(struct task_struct *t
 	/*
 	 * A whole standard-format XSAVE buffer is needed:
 	 */
-	if ((pos != 0) || (count < fpu_user_xstate_size))
+	if (pos != 0 || count != fpu_user_xstate_size)
 		return -EFAULT;
 
 	xsave = &fpu->state.xsave;


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (7 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 08/65] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2022-07-14  4:04   ` [patch V4 09/65] " Andrei Vagin
  2021-06-23 12:01 ` [patch V4 10/65] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate() Thomas Gleixner
                   ` (56 subsequent siblings)
  65 siblings, 2 replies; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

xstateregs_set() operates on a stopped task and tries to copy the provided
buffer into the task's fpu.state.xsave buffer.

Any error while copying or invalid state detected after copying results in
wiping the target task's FPU state completely including supervisor states.

That's just wrong. The caller supplied invalid data or has a problem with
unmapped memory, so there is absolutely no justification to corrupt the
target state.

Fix this with the following modifications:

 1) If data has to be copied from userspace, allocate a buffer and copy from
    user first.

 2) Use copy_kernel_to_xstate() unconditionally so that header checking
    works correctly.

 3) Return on error without corrupting the target state.

This prevents corrupting states and lets the caller deal with the problem
it caused in the first place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/xstate.h |    4 ---
 arch/x86/kernel/fpu/regset.c      |   44 +++++++++++++++-----------------------
 arch/x86/kernel/fpu/xstate.c      |   14 ++++++------
 3 files changed, 26 insertions(+), 36 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -111,8 +111,4 @@ void copy_supervisor_to_kernel(struct xr
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
-
-/* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_user_xstate_header(const struct xstate_header *hdr);
-
 #endif
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -2,11 +2,13 @@
 /*
  * FPU register's regset abstraction, for ptrace, core dumps, etc.
  */
+#include <linux/sched/task_stack.h>
+#include <linux/vmalloc.h>
+
 #include <asm/fpu/internal.h>
 #include <asm/fpu/signal.h>
 #include <asm/fpu/regset.h>
 #include <asm/fpu/xstate.h>
-#include <linux/sched/task_stack.h>
 
 /*
  * The xstateregs_active() routine is the same as the regset_fpregs_active() routine,
@@ -108,10 +110,10 @@ int xstateregs_set(struct task_struct *t
 		  const void *kbuf, const void __user *ubuf)
 {
 	struct fpu *fpu = &target->thread.fpu;
-	struct xregs_state *xsave;
+	struct xregs_state *tmpbuf = NULL;
 	int ret;
 
-	if (!boot_cpu_has(X86_FEATURE_XSAVE))
+	if (!cpu_feature_enabled(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
 	/*
@@ -120,32 +122,22 @@ int xstateregs_set(struct task_struct *t
 	if (pos != 0 || count != fpu_user_xstate_size)
 		return -EFAULT;
 
-	xsave = &fpu->state.xsave;
-
-	fpu__prepare_write(fpu);
-
-	if (using_compacted_format()) {
-		if (kbuf)
-			ret = copy_kernel_to_xstate(xsave, kbuf);
-		else
-			ret = copy_user_to_xstate(xsave, ubuf);
-	} else {
-		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
-		if (!ret)
-			ret = validate_user_xstate_header(&xsave->header);
+	if (!kbuf) {
+		tmpbuf = vmalloc(count);
+		if (!tmpbuf)
+			return -ENOMEM;
+
+		if (copy_from_user(tmpbuf, ubuf, count)) {
+			ret = -EFAULT;
+			goto out;
+		}
 	}
 
-	/*
-	 * mxcsr reserved bits must be masked to zero for security reasons.
-	 */
-	xsave->i387.mxcsr &= mxcsr_feature_mask;
-
-	/*
-	 * In case of failure, mark all states as init:
-	 */
-	if (ret)
-		fpstate_init(&fpu->state);
+	fpu__prepare_write(fpu);
+	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
+out:
+	vfree(tmpbuf);
 	return ret;
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -543,7 +543,7 @@ int using_compacted_format(void)
 }
 
 /* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_user_xstate_header(const struct xstate_header *hdr)
+static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or supervisor features may be set */
 	if (hdr->xfeatures & ~xfeatures_mask_user())
@@ -1155,7 +1155,7 @@ void copy_xstate_to_kernel(struct membuf
 }
 
 /*
- * Convert from a ptrace standard-format kernel buffer to kernel XSAVES format
+ * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
  */
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
@@ -1202,14 +1202,16 @@ int copy_kernel_to_xstate(struct xregs_s
 	 */
 	xsave->header.xfeatures |= hdr.xfeatures;
 
+	/* mxcsr reserved bits must be masked to zero for historical reasons. */
+	xsave->i387.mxcsr &= mxcsr_feature_mask;
+
 	return 0;
 }
 
 /*
- * Convert from a ptrace or sigreturn standard-format user-space buffer to
- * kernel XSAVES format and copy to the target thread. This is called from
- * xstateregs_set(), as well as potentially from the sigreturn() and
- * rt_sigreturn() system calls.
+ * Convert from a sigreturn standard-format user-space buffer to kernel
+ * XSAVE[S] format and copy to the target thread. This is called from the
+ * sigreturn() and rt_sigreturn() system calls.
  */
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 {


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 10/65] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (8 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 09/65] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 11/65] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
                   ` (55 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Instead of masking out reserved bits, check them and reject the provided
state as invalid if not zero.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Validate MXCSR when FP|SSE|YMM are set. The quirk check is only
    correct for the copy function.
V2: New patch
---
 arch/x86/kernel/fpu/xstate.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1154,6 +1154,19 @@ void copy_xstate_to_kernel(struct membuf
 		membuf_zero(&to, to.left);
 }
 
+static inline bool mxcsr_valid(struct xstate_header *hdr, const u32 *mxcsr)
+{
+	u64 mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
+
+	/* Only check if it is in use */
+	if (hdr->xfeatures & mask) {
+		/* Reserved bits in MXCSR must be zero. */
+		if (*mxcsr & ~mxcsr_feature_mask)
+			return false;
+	}
+	return true;
+}
+
 /*
  * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
@@ -1172,6 +1185,9 @@ int copy_kernel_to_xstate(struct xregs_s
 	if (validate_user_xstate_header(&hdr))
 		return -EINVAL;
 
+	if (!mxcsr_valid(&hdr, kbuf + offsetof(struct fxregs_state, mxcsr)))
+		return -EINVAL;
+
 	for (i = 0; i < XFEATURE_MAX; i++) {
 		u64 mask = ((u64)1 << i);
 
@@ -1202,9 +1218,6 @@ int copy_kernel_to_xstate(struct xregs_s
 	 */
 	xsave->header.xfeatures |= hdr.xfeatures;
 
-	/* mxcsr reserved bits must be masked to zero for historical reasons. */
-	xsave->i387.mxcsr &= mxcsr_feature_mask;
-
 	return 0;
 }
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 11/65] x86/fpu: Simplify PTRACE_GETREGS code
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (9 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 10/65] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
  2021-06-23 12:01 ` [patch V4 12/65] x86/fpu: Rewrite xfpregs_set() Thomas Gleixner
                   ` (54 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Dave Hansen <dave.hansen@linux.intel.com>

ptrace() has interfaces that let a ptracer inspect a ptracee's register state.
This includes XSAVE state.  The ptrace() ABI includes a hardware-format XSAVE
buffer for both the SETREGS and GETREGS interfaces.

In the old days, the kernel buffer and the ptrace() ABI buffer were the
same boring non-compacted format.  But, since the advent of supervisor
states and the compacted format, the kernel buffer has diverged from the
format presented in the ABI.

This leads to two paths in the kernel:
1. Effectively a verbatim copy_to_user() which just copies the kernel buffer
   out to userspace.  This is used when the kernel buffer is kept in the
   non-compacted form which means that it shares a format with the ptrace
   ABI.
2. A one-state-at-a-time path: copy_xstate_to_kernel().  This is theoretically
   slower since it does a bunch of piecemeal copies.

Remove the verbatim copy case.  Speed probably does not matter in this path,
and the vast majority of new hardware will use the one-state-at-a-time path
anyway.  This ensures greater testing for the "slow" path.

This also makes enabling PKRU in this interface easier since a single path
can be patched instead of two.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/regset.c |   24 +++---------------------
 arch/x86/kernel/fpu/xstate.c |    6 +++---
 2 files changed, 6 insertions(+), 24 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -77,32 +77,14 @@ int xstateregs_get(struct task_struct *t
 		struct membuf to)
 {
 	struct fpu *fpu = &target->thread.fpu;
-	struct xregs_state *xsave;
 
-	if (!boot_cpu_has(X86_FEATURE_XSAVE))
+	if (!cpu_feature_enabled(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	xsave = &fpu->state.xsave;
-
 	fpu__prepare_read(fpu);
 
-	if (using_compacted_format()) {
-		copy_xstate_to_kernel(to, xsave);
-		return 0;
-	} else {
-		fpstate_sanitize_xstate(fpu);
-		/*
-		 * Copy the 48 bytes defined by the software into the xsave
-		 * area in the thread struct, so that we can copy the whole
-		 * area to user using one user_regset_copyout().
-		 */
-		memcpy(&xsave->i387.sw_reserved, xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes));
-
-		/*
-		 * Copy the xstate memory layout.
-		 */
-		return membuf_write(&to, xsave, fpu_user_xstate_size);
-	}
+	copy_xstate_to_kernel(to, &fpu->state.xsave);
+	return 0;
 }
 
 int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1069,11 +1069,11 @@ static void copy_feature(bool from_xstat
 }
 
 /*
- * Convert from kernel XSAVES compacted format to standard format and copy
- * to a kernel-space ptrace buffer.
+ * Convert from kernel XSAVE or XSAVES compacted format to UABI
+ * non-compacted format and copy to a kernel-space ptrace buffer.
  *
  * It supports partial copy but pos always starts from zero. This is called
- * from xstateregs_get() and there we check the CPU has XSAVES.
+ * from xstateregs_get() and there we check the CPU has XSAVE.
  */
 void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 {


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 12/65] x86/fpu: Rewrite xfpregs_set()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (10 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 11/65] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
  2021-06-23 12:01 ` [patch V4 13/65] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values Thomas Gleixner
                   ` (53 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Andy Lutomirski <luto@kernel.org>

xfpregs_set() was incomprehensible.  Almost all of the complexity was due
to trying to support nonsensically sized writes or -EFAULT errors that
would have partially or completely overwritten the destination before
failing.  Nonsensically sized input would only have been possible using
PTRACE_SETREGSET on REGSET_XFP.  Fortunately, it appears (based on Debian
code search results) that no one uses that API at all, let alone with the
wrong sized buffer.  Failed user access can be handled more cleanly by
first copying to kernel memory.

Just rewrite it to require sensible input.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V2: New patch picked up from Andy
---
 arch/x86/kernel/fpu/regset.c |   39 ++++++++++++++++++++++++---------------
 1 file changed, 24 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -47,30 +47,39 @@ int xfpregs_set(struct task_struct *targ
 		const void *kbuf, const void __user *ubuf)
 {
 	struct fpu *fpu = &target->thread.fpu;
+	struct user32_fxsr_struct newstate;
 	int ret;
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
+	BUILD_BUG_ON(sizeof(newstate) != sizeof(struct fxregs_state));
+
+	if (!cpu_feature_enabled(X86_FEATURE_FXSR))
 		return -ENODEV;
 
+	/* No funny business with partial or oversized writes is permitted. */
+	if (pos != 0 || count != sizeof(newstate))
+		return -EINVAL;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &newstate, 0, -1);
+	if (ret)
+		return ret;
+
+	/* Mask invalid MXCSR bits (for historical reasons). */
+	newstate.mxcsr &= mxcsr_feature_mask;
+
 	fpu__prepare_write(fpu);
-	fpstate_sanitize_xstate(fpu);
 
-	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-				 &fpu->state.fxsave, 0, -1);
+	/* Copy the state  */
+	memcpy(&fpu->state.fxsave, &newstate, sizeof(newstate));
+
+	/* Clear xmm8..15 */
+	BUILD_BUG_ON(sizeof(fpu->state.fxsave.xmm_space) != 16 * 16);
+	memset(&fpu->state.fxsave.xmm_space[8], 0, 8 * 16);
 
-	/*
-	 * mxcsr reserved bits must be masked to zero for security reasons.
-	 */
-	fpu->state.fxsave.mxcsr &= mxcsr_feature_mask;
-
-	/*
-	 * update the header bits in the xsave header, indicating the
-	 * presence of FP and SSE state.
-	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE))
+	/* Mark FP and SSE as in use when XSAVE is enabled */
+	if (use_xsave())
 		fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FPSSE;
 
-	return ret;
+	return 0;
 }
 
 int xstateregs_get(struct task_struct *target, const struct user_regset *regset,


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 13/65] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (11 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 12/65] x86/fpu: Rewrite xfpregs_set() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
  2021-06-23 12:01 ` [patch V4 14/65] x86/fpu: Clean up fpregs_set() Thomas Gleixner
                   ` (52 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Andy Lutomirski <luto@kernel.org>

There is no benefit from accepting and silently changing an invalid MXCSR
value supplied via ptrace().  Instead, return -EINVAL on invalid input.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch. Picked up from Andy.
---
 arch/x86/kernel/fpu/regset.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
---
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -64,8 +64,9 @@ int xfpregs_set(struct task_struct *targ
 	if (ret)
 		return ret;
 
-	/* Mask invalid MXCSR bits (for historical reasons). */
-	newstate.mxcsr &= mxcsr_feature_mask;
+	/* Do not allow an invalid MXCSR value. */
+	if (newstate.mxcsr & ~mxcsr_feature_mask)
+		return -EINVAL;
 
 	fpu__prepare_write(fpu);
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 14/65] x86/fpu: Clean up fpregs_set()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (12 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 13/65] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
  2021-06-23 12:01 ` [patch V4 15/65] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
                   ` (51 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Andy Lutomirski <luto@kernel.org>

fpregs_set() has unnecessary complexity to support short or nonzero-offset
writes and to handle the case in which a copy from userspace overwrites
some of the target buffer and then fails.  Support for partial writes is
useless -- just require that the write have offset 0 and the correct size,
and copy into a temporary kernel buffer to avoid clobbering the state if
the user access fails.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V2: New patch. Picked up from Andy
---
 arch/x86/kernel/fpu/regset.c |   31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)
---
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -304,31 +304,32 @@ int fpregs_set(struct task_struct *targe
 	struct user_i387_ia32_struct env;
 	int ret;
 
-	fpu__prepare_write(fpu);
-	fpstate_sanitize_xstate(fpu);
+	/* No funny business with partial or oversized writes is permitted. */
+	if (pos != 0 || count != sizeof(struct user_i387_ia32_struct))
+		return -EINVAL;
 
-	if (!boot_cpu_has(X86_FEATURE_FPU))
+	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
-		return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-					  &fpu->state.fsave, 0,
-					  -1);
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
+	if (ret)
+		return ret;
 
-	if (pos > 0 || count < sizeof(env))
-		convert_from_fxsr(&env, target);
+	fpu__prepare_write(fpu);
 
-	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
-	if (!ret)
-		convert_to_fxsr(&target->thread.fpu.state.fxsave, &env);
+	if (cpu_feature_enabled(X86_FEATURE_FXSR))
+		convert_to_fxsr(&fpu->state.fxsave, &env);
+	else
+		memcpy(&fpu->state.fsave, &env, sizeof(env));
 
 	/*
-	 * update the header bit in the xsave header, indicating the
+	 * Update the header bit in the xsave header, indicating the
 	 * presence of FP.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE))
+	if (cpu_feature_enabled(X86_FEATURE_XSAVE))
 		fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FP;
-	return ret;
+
+	return 0;
 }
 
 #endif	/* CONFIG_X86_32 || CONFIG_IA32_EMULATION */


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 15/65] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (13 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 14/65] x86/fpu: Clean up fpregs_set() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-24 15:09   ` [PATCH] x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 16/65] x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get() Thomas Gleixner
                   ` (50 subsequent siblings)
  65 siblings, 2 replies; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

When xsave with init state optimization is used then a component's state
in the task's xsave buffer can be stale when the corresponding feature bit
is not set.

fpregs_get() and xfpregs_get() invoke fpstate_sanitize_xstate() to update
the task's xsave buffer before retrieving the FX or FP state. That's just
duplicated code as copy_xstate_to_kernel() already handles this correctly.

Add a copy mode argument to the function which allows to restrict the state
copy to the FP and SSE features.

Also rename the function to copy_xstate_to_uabi_buf() so the name reflects
what it is doing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Rename to copy_xstate_to_uabi_buf() - Boris
V2: New patch
---
 arch/x86/include/asm/fpu/xstate.h |   12 +++++++++-
 arch/x86/kernel/fpu/regset.c      |    2 -
 arch/x86/kernel/fpu/xstate.c      |   42 ++++++++++++++++++++++++++++----------
 3 files changed, 42 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -103,12 +103,20 @@ extern void __init update_regset_xstate_
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
-struct membuf;
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
 void copy_supervisor_to_kernel(struct xregs_state *xsave);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
+enum xstate_copy_mode {
+	XSTATE_COPY_FP,
+	XSTATE_COPY_FX,
+	XSTATE_COPY_XSAVE,
+};
+
+struct membuf;
+void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
+			     enum xstate_copy_mode mode);
+
 #endif
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -94,7 +94,7 @@ int xstateregs_get(struct task_struct *t
 
 	fpu__prepare_read(fpu);
 
-	copy_xstate_to_kernel(to, &fpu->state.xsave);
+	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
 	return 0;
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1068,14 +1068,20 @@ static void copy_feature(bool from_xstat
 	membuf_write(to, from_xstate ? xstate : init_xstate, size);
 }
 
-/*
- * Convert from kernel XSAVE or XSAVES compacted format to UABI
- * non-compacted format and copy to a kernel-space ptrace buffer.
+/**
+ * copy_xstate_to_uabi_buf - Copy kernel saved xstate to a UABI buffer
+ * @to:		membuf descriptor
+ * @xsave:	The kernel xstate buffer to copy from
+ * @copy_mode:	The requested copy mode
  *
- * It supports partial copy but pos always starts from zero. This is called
- * from xstateregs_get() and there we check the CPU has XSAVE.
+ * Converts from kernel XSAVE or XSAVES compacted format to UABI conforming
+ * format, i.e. from the kernel internal hardware dependent storage format
+ * to the requested @mode. UABI XSTATE is always uncompacted!
+ *
+ * It supports partial copy but @to.pos always starts from zero.
  */
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
+void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
+			     enum xstate_copy_mode copy_mode)
 {
 	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
 	struct xregs_state *xinit = &init_fpstate.xsave;
@@ -1083,12 +1089,22 @@ void copy_xstate_to_kernel(struct membuf
 	unsigned int zerofrom;
 	int i;
 
-	/*
-	 * The destination is a ptrace buffer; we put in only user xstates:
-	 */
-	memset(&header, 0, sizeof(header));
 	header.xfeatures = xsave->header.xfeatures;
-	header.xfeatures &= xfeatures_mask_user();
+
+	/* Mask out the feature bits depending on copy mode */
+	switch (copy_mode) {
+	case XSTATE_COPY_FP:
+		header.xfeatures &= XFEATURE_MASK_FP;
+		break;
+
+	case XSTATE_COPY_FX:
+		header.xfeatures &= XFEATURE_MASK_FP | XFEATURE_MASK_SSE;
+		break;
+
+	case XSTATE_COPY_XSAVE:
+		header.xfeatures &= xfeatures_mask_user();
+		break;
+	}
 
 	/* Copy FP state up to MXCSR */
 	copy_feature(header.xfeatures & XFEATURE_MASK_FP, &to, &xsave->i387,
@@ -1109,6 +1125,9 @@ void copy_xstate_to_kernel(struct membuf
 		     &to, &xsave->i387.xmm_space, &xinit->i387.xmm_space,
 		     sizeof(xsave->i387.xmm_space));
 
+	if (copy_mode != XSTATE_COPY_XSAVE)
+		goto out;
+
 	/* Zero the padding area */
 	membuf_zero(&to, sizeof(xsave->i387.padding));
 
@@ -1150,6 +1169,7 @@ void copy_xstate_to_kernel(struct membuf
 		zerofrom = xstate_offsets[i] + xstate_sizes[i];
 	}
 
+out:
 	if (to.left)
 		membuf_zero(&to, to.left);
 }


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 16/65] x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (14 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 15/65] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 17/65] x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get() Thomas Gleixner
                   ` (49 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Use the new functionality of copy_xstate_to_uabi_buf() to retrieve the
FX state when XSAVE* is in use. This avoids overwriting the FPU state
buffer with fpstate_sanitize_xstate() which is error prone and duplicated
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Adopted to function rename
V2: New patch
---
 arch/x86/kernel/fpu/regset.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -33,13 +33,18 @@ int xfpregs_get(struct task_struct *targ
 {
 	struct fpu *fpu = &target->thread.fpu;
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
+	if (!cpu_feature_enabled(X86_FEATURE_FXSR))
 		return -ENODEV;
 
 	fpu__prepare_read(fpu);
-	fpstate_sanitize_xstate(fpu);
 
-	return membuf_write(&to, &fpu->state.fxsave, sizeof(struct fxregs_state));
+	if (!use_xsave()) {
+		return membuf_write(&to, &fpu->state.fxsave,
+				    sizeof(fpu->state.fxsave));
+	}
+
+	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_FX);
+	return 0;
 }
 
 int xfpregs_set(struct task_struct *target, const struct user_regset *regset,


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 17/65] x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (15 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 16/65] x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 18/65] x86/fpu: Remove fpstate_sanitize_xstate() Thomas Gleixner
                   ` (48 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Use the new functionality of copy_xstate_to_uabi_buf() to retrieve the
FX state when XSAVE* is in use. This avoids to overwrite the FPU state
buffer with fpstate_sanitize_xstate() which is error prone and duplicated
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Adopted to function rename
V2: New patch
---
 arch/x86/kernel/fpu/regset.c |   30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -211,10 +211,10 @@ static inline u32 twd_fxsr_to_i387(struc
  * FXSR floating point environment conversions.
  */
 
-void
-convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
+static void __convert_from_fxsr(struct user_i387_ia32_struct *env,
+				struct task_struct *tsk,
+				struct fxregs_state *fxsave)
 {
-	struct fxregs_state *fxsave = &tsk->thread.fpu.state.fxsave;
 	struct _fpreg *to = (struct _fpreg *) &env->st_space[0];
 	struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0];
 	int i;
@@ -248,6 +248,12 @@ convert_from_fxsr(struct user_i387_ia32_
 		memcpy(&to[i], &from[i], sizeof(to[0]));
 }
 
+void
+convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
+{
+	__convert_from_fxsr(env, tsk, &tsk->thread.fpu.state.fxsave);
+}
+
 void convert_to_fxsr(struct fxregs_state *fxsave,
 		     const struct user_i387_ia32_struct *env)
 
@@ -280,25 +286,29 @@ int fpregs_get(struct task_struct *targe
 {
 	struct fpu *fpu = &target->thread.fpu;
 	struct user_i387_ia32_struct env;
+	struct fxregs_state fxsave, *fx;
 
 	fpu__prepare_read(fpu);
 
-	if (!boot_cpu_has(X86_FEATURE_FPU))
+	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return fpregs_soft_get(target, regset, to);
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR)) {
+	if (!cpu_feature_enabled(X86_FEATURE_FXSR)) {
 		return membuf_write(&to, &fpu->state.fsave,
 				    sizeof(struct fregs_state));
 	}
 
-	fpstate_sanitize_xstate(fpu);
+	if (use_xsave()) {
+		struct membuf mb = { .p = &fxsave, .left = sizeof(fxsave) };
 
-	if (to.left == sizeof(env)) {
-		convert_from_fxsr(to.p, target);
-		return 0;
+		/* Handle init state optimized xstate correctly */
+		copy_xstate_to_uabi_buf(mb, &fpu->state.xsave, XSTATE_COPY_FP);
+		fx = &fxsave;
+	} else {
+		fx = &fpu->state.fxsave;
 	}
 
-	convert_from_fxsr(&env, target);
+	__convert_from_fxsr(&env, target, fx);
 	return membuf_write(&to, &env, sizeof(env));
 }
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 18/65] x86/fpu: Remove fpstate_sanitize_xstate()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (16 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 17/65] x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 19/65] x86/fpu/regset: Move fpu__read_begin() into regset Thomas Gleixner
                   ` (47 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

No more users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/include/asm/fpu/internal.h |    2 
 arch/x86/kernel/fpu/xstate.c        |   79 ------------------------------------
 2 files changed, 81 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -86,8 +86,6 @@ extern void fpstate_init_soft(struct swr
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-extern void fpstate_sanitize_xstate(struct fpu *fpu);
-
 #define user_insn(insn, output, input...)				\
 ({									\
 	int err;							\
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -129,85 +129,6 @@ static bool xfeature_is_supervisor(int x
 }
 
 /*
- * When executing XSAVEOPT (or other optimized XSAVE instructions), if
- * a processor implementation detects that an FPU state component is still
- * (or is again) in its initialized state, it may clear the corresponding
- * bit in the header.xfeatures field, and can skip the writeout of registers
- * to the corresponding memory layout.
- *
- * This means that when the bit is zero, the state component might still contain
- * some previous - non-initialized register state.
- *
- * Before writing xstate information to user-space we sanitize those components,
- * to always ensure that the memory layout of a feature will be in the init state
- * if the corresponding header bit is zero. This is to ensure that user-space doesn't
- * see some stale state in the memory layout during signal handling, debugging etc.
- */
-void fpstate_sanitize_xstate(struct fpu *fpu)
-{
-	struct fxregs_state *fx = &fpu->state.fxsave;
-	int feature_bit;
-	u64 xfeatures;
-
-	if (!use_xsaveopt())
-		return;
-
-	xfeatures = fpu->state.xsave.header.xfeatures;
-
-	/*
-	 * None of the feature bits are in init state. So nothing else
-	 * to do for us, as the memory layout is up to date.
-	 */
-	if ((xfeatures & xfeatures_mask_all) == xfeatures_mask_all)
-		return;
-
-	/*
-	 * FP is in init state
-	 */
-	if (!(xfeatures & XFEATURE_MASK_FP)) {
-		fx->cwd = 0x37f;
-		fx->swd = 0;
-		fx->twd = 0;
-		fx->fop = 0;
-		fx->rip = 0;
-		fx->rdp = 0;
-		memset(fx->st_space, 0, sizeof(fx->st_space));
-	}
-
-	/*
-	 * SSE is in init state
-	 */
-	if (!(xfeatures & XFEATURE_MASK_SSE))
-		memset(fx->xmm_space, 0, sizeof(fx->xmm_space));
-
-	/*
-	 * First two features are FPU and SSE, which above we handled
-	 * in a special way already:
-	 */
-	feature_bit = 0x2;
-	xfeatures = (xfeatures_mask_user() & ~xfeatures) >> 2;
-
-	/*
-	 * Update all the remaining memory layouts according to their
-	 * standard xstate layout, if their header bit is in the init
-	 * state:
-	 */
-	while (xfeatures) {
-		if (xfeatures & 0x1) {
-			int offset = xstate_comp_offsets[feature_bit];
-			int size = xstate_sizes[feature_bit];
-
-			memcpy((void *)fx + offset,
-			       (void *)&init_fpstate.xsave + offset,
-			       size);
-		}
-
-		xfeatures >>= 1;
-		feature_bit++;
-	}
-}
-
-/*
  * Enable the extended processor state save/restore feature.
  * Called once per CPU onlining.
  */


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 19/65] x86/fpu/regset: Move fpu__read_begin() into regset
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (17 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 18/65] x86/fpu: Remove fpstate_sanitize_xstate() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 20/65] x86/fpu: Move fpu__write_begin() to regset Thomas Gleixner
                   ` (46 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The function can only be used from the regset get() callbacks safely. So
there is no reason to have it globaly exposed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    1 -
 arch/x86/kernel/fpu/core.c          |   20 --------------------
 arch/x86/kernel/fpu/regset.c        |   22 +++++++++++++++++++---
 3 files changed, 19 insertions(+), 24 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -26,7 +26,6 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__prepare_read(struct fpu *fpu);
 extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -282,26 +282,6 @@ static void fpu__initialize(struct fpu *
 }
 
 /*
- * This function must be called before we read a task's fpstate.
- *
- * There's two cases where this gets called:
- *
- * - for the current task (when coredumping), in which case we have
- *   to save the latest FPU registers into the fpstate,
- *
- * - or it's called for stopped tasks (ptrace), in which case the
- *   registers were already saved by the context-switch code when
- *   the task scheduled out.
- *
- * If the task has used the FPU before then save it.
- */
-void fpu__prepare_read(struct fpu *fpu)
-{
-	if (fpu == &current->thread.fpu)
-		fpu__save(fpu);
-}
-
-/*
  * This function must be called before we write a task's fpstate.
  *
  * Invalidate any cached FPU registers.
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -28,6 +28,22 @@ int regset_xregset_fpregs_active(struct
 		return 0;
 }
 
+/*
+ * The regset get() functions are invoked from:
+ *
+ *   - coredump to dump the current task's fpstate. If the current task
+ *     owns the FPU then the memory state has to be synchronized and the
+ *     FPU register state preserved. Otherwise fpstate is already in sync.
+ *
+ *   - ptrace to dump fpstate of a stopped task, in which case the registers
+ *     have already been saved to fpstate on context switch.
+ */
+static void sync_fpstate(struct fpu *fpu)
+{
+	if (fpu == &current->thread.fpu)
+		fpu__save(fpu);
+}
+
 int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
@@ -36,7 +52,7 @@ int xfpregs_get(struct task_struct *targ
 	if (!cpu_feature_enabled(X86_FEATURE_FXSR))
 		return -ENODEV;
 
-	fpu__prepare_read(fpu);
+	sync_fpstate(fpu);
 
 	if (!use_xsave()) {
 		return membuf_write(&to, &fpu->state.fxsave,
@@ -96,7 +112,7 @@ int xstateregs_get(struct task_struct *t
 	if (!cpu_feature_enabled(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	fpu__prepare_read(fpu);
+	sync_fpstate(fpu);
 
 	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
 	return 0;
@@ -287,7 +303,7 @@ int fpregs_get(struct task_struct *targe
 	struct user_i387_ia32_struct env;
 	struct fxregs_state fxsave, *fx;
 
-	fpu__prepare_read(fpu);
+	sync_fpstate(fpu);
 
 	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return fpregs_soft_get(target, regset, to);


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 20/65] x86/fpu: Move fpu__write_begin() to regset
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (18 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 19/65] x86/fpu/regset: Move fpu__read_begin() into regset Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 21/65] x86/fpu: Get rid of using_compacted_format() Thomas Gleixner
                   ` (45 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The only usecase for fpu__write_begin is the set() callback of regset, so
the function is pointlessly global.

Move it to the regset code and rename it to fpu_force_restore() which is
exactly decribing what the function does.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    1 -
 arch/x86/kernel/fpu/core.c          |   24 ------------------------
 arch/x86/kernel/fpu/regset.c        |   25 ++++++++++++++++++++++---
 3 files changed, 22 insertions(+), 28 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -26,7 +26,6 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -282,30 +282,6 @@ static void fpu__initialize(struct fpu *
 }
 
 /*
- * This function must be called before we write a task's fpstate.
- *
- * Invalidate any cached FPU registers.
- *
- * After this function call, after registers in the fpstate are
- * modified and the child task has woken up, the child task will
- * restore the modified FPU state from the modified context. If we
- * didn't clear its cached status here then the cached in-registers
- * state pending on its former CPU could be restored, corrupting
- * the modifications.
- */
-void fpu__prepare_write(struct fpu *fpu)
-{
-	/*
-	 * Only stopped child tasks can be used to modify the FPU
-	 * state in the fpstate buffer:
-	 */
-	WARN_ON_FPU(fpu == &current->thread.fpu);
-
-	/* Invalidate any cached state: */
-	__fpu_invalidate_fpregs_state(fpu);
-}
-
-/*
  * Drops current FPU state: deactivates the fpregs and
  * the fpstate. NOTE: it still leaves previous contents
  * in the fpregs in the eager-FPU case.
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -44,6 +44,25 @@ static void sync_fpstate(struct fpu *fpu
 		fpu__save(fpu);
 }
 
+/*
+ * Invalidate cached FPU registers before modifying the stopped target
+ * task's fpstate.
+ *
+ * This forces the target task on resume to restore the FPU registers from
+ * modified fpstate. Otherwise the task might skip the restore and operate
+ * with the cached FPU registers which discards the modifications.
+ */
+static void fpu_force_restore(struct fpu *fpu)
+{
+	/*
+	 * Only stopped child tasks can be used to modify the FPU
+	 * state in the fpstate buffer:
+	 */
+	WARN_ON_FPU(fpu == &current->thread.fpu);
+
+	__fpu_invalidate_fpregs_state(fpu);
+}
+
 int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
@@ -88,7 +107,7 @@ int xfpregs_set(struct task_struct *targ
 	if (newstate.mxcsr & ~mxcsr_feature_mask)
 		return -EINVAL;
 
-	fpu__prepare_write(fpu);
+	fpu_force_restore(fpu);
 
 	/* Copy the state  */
 	memcpy(&fpu->state.fxsave, &newstate, sizeof(newstate));
@@ -146,7 +165,7 @@ int xstateregs_set(struct task_struct *t
 		}
 	}
 
-	fpu__prepare_write(fpu);
+	fpu_force_restore(fpu);
 	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
 out:
@@ -346,7 +365,7 @@ int fpregs_set(struct task_struct *targe
 	if (ret)
 		return ret;
 
-	fpu__prepare_write(fpu);
+	fpu_force_restore(fpu);
 
 	if (cpu_feature_enabled(X86_FEATURE_FXSR))
 		convert_to_fxsr(&fpu->state.fxsave, &env);


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 21/65] x86/fpu: Get rid of using_compacted_format()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (19 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 20/65] x86/fpu: Move fpu__write_begin() to regset Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 22/65] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
                   ` (44 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

This function is pointlessly global and a complete misnomer because it's
usage is related to both supervisor state checks and compacted format
checks. Remove it and just make the conditions check the XSAVES feature.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V2: New patch.
---
 arch/x86/include/asm/fpu/xstate.h |    1 -
 arch/x86/kernel/fpu/xstate.c      |   22 ++++------------------
 2 files changed, 4 insertions(+), 19 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -101,7 +101,6 @@ extern void __init update_regset_xstate_
 					     u64 xstate_mask);
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
-int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -449,20 +449,6 @@ int xfeature_size(int xfeature_nr)
 	return eax;
 }
 
-/*
- * 'XSAVES' implies two different things:
- * 1. saving of supervisor/system state
- * 2. using the compacted format
- *
- * Use this function when dealing with the compacted format so
- * that it is obvious which aspect of 'XSAVES' is being handled
- * by the calling code.
- */
-int using_compacted_format(void)
-{
-	return boot_cpu_has(X86_FEATURE_XSAVES);
-}
-
 /* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
 static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
@@ -581,9 +567,9 @@ static void do_extra_xstate_size_checks(
 		check_xstate_against_struct(i);
 		/*
 		 * Supervisor state components can be managed only by
-		 * XSAVES, which is compacted-format only.
+		 * XSAVES.
 		 */
-		if (!using_compacted_format())
+		if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
 			XSTATE_WARN_ON(xfeature_is_supervisor(i));
 
 		/* Align from the end of the previous feature */
@@ -593,9 +579,9 @@ static void do_extra_xstate_size_checks(
 		 * The offset of a given state in the non-compacted
 		 * format is given to us in a CPUID leaf.  We check
 		 * them for being ordered (increasing offsets) in
-		 * setup_xstate_features().
+		 * setup_xstate_features(). XSAVES uses compacted format.
 		 */
-		if (!using_compacted_format())
+		if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
 			paranoid_xstate_size = xfeature_uncompacted_offset(i);
 		/*
 		 * The compacted-format offset always depends on where


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 22/65] x86/kvm: Avoid looking up PKRU in XSAVE buffer
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (20 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 21/65] x86/fpu: Get rid of using_compacted_format() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
  2021-06-23 12:01 ` [patch V4 23/65] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
                   ` (43 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Dave Hansen <dave.hansen@linux.intel.com>

PKRU is being removed from the kernel XSAVE/FPU buffers.  This removal
will probably include warnings for code that look up PKRU in those
buffers.

KVM currently looks up the location of PKRU but doesn't even use the
pointer that it gets back.  Rework the code to avoid calling
get_xsave_addr() except in cases where its result is actually used.

This makes the code more clear and also avoids the inevitable PKRU
warnings.

This is probably a good cleanup and could go upstream idependently
of any PKRU rework.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kvm/x86.c |   41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4589,20 +4589,21 @@ static void fill_xsave(u8 *dest, struct
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *src = get_xsave_addr(xsave, xfeature_nr);
+		void *src;
 
-		if (src) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(dest + offset, &vcpu->arch.pkru,
-				       sizeof(vcpu->arch.pkru));
-			else
-				memcpy(dest + offset, src, size);
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
 
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(dest + offset, &vcpu->arch.pkru,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			src = get_xsave_addr(xsave, xfeature_nr);
+			if (src)
+				memcpy(dest + offset, src, size);
 		}
 
 		valid -= xfeature_mask;
@@ -4632,18 +4633,20 @@ static void load_xsave(struct kvm_vcpu *
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *dest = get_xsave_addr(xsave, xfeature_nr);
 
-		if (dest) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(&vcpu->arch.pkru, src + offset,
-				       sizeof(vcpu->arch.pkru));
-			else
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
+
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(&vcpu->arch.pkru, src + offset,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			void *dest = get_xsave_addr(xsave, xfeature_nr);
+
+			if (dest)
 				memcpy(dest, src + offset, size);
 		}
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 23/65] x86/fpu: Cleanup arch_set_user_pkey_access()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (21 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 22/65] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 24/65] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
                   ` (42 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The function does a sanity check with a WARN_ON_ONCE() but happily proceeds
when the pkey argument is out of range.

Clean it up.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/xstate.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -912,11 +912,10 @@ EXPORT_SYMBOL_GPL(get_xsave_addr);
  * rights for @pkey to @init_val.
  */
 int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
-		unsigned long init_val)
+			      unsigned long init_val)
 {
-	u32 old_pkru;
-	int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
-	u32 new_pkru_bits = 0;
+	u32 old_pkru, new_pkru_bits = 0;
+	int pkey_shift;
 
 	/*
 	 * This check implies XSAVE support.  OSPKE only gets
@@ -930,7 +929,8 @@ int arch_set_user_pkey_access(struct tas
 	 * values originating from in-kernel users.  Complain
 	 * if a bad value is observed.
 	 */
-	WARN_ON_ONCE(pkey >= arch_max_pkey());
+	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
+		return -EINVAL;
 
 	/* Set the bits we need in PKRU:  */
 	if (init_val & PKEY_DISABLE_ACCESS)
@@ -939,6 +939,7 @@ int arch_set_user_pkey_access(struct tas
 		new_pkru_bits |= PKRU_WD_BIT;
 
 	/* Shift the bits in to the correct place in PKRU for pkey: */
+	pkey_shift = pkey * PKRU_BITS_PER_PKEY;
 	new_pkru_bits <<= pkey_shift;
 
 	/* Get old PKRU and mask off any old bits in place: */


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 24/65] x86/fpu: Get rid of copy_supervisor_to_kernel()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (22 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 23/65] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 25/65] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
                   ` (41 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

If the fast path of restoring the FPU state on sigreturn fails or is not
taken and the current task's FPU is active then the FPU has to be
deactivated for the slow path to allow a safe update of the tasks FPU
memory state.

With supervisor states enabled, this requires to save the supervisor state
in the memory state first. Supervisor states require XSAVES so saving only
the supervisor state requires to reshuffle the memory buffer because XSAVES
uses the compacted format and therefore stores the supervisor states at the
beginning of the memory state. That's just an overengineered optimization.

Get rid of it and save the full state for this case.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/xstate.h |    1 
 arch/x86/kernel/fpu/signal.c      |   13 +++++---
 arch/x86/kernel/fpu/xstate.c      |   55 --------------------------------------
 3 files changed, 8 insertions(+), 61 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -104,7 +104,6 @@ void *get_xsave_addr(struct xregs_state
 int xfeature_size(int xfeature_nr);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_supervisor_to_kernel(struct xregs_state *xsave);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -401,15 +401,18 @@ static int __fpu__restore_sig(void __use
 	 * the optimisation).
 	 */
 	fpregs_lock();
-
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-
 		/*
-		 * Supervisor states are not modified by user space input.  Save
-		 * current supervisor states first and invalidate the FPU regs.
+		 * If supervisor states are available then save the
+		 * hardware state in current's fpstate so that the
+		 * supervisor state is preserved. Save the full state for
+		 * simplicity. There is no point in optimizing this by only
+		 * saving the supervisor states and then shuffle them to
+		 * the right place in memory. This is the slow path and the
+		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
-			copy_supervisor_to_kernel(&fpu->state.xsave);
+			copy_xregs_to_kernel(&fpu->state.xsave);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1197,61 +1197,6 @@ int copy_user_to_xstate(struct xregs_sta
 	return 0;
 }
 
-/*
- * Save only supervisor states to the kernel buffer.  This blows away all
- * old states, and is intended to be used only in __fpu__restore_sig(), where
- * user states are restored from the user buffer.
- */
-void copy_supervisor_to_kernel(struct xregs_state *xstate)
-{
-	struct xstate_header *header;
-	u64 max_bit, min_bit;
-	u32 lmask, hmask;
-	int err, i;
-
-	if (WARN_ON(!boot_cpu_has(X86_FEATURE_XSAVES)))
-		return;
-
-	if (!xfeatures_mask_supervisor())
-		return;
-
-	max_bit = __fls(xfeatures_mask_supervisor());
-	min_bit = __ffs(xfeatures_mask_supervisor());
-
-	lmask = xfeatures_mask_supervisor();
-	hmask = xfeatures_mask_supervisor() >> 32;
-	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-
-	/* We should never fault when copying to a kernel buffer: */
-	if (WARN_ON_FPU(err))
-		return;
-
-	/*
-	 * At this point, the buffer has only supervisor states and must be
-	 * converted back to normal kernel format.
-	 */
-	header = &xstate->header;
-	header->xcomp_bv |= xfeatures_mask_all;
-
-	/*
-	 * This only moves states up in the buffer.  Start with
-	 * the last state and move backwards so that states are
-	 * not overwritten until after they are moved.  Note:
-	 * memmove() allows overlapping src/dst buffers.
-	 */
-	for (i = max_bit; i >= min_bit; i--) {
-		u8 *xbuf = (u8 *)xstate;
-
-		if (!((header->xfeatures >> i) & 1))
-			continue;
-
-		/* Move xfeature 'i' into its normal location */
-		memmove(xbuf + xstate_comp_offsets[i],
-			xbuf + xstate_supervisor_only_offsets[i],
-			xstate_sizes[i]);
-	}
-}
-
 /**
  * copy_dynamic_supervisor_to_kernel() - Save dynamic supervisor states to
  *                                       an xsave area


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 25/65] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (23 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 24/65] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 26/65] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
                   ` (40 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The function names for xsave[s]/xrstor[s] operations are horribly named and
a permanent source of confusion.

Rename:
	copy_xregs_to_kernel() to os_xsave()
	copy_kernel_to_xregs() to os_xrstor()

These are truly low level wrappers around the actual instructions
XSAVE[OPT]/XRSTOR and XSAVES/XRSTORS with the twist that the selection
based on the available CPU features happens with an alternative to avoid
conditionals all over the place and to provide the best performance for hot
paths.

The os_ prefix tells that this is the OS selected mechanism.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Rename (Boris)
---
 arch/x86/include/asm/fpu/internal.h |   17 +++++++++++------
 arch/x86/kernel/fpu/core.c          |    7 +++----
 arch/x86/kernel/fpu/signal.c        |   21 +++++++++++----------
 arch/x86/kernel/fpu/xstate.c        |    2 +-
 4 files changed, 26 insertions(+), 21 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -263,7 +263,7 @@ static inline void fxsave_to_kernel(stru
  * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
  */
-static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
+static inline void os_xrstor_booting(struct xregs_state *xstate)
 {
 	u64 mask = -1;
 	u32 lmask = mask;
@@ -286,8 +286,11 @@ static inline void copy_kernel_to_xregs_
 
 /*
  * Save processor xstate to xsave area.
+ *
+ * Uses either XSAVE or XSAVEOPT or XSAVES depending on the CPU features
+ * and command line options. The choice is permanent until the next reboot.
  */
-static inline void copy_xregs_to_kernel(struct xregs_state *xstate)
+static inline void os_xsave(struct xregs_state *xstate)
 {
 	u64 mask = xfeatures_mask_all;
 	u32 lmask = mask;
@@ -304,8 +307,10 @@ static inline void copy_xregs_to_kernel(
 
 /*
  * Restore processor xstate from xsave area.
+ *
+ * Uses XRSTORS when XSAVES is used, XRSTOR otherwise.
  */
-static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask)
+static inline void os_xrstor(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
@@ -366,13 +371,13 @@ static inline int copy_user_to_xregs(str
  * Restore xstate from kernel space xsave area, return an error code instead of
  * an exception.
  */
-static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
+static inline int os_xrstor_safe(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
 
-	if (static_cpu_has(X86_FEATURE_XSAVES))
+	if (cpu_feature_enabled(X86_FEATURE_XSAVES))
 		XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
 	else
 		XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
@@ -385,7 +390,7 @@ extern int copy_fpregs_to_fpstate(struct
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
-		copy_kernel_to_xregs(&fpstate->xsave, mask);
+		os_xrstor(&fpstate->xsave, mask);
 	} else {
 		if (use_fxsr())
 			copy_kernel_to_fxregs(&fpstate->fxsave);
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -95,7 +95,7 @@ EXPORT_SYMBOL(irq_fpu_usable);
 int copy_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
-		copy_xregs_to_kernel(&fpu->state.xsave);
+		os_xsave(&fpu->state.xsave);
 
 		/*
 		 * AVX512 state is tracked here because its use is
@@ -358,7 +358,7 @@ void fpu__drop(struct fpu *fpu)
 static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
 {
 	if (use_xsave())
-		copy_kernel_to_xregs(&init_fpstate.xsave, features_mask);
+		os_xrstor(&init_fpstate.xsave, features_mask);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
 		copy_kernel_to_fxregs(&init_fpstate.fxsave);
 	else
@@ -389,8 +389,7 @@ static void fpu__clear(struct fpu *fpu,
 	if (user_only) {
 		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
 		    xfeatures_mask_supervisor())
-			copy_kernel_to_xregs(&fpu->state.xsave,
-					     xfeatures_mask_supervisor());
+			os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
 		copy_init_fpstate_to_fpregs(xfeatures_mask_user());
 	} else {
 		copy_init_fpstate_to_fpregs(xfeatures_mask_all);
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -261,14 +261,14 @@ static int copy_user_to_fpregs_zeroing(v
 
 			r = copy_user_to_fxregs(buf);
 			if (!r)
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
 			r = copy_user_to_xregs(buf, xbv);
 			if (!r && unlikely(init_bv))
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;
 		}
 	} else if (use_fxsr()) {
@@ -356,9 +356,10 @@ static int __fpu__restore_sig(void __use
 			 * has been copied to the kernel one.
 			 */
 			if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
-			    xfeatures_mask_supervisor())
-				copy_kernel_to_xregs(&fpu->state.xsave,
-						     xfeatures_mask_supervisor());
+			    xfeatures_mask_supervisor()) {
+				os_xrstor(&fpu->state.xsave,
+					  xfeatures_mask_supervisor());
+			}
 			fpregs_mark_activate();
 			fpregs_unlock();
 			return 0;
@@ -412,7 +413,7 @@ static int __fpu__restore_sig(void __use
 		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
-			copy_xregs_to_kernel(&fpu->state.xsave);
+			os_xsave(&fpu->state.xsave);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
@@ -430,14 +431,14 @@ static int __fpu__restore_sig(void __use
 
 		fpregs_lock();
 		if (unlikely(init_bv))
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			os_xrstor(&init_fpstate.xsave, init_bv);
 
 		/*
 		 * Restore previously saved supervisor xstates along with
 		 * copied-in user xstates.
 		 */
-		ret = copy_kernel_to_xregs_err(&fpu->state.xsave,
-					       user_xfeatures | xfeatures_mask_supervisor());
+		ret = os_xrstor_safe(&fpu->state.xsave,
+				     user_xfeatures | xfeatures_mask_supervisor());
 
 	} else if (use_fxsr()) {
 		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
@@ -454,7 +455,7 @@ static int __fpu__restore_sig(void __use
 			u64 init_bv;
 
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			os_xrstor(&init_fpstate.xsave, init_bv);
 		}
 
 		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -400,7 +400,7 @@ static void __init setup_init_fpu_buf(vo
 	/*
 	 * Init all the features state with header.xfeatures being 0x0
 	 */
-	copy_kernel_to_xregs_booting(&init_fpstate.xsave);
+	os_xrstor_booting(&init_fpstate.xsave);
 
 	/*
 	 * All components are now in init state. Read the state back so


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 26/65] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (24 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 25/65] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 27/65] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
                   ` (39 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The function names for xsave[s]/xrstor[s] operations are horribly named and
a permanent source of confusion.

Rename:
	copy_xregs_to_user() to xsave_to_user_sigframe()
	copy_user_to_xregs() to xrstor_from_user_sigframe()

so it's entirely clear what this is about. This is also a clear indicator
of the potentially different storage format because this is user ABI and
cannot use compacted format.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/kernel/fpu/signal.c        |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -328,7 +328,7 @@ static inline void os_xrstor(struct xreg
  * backward compatibility for old applications which don't understand
  * compacted format of xsave area.
  */
-static inline int copy_xregs_to_user(struct xregs_state __user *buf)
+static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
 	u64 mask = xfeatures_mask_user();
 	u32 lmask = mask;
@@ -353,7 +353,7 @@ static inline int copy_xregs_to_user(str
 /*
  * Restore xstate from user space xsave area.
  */
-static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
+static inline int xrstor_from_user_sigframe(struct xregs_state __user *buf, u64 mask)
 {
 	struct xregs_state *xstate = ((__force struct xregs_state *)buf);
 	u32 lmask = mask;
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -129,7 +129,7 @@ static inline int copy_fpregs_to_sigfram
 	int err;
 
 	if (use_xsave())
-		err = copy_xregs_to_user(buf);
+		err = xsave_to_user_sigframe(buf);
 	else if (use_fxsr())
 		err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
 	else
@@ -266,7 +266,7 @@ static int copy_user_to_fpregs_zeroing(v
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
-			r = copy_user_to_xregs(buf, xbv);
+			r = xrstor_from_user_sigframe(buf, xbv);
 			if (!r && unlikely(init_bv))
 				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 27/65] x86/fpu: Rename fxregs related copy functions
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (25 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 26/65] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] x86/fpu: Rename fxregs-related " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 28/65] x86/math-emu: Rename frstor() Thomas Gleixner
                   ` (38 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The function names for fxsave/fxrstor operations are horribly named and
a permanent source of confusion.

Rename:
	copy_fxregs_to_kernel() to fxsave()
	copy_kernel_to_fxregs() to fxrstor()
	copy_fxregs_to_user() to fxsave_to_user_sigframe()
	copy_user_to_fxregs() to fxrstor_from_user_sigframe()

so it's clear what these are doing. All these functions are really low
level wrappers around the equaly named instructions, so mapping to the
documentation is just natural.

While at it replace the static_cpu_has(X86_FEATURE_FXSR) with use_fxsr() to
be consistent with the rest of the code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Replace the static_cpu_has() in the code and not just talk about it (Boris)
V3: Rename (Boris)
---
 arch/x86/include/asm/fpu/internal.h |   18 +++++-------------
 arch/x86/kernel/fpu/core.c          |    6 +++---
 arch/x86/kernel/fpu/signal.c        |   10 +++++-----
 3 files changed, 13 insertions(+), 21 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -129,7 +129,7 @@ static inline int copy_fregs_to_user(str
 	return user_insn(fnsave %[fx]; fwait,  [fx] "=m" (*fx), "m" (*fx));
 }
 
-static inline int copy_fxregs_to_user(struct fxregs_state __user *fx)
+static inline int fxsave_to_user_sigframe(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return user_insn(fxsave %[fx], [fx] "=m" (*fx), "m" (*fx));
@@ -138,7 +138,7 @@ static inline int copy_fxregs_to_user(st
 
 }
 
-static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
+static inline void fxrstor(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		kernel_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -146,7 +146,7 @@ static inline void copy_kernel_to_fxregs
 		kernel_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_kernel_to_fxregs_err(struct fxregs_state *fx)
+static inline int fxrstor_safe(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return kernel_insn_err(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -154,7 +154,7 @@ static inline int copy_kernel_to_fxregs_
 		return kernel_insn_err(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)
+static inline int fxrstor_from_user_sigframe(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return user_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -177,14 +177,6 @@ static inline int copy_user_to_fregs(str
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline void copy_fxregs_to_kernel(struct fpu *fpu)
-{
-	if (IS_ENABLED(CONFIG_X86_32))
-		asm volatile( "fxsave %[fx]" : [fx] "=m" (fpu->state.fxsave));
-	else
-		asm volatile("fxsaveq %[fx]" : [fx] "=m" (fpu->state.fxsave));
-}
-
 static inline void fxsave(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
@@ -391,7 +383,7 @@ static inline void __copy_kernel_to_fpre
 		os_xrstor(&fpstate->xsave, mask);
 	} else {
 		if (use_fxsr())
-			copy_kernel_to_fxregs(&fpstate->fxsave);
+			fxrstor(&fpstate->fxsave);
 		else
 			copy_kernel_to_fregs(&fpstate->fsave);
 	}
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -107,7 +107,7 @@ int copy_fpregs_to_fpstate(struct fpu *f
 	}
 
 	if (likely(use_fxsr())) {
-		copy_fxregs_to_kernel(fpu);
+		fxsave(&fpu->state.fxsave);
 		return 1;
 	}
 
@@ -315,8 +315,8 @@ static inline void copy_init_fpstate_to_
 {
 	if (use_xsave())
 		os_xrstor(&init_fpstate.xsave, features_mask);
-	else if (static_cpu_has(X86_FEATURE_FXSR))
-		copy_kernel_to_fxregs(&init_fpstate.fxsave);
+	else if (use_fxsr())
+		fxrstor(&init_fpstate.fxsave);
 	else
 		copy_kernel_to_fregs(&init_fpstate.fsave);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -64,7 +64,7 @@ static inline int save_fsave_header(stru
 
 		fpregs_lock();
 		if (!test_thread_flag(TIF_NEED_FPU_LOAD))
-			copy_fxregs_to_kernel(&tsk->thread.fpu);
+			fxsave(&tsk->thread.fpu.state.fxsave);
 		fpregs_unlock();
 
 		convert_from_fxsr(&env, tsk);
@@ -131,7 +131,7 @@ static inline int copy_fpregs_to_sigfram
 	if (use_xsave())
 		err = xsave_to_user_sigframe(buf);
 	else if (use_fxsr())
-		err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
+		err = fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
 	else
 		err = copy_fregs_to_user((struct fregs_state __user *) buf);
 
@@ -259,7 +259,7 @@ static int copy_user_to_fpregs_zeroing(v
 		if (fx_only) {
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
 
-			r = copy_user_to_fxregs(buf);
+			r = fxrstor_from_user_sigframe(buf);
 			if (!r)
 				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;
@@ -272,7 +272,7 @@ static int copy_user_to_fpregs_zeroing(v
 			return r;
 		}
 	} else if (use_fxsr()) {
-		return copy_user_to_fxregs(buf);
+		return fxrstor_from_user_sigframe(buf);
 	} else
 		return copy_user_to_fregs(buf);
 }
@@ -458,7 +458,7 @@ static int __fpu__restore_sig(void __use
 			os_xrstor(&init_fpstate.xsave, init_bv);
 		}
 
-		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
+		ret = fxrstor_safe(&fpu->state.fxsave);
 	} else {
 		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 28/65] x86/math-emu: Rename frstor()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (26 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 27/65] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 29/65] x86/fpu: Rename fregs related copy functions Thomas Gleixner
                   ` (37 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

This is in the way of renaming the low level hardware accessors to match
the instruction name. Prepend it with FPU_ which is consistent vs. the
rest of the emulation code.

No functional change.

Reported-by Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Bring back the lost fix.
---
 arch/x86/math-emu/fpu_proto.h  |    2 +-
 arch/x86/math-emu/load_store.c |    2 +-
 arch/x86/math-emu/reg_ld_str.c |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/math-emu/fpu_proto.h
+++ b/arch/x86/math-emu/fpu_proto.h
@@ -144,7 +144,7 @@ extern int FPU_store_int16(FPU_REG *st0_
 extern int FPU_store_bcd(FPU_REG *st0_ptr, u_char st0_tag, u_char __user *d);
 extern int FPU_round_to_int(FPU_REG *r, u_char tag);
 extern u_char __user *fldenv(fpu_addr_modes addr_modes, u_char __user *s);
-extern void frstor(fpu_addr_modes addr_modes, u_char __user *data_address);
+extern void FPU_frstor(fpu_addr_modes addr_modes, u_char __user *data_address);
 extern u_char __user *fstenv(fpu_addr_modes addr_modes, u_char __user *d);
 extern void fsave(fpu_addr_modes addr_modes, u_char __user *data_address);
 extern int FPU_tagof(FPU_REG *ptr);
--- a/arch/x86/math-emu/load_store.c
+++ b/arch/x86/math-emu/load_store.c
@@ -240,7 +240,7 @@ int FPU_load_store(u_char type, fpu_addr
 		   fix-up operations. */
 		return 1;
 	case 022:		/* frstor m94/108byte */
-		frstor(addr_modes, (u_char __user *) data_address);
+		FPU_frstor(addr_modes, (u_char __user *) data_address);
 		/* Ensure that the values just loaded are not changed by
 		   fix-up operations. */
 		return 1;
--- a/arch/x86/math-emu/reg_ld_str.c
+++ b/arch/x86/math-emu/reg_ld_str.c
@@ -1117,7 +1117,7 @@ u_char __user *fldenv(fpu_addr_modes add
 	return s;
 }
 
-void frstor(fpu_addr_modes addr_modes, u_char __user *data_address)
+void FPU_frstor(fpu_addr_modes addr_modes, u_char __user *data_address)
 {
 	int i, regnr;
 	u_char __user *s = fldenv(addr_modes, data_address);


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 29/65] x86/fpu: Rename fregs related copy functions
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (27 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 28/65] x86/math-emu: Rename frstor() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] x86/fpu: Rename fregs-related " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 30/65] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
                   ` (36 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The function names for fnsave/fnrstor operations are horribly named and
a permanent source of confusion.

Rename:
	copy_kernel_to_fregs() to fnrstor()
	copy_fregs_to_user()   to fnsave_to_user_sigframe()
	copy_user_to_fregs()   to frstor_from_user_sigframe()

so it's clear what these are doing. All these functions are really low
level wrappers around the equaly named instructions, so mapping to the
documentation is just natural.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Rename (Boris)
---
 arch/x86/include/asm/fpu/internal.h |   10 +++++-----
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |    6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -124,7 +124,7 @@ static inline void fpstate_init_soft(str
 		     _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_fprestore)	\
 		     : output : input)
 
-static inline int copy_fregs_to_user(struct fregs_state __user *fx)
+static inline int fnsave_to_user_sigframe(struct fregs_state __user *fx)
 {
 	return user_insn(fnsave %[fx]; fwait,  [fx] "=m" (*fx), "m" (*fx));
 }
@@ -162,17 +162,17 @@ static inline int fxrstor_from_user_sigf
 		return user_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline void copy_kernel_to_fregs(struct fregs_state *fx)
+static inline void frstor(struct fregs_state *fx)
 {
 	kernel_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_kernel_to_fregs_err(struct fregs_state *fx)
+static inline int frstor_safe(struct fregs_state *fx)
 {
 	return kernel_insn_err(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_user_to_fregs(struct fregs_state __user *fx)
+static inline int frstor_from_user_sigframe(struct fregs_state __user *fx)
 {
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
@@ -385,7 +385,7 @@ static inline void __copy_kernel_to_fpre
 		if (use_fxsr())
 			fxrstor(&fpstate->fxsave);
 		else
-			copy_kernel_to_fregs(&fpstate->fsave);
+			frstor(&fpstate->fsave);
 	}
 }
 
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -318,7 +318,7 @@ static inline void copy_init_fpstate_to_
 	else if (use_fxsr())
 		fxrstor(&init_fpstate.fxsave);
 	else
-		copy_kernel_to_fregs(&init_fpstate.fsave);
+		frstor(&init_fpstate.fsave);
 
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -133,7 +133,7 @@ static inline int copy_fpregs_to_sigfram
 	else if (use_fxsr())
 		err = fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
 	else
-		err = copy_fregs_to_user((struct fregs_state __user *) buf);
+		err = fnsave_to_user_sigframe((struct fregs_state __user *) buf);
 
 	if (unlikely(err) && __clear_user(buf, fpu_user_xstate_size))
 		err = -EFAULT;
@@ -274,7 +274,7 @@ static int copy_user_to_fpregs_zeroing(v
 	} else if (use_fxsr()) {
 		return fxrstor_from_user_sigframe(buf);
 	} else
-		return copy_user_to_fregs(buf);
+		return frstor_from_user_sigframe(buf);
 }
 
 static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
@@ -465,7 +465,7 @@ static int __fpu__restore_sig(void __use
 			goto out;
 
 		fpregs_lock();
-		ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
+		ret = frstor_safe(&fpu->state.fsave);
 	}
 	if (!ret)
 		fpregs_mark_activate();


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 30/65] x86/fpu: Rename xstate copy functions which are related to UABI
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (28 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 29/65] x86/fpu: Rename fregs related copy functions Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 31/65] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
                   ` (35 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Rename them to reflect that these functions deal with user space format
XSAVE buffers.

      copy_kernel_to_xstate() -> copy_uabi_from_kernel_to_xstate()
      copy_user_to_xstate()   -> copy_sigframe_from_user_to_xstate()

Again a clear statement that these functions deal with user space ABI.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/xstate.h |    4 ++--
 arch/x86/kernel/fpu/regset.c      |    2 +-
 arch/x86/kernel/fpu/signal.c      |    2 +-
 arch/x86/kernel/fpu/xstate.c      |    5 +++--
 4 files changed, 7 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -102,8 +102,8 @@ extern void __init update_regset_xstate_
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
+int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -167,7 +167,7 @@ int xstateregs_set(struct task_struct *t
 	}
 
 	fpu_force_restore(fpu);
-	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
+	ret = copy_uabi_from_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
 out:
 	vfree(tmpbuf);
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -422,7 +422,7 @@ static int __fpu__restore_sig(void __use
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
 
-		ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
+		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
 			goto out;
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1099,7 +1099,7 @@ static inline bool mxcsr_valid(struct xs
  * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
  */
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 {
 	unsigned int offset, size;
 	int i;
@@ -1154,7 +1154,8 @@ int copy_kernel_to_xstate(struct xregs_s
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
  */
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
+int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
+				      const void __user *ubuf)
 {
 	unsigned int offset, size;
 	int i;


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 31/65] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (29 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 30/65] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:01 ` [patch V4 32/65] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
                   ` (34 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

copy_uabi_from_user_to_xstate() and copy_uabi_from_kernel_to_xstate() are
almost identical except for the copy function.

Unify them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Fixed MXCSR thinkos and simplified the handling of MXCSR
---
 arch/x86/kernel/fpu/xstate.c |  137 ++++++++++++++-----------------------------
 1 file changed, 47 insertions(+), 90 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -953,23 +953,6 @@ int arch_set_user_pkey_access(struct tas
 }
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
-/*
- * Weird legacy quirk: SSE and YMM states store information in the
- * MXCSR and MXCSR_FLAGS fields of the FP area. That means if the FP
- * area is marked as unused in the xfeatures header, we need to copy
- * MXCSR and MXCSR_FLAGS if either SSE or YMM are in use.
- */
-static inline bool xfeatures_mxcsr_quirk(u64 xfeatures)
-{
-	if (!(xfeatures & (XFEATURE_MASK_SSE|XFEATURE_MASK_YMM)))
-		return false;
-
-	if (xfeatures & XFEATURE_MASK_FP)
-		return false;
-
-	return true;
-}
-
 static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
 			 void *init_xstate, unsigned int size)
 {
@@ -1082,39 +1065,53 @@ void copy_xstate_to_uabi_buf(struct memb
 		membuf_zero(&to, to.left);
 }
 
-static inline bool mxcsr_valid(struct xstate_header *hdr, const u32 *mxcsr)
+static int copy_from_buffer(void *dst, unsigned int offset, unsigned int size,
+			    const void *kbuf, const void __user *ubuf)
 {
-	u64 mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
-
-	/* Only check if it is in use */
-	if (hdr->xfeatures & mask) {
-		/* Reserved bits in MXCSR must be zero. */
-		if (*mxcsr & ~mxcsr_feature_mask)
-			return false;
+	if (kbuf) {
+		memcpy(dst, kbuf + offset, size);
+	} else {
+		if (copy_from_user(dst, ubuf + offset, size))
+			return -EFAULT;
 	}
-	return true;
+	return 0;
 }
 
-/*
- * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
- * and copy to the target thread. This is called from xstateregs_set().
- */
-int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+
+static int copy_uabi_to_xstate(struct xregs_state *xsave, const void *kbuf,
+			       const void __user *ubuf)
 {
 	unsigned int offset, size;
-	int i;
 	struct xstate_header hdr;
+	u64 mask;
+	int i;
 
 	offset = offsetof(struct xregs_state, header);
-	size = sizeof(hdr);
-
-	memcpy(&hdr, kbuf + offset, size);
+	if (copy_from_buffer(&hdr, offset, sizeof(hdr), kbuf, ubuf))
+		return -EFAULT;
 
 	if (validate_user_xstate_header(&hdr))
 		return -EINVAL;
 
-	if (!mxcsr_valid(&hdr, kbuf + offsetof(struct fxregs_state, mxcsr)))
-		return -EINVAL;
+	/* Validate MXCSR when any of the related features is in use */
+	mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
+	if (hdr.xfeatures & mask) {
+		u32 mxcsr[2];
+
+		offset = offsetof(struct fxregs_state, mxcsr);
+		if (copy_from_buffer(mxcsr, offset, sizeof(mxcsr), kbuf, ubuf))
+			return -EFAULT;
+
+		/* Reserved bits in MXCSR must be zero. */
+		if (mxcsr[0] & ~mxcsr_feature_mask)
+			return -EINVAL;
+
+		/* SSE and YMM require MXCSR even when FP is not in use. */
+		if (!(hdr.xfeatures & XFEATURE_MASK_FP)) {
+			xsave->i387.mxcsr = mxcsr[0];
+			xsave->i387.mxcsr_mask = mxcsr[1];
+		}
+	}
 
 	for (i = 0; i < XFEATURE_MAX; i++) {
 		u64 mask = ((u64)1 << i);
@@ -1125,16 +1122,11 @@ int copy_uabi_from_kernel_to_xstate(stru
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
 
-			memcpy(dst, kbuf + offset, size);
+			if (copy_from_buffer(dst, offset, size, kbuf, ubuf))
+				return -EFAULT;
 		}
 	}
 
-	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
-		offset = offsetof(struct fxregs_state, mxcsr);
-		size = MXCSR_AND_FLAGS_SIZE;
-		memcpy(&xsave->i387.mxcsr, kbuf + offset, size);
-	}
-
 	/*
 	 * The state that came in from userspace was user-state only.
 	 * Mask all the user states out of 'xfeatures':
@@ -1150,6 +1142,16 @@ int copy_uabi_from_kernel_to_xstate(stru
 }
 
 /*
+ * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S]
+ * format and copy to the target thread. This is called from
+ * xstateregs_set().
+ */
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+{
+	return copy_uabi_to_xstate(xsave, kbuf, NULL);
+}
+
+/*
  * Convert from a sigreturn standard-format user-space buffer to kernel
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
@@ -1157,52 +1159,7 @@ int copy_uabi_from_kernel_to_xstate(stru
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
 				      const void __user *ubuf)
 {
-	unsigned int offset, size;
-	int i;
-	struct xstate_header hdr;
-
-	offset = offsetof(struct xregs_state, header);
-	size = sizeof(hdr);
-
-	if (copy_from_user(&hdr, ubuf + offset, size))
-		return -EFAULT;
-
-	if (validate_user_xstate_header(&hdr))
-		return -EINVAL;
-
-	for (i = 0; i < XFEATURE_MAX; i++) {
-		u64 mask = ((u64)1 << i);
-
-		if (hdr.xfeatures & mask) {
-			void *dst = __raw_xsave_addr(xsave, i);
-
-			offset = xstate_offsets[i];
-			size = xstate_sizes[i];
-
-			if (copy_from_user(dst, ubuf + offset, size))
-				return -EFAULT;
-		}
-	}
-
-	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
-		offset = offsetof(struct fxregs_state, mxcsr);
-		size = MXCSR_AND_FLAGS_SIZE;
-		if (copy_from_user(&xsave->i387.mxcsr, ubuf + offset, size))
-			return -EFAULT;
-	}
-
-	/*
-	 * The state that came in from userspace was user-state only.
-	 * Mask all the user states out of 'xfeatures':
-	 */
-	xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR_ALL;
-
-	/*
-	 * Add back in the features that came in from userspace:
-	 */
-	xsave->header.xfeatures |= hdr.xfeatures;
-
-	return 0;
+	return copy_uabi_to_xstate(xsave, NULL, ubuf);
 }
 
 /**


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 32/65] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (30 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 31/65] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
@ 2021-06-23 12:01 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 33/65] x86/fpu: Get rid of the FNSAVE optimization Thomas Gleixner
                   ` (33 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:01 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

A copy is guaranteed to leave the source intact, which is not the case when
FNSAVE is used as that reinitilizes the registers.

Save does not make such guarantees and it matches what this is about,
i.e. to save the state for a later restore.

Rename it to save_fpregs_to_fpstate(). 

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/kernel/fpu/core.c          |   10 +++++-----
 arch/x86/kvm/x86.c                  |    2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -375,7 +375,7 @@ static inline int os_xrstor_safe(struct
 	return err;
 }
 
-extern int copy_fpregs_to_fpstate(struct fpu *fpu);
+extern int save_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
@@ -507,7 +507,7 @@ static inline void __fpregs_load_activat
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!copy_fpregs_to_fpstate(old_fpu))
+		if (!save_fpregs_to_fpstate(old_fpu))
 			old_fpu->last_cpu = -1;
 		else
 			old_fpu->last_cpu = cpu;
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -92,7 +92,7 @@ EXPORT_SYMBOL(irq_fpu_usable);
  * Modern FPU state can be kept in registers, if there are
  * no pending FP exceptions.
  */
-int copy_fpregs_to_fpstate(struct fpu *fpu)
+int save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		os_xsave(&fpu->state.xsave);
@@ -119,7 +119,7 @@ int copy_fpregs_to_fpstate(struct fpu *f
 
 	return 0;
 }
-EXPORT_SYMBOL(copy_fpregs_to_fpstate);
+EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
@@ -137,7 +137,7 @@ void kernel_fpu_begin_mask(unsigned int
 		 * Ignore return value -- we don't care if reg state
 		 * is clobbered.
 		 */
-		copy_fpregs_to_fpstate(&current->thread.fpu);
+		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
 
@@ -172,7 +172,7 @@ void fpu__save(struct fpu *fpu)
 	trace_x86_fpu_before_save(fpu);
 
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!copy_fpregs_to_fpstate(fpu)) {
+		if (!save_fpregs_to_fpstate(fpu)) {
 			copy_kernel_to_fpregs(&fpu->state);
 		}
 	}
@@ -255,7 +255,7 @@ int fpu__copy(struct task_struct *dst, s
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
-	else if (!copy_fpregs_to_fpstate(dst_fpu))
+	else if (!save_fpregs_to_fpstate(dst_fpu))
 		copy_kernel_to_fpregs(&dst_fpu->state);
 
 	fpregs_unlock();
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9618,7 +9618,7 @@ static void kvm_save_current_fpu(struct
 		memcpy(&fpu->state, &current->thread.fpu.state,
 		       fpu_kernel_xstate_size);
 	else
-		copy_fpregs_to_fpstate(fpu);
+		save_fpregs_to_fpstate(fpu);
 }
 
 /* Swap (qemu) user FPU context for the guest FPU context. */


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 33/65] x86/fpu: Get rid of the FNSAVE optimization
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (31 preceding siblings ...)
  2021-06-23 12:01 ` [patch V4 32/65] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 34/65] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_fpstate() Thomas Gleixner
                   ` (32 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The FNSAVE support requires conditionals in quite some call paths because
FNSAVE reinitializes the FPU hardware. If the save has to preserve the FPU
register state then the caller has to conditionally restore it from memory
when FNSAVE is in use.

This also requires a conditional in context switch because the restore
avoidance optimization cannot work with FNSAVE. As this only affects 20+
years old CPUs there is really no reason to keep this optimization
effective for FNSAVE. It's about time to not optimize for antiques anymore.

Just unconditionally FRSTOR the save content to the registers and clean up
the conditionals all over the place.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: New patch
---
 arch/x86/include/asm/fpu/internal.h |   18 +++++++-----
 arch/x86/kernel/fpu/core.c          |   54 +++++++++++++++---------------------
 2 files changed, 34 insertions(+), 38 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -83,6 +83,7 @@ extern void fpstate_init_soft(struct swr
 #else
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
+extern void save_fpregs_to_fpstate(struct fpu *fpu);
 
 #define user_insn(insn, output, input...)				\
 ({									\
@@ -375,8 +376,6 @@ static inline int os_xrstor_safe(struct
 	return err;
 }
 
-extern int save_fpregs_to_fpstate(struct fpu *fpu);
-
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
@@ -507,12 +506,17 @@ static inline void __fpregs_load_activat
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!save_fpregs_to_fpstate(old_fpu))
-			old_fpu->last_cpu = -1;
-		else
-			old_fpu->last_cpu = cpu;
+		save_fpregs_to_fpstate(old_fpu);
+		/*
+		 * The save operation preserved register state, so the
+		 * fpu_fpregs_owner_ctx is still @old_fpu. Store the
+		 * current CPU number in @old_fpu, so the next return
+		 * to user space can avoid the FPU register restore
+		 * when is returns on the same CPU and still owns the
+		 * context.
+		 */
+		old_fpu->last_cpu = cpu;
 
-		/* But leave fpu_fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
 	}
 }
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -83,16 +83,20 @@ bool irq_fpu_usable(void)
 EXPORT_SYMBOL(irq_fpu_usable);
 
 /*
- * These must be called with preempt disabled. Returns
- * 'true' if the FPU state is still intact and we can
- * keep registers active.
+ * Save the FPU register state in fpu->state. The register state is
+ * preserved.
  *
- * The legacy FNSAVE instruction cleared all FPU state
- * unconditionally, so registers are essentially destroyed.
- * Modern FPU state can be kept in registers, if there are
- * no pending FP exceptions.
+ * Must be called with fpregs_lock() held.
+ *
+ * The legacy FNSAVE instruction clears all FPU state unconditionally, so
+ * register state has to be reloaded. That might be a pointless exercise
+ * when the FPU is going to be used by another task right after that. But
+ * this only affects 20+ years old 32bit systems and avoids conditionals all
+ * over the place.
+ *
+ * FXSAVE and all XSAVE variants preserve the FPU register state.
  */
-int save_fpregs_to_fpstate(struct fpu *fpu)
+void save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		os_xsave(&fpu->state.xsave);
@@ -103,21 +107,20 @@ int save_fpregs_to_fpstate(struct fpu *f
 		 */
 		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
 			fpu->avx512_timestamp = jiffies;
-		return 1;
+		return;
 	}
 
 	if (likely(use_fxsr())) {
 		fxsave(&fpu->state.fxsave);
-		return 1;
+		return;
 	}
 
 	/*
 	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
-	 * so we have to mark them inactive:
+	 * so we have to reload them from the memory state.
 	 */
 	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
-
-	return 0;
+	frstor(&fpu->state.fsave);
 }
 EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
@@ -133,10 +136,6 @@ void kernel_fpu_begin_mask(unsigned int
 	if (!(current->flags & PF_KTHREAD) &&
 	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		set_thread_flag(TIF_NEED_FPU_LOAD);
-		/*
-		 * Ignore return value -- we don't care if reg state
-		 * is clobbered.
-		 */
 		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
@@ -171,11 +170,8 @@ void fpu__save(struct fpu *fpu)
 	fpregs_lock();
 	trace_x86_fpu_before_save(fpu);
 
-	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!save_fpregs_to_fpstate(fpu)) {
-			copy_kernel_to_fpregs(&fpu->state);
-		}
-	}
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		save_fpregs_to_fpstate(fpu);
 
 	trace_x86_fpu_after_save(fpu);
 	fpregs_unlock();
@@ -244,20 +240,16 @@ int fpu__copy(struct task_struct *dst, s
 	memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_size);
 
 	/*
-	 * If the FPU registers are not current just memcpy() the state.
-	 * Otherwise save current FPU registers directly into the child's FPU
-	 * context, without any memory-to-memory copying.
-	 *
-	 * ( The function 'fails' in the FNSAVE case, which destroys
-	 *   register contents so we have to load them back. )
+	 * If the FPU registers are not owned by current just memcpy() the
+	 * state.  Otherwise save the FPU registers directly into the
+	 * child's FPU context, without any memory-to-memory copying.
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
-	else if (!save_fpregs_to_fpstate(dst_fpu))
-		copy_kernel_to_fpregs(&dst_fpu->state);
-
+	else
+		save_fpregs_to_fpstate(dst_fpu);
 	fpregs_unlock();
 
 	set_tsk_thread_flag(dst, TIF_NEED_FPU_LOAD);


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 34/65] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_fpstate()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (32 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 33/65] x86/fpu: Get rid of the FNSAVE optimization Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 35/65] x86/fpu: Rename initstate copy functions Thomas Gleixner
                   ` (31 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

This is not a copy functionality. It restores the register state from the
supplied kernel buffer.

No functional changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    8 ++++----
 arch/x86/kvm/x86.c                  |    4 ++--
 arch/x86/mm/extable.c               |    2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -376,7 +376,7 @@ static inline int os_xrstor_safe(struct
 	return err;
 }
 
-static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
+static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
 		os_xrstor(&fpstate->xsave, mask);
@@ -388,7 +388,7 @@ static inline void __copy_kernel_to_fpre
 	}
 }
 
-static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate)
+static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
 	/*
 	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
@@ -403,7 +403,7 @@ static inline void copy_kernel_to_fpregs
 			: : [addr] "m" (fpstate));
 	}
 
-	__copy_kernel_to_fpregs(fpstate, -1);
+	__restore_fpregs_from_fpstate(fpstate, -1);
 }
 
 extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
@@ -474,7 +474,7 @@ static inline void __fpregs_load_activat
 		return;
 
 	if (!fpregs_state_valid(fpu, cpu)) {
-		copy_kernel_to_fpregs(&fpu->state);
+		restore_fpregs_from_fpstate(&fpu->state);
 		fpregs_activate(fpu);
 		fpu->last_cpu = cpu;
 	}
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9634,7 +9634,7 @@ static void kvm_load_guest_fpu(struct kv
 	 */
 	if (vcpu->arch.guest_fpu)
 		/* PKRU is separately restored in kvm_x86_ops.run. */
-		__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu->state,
+		__restore_fpregs_from_fpstate(&vcpu->arch.guest_fpu->state,
 					~XFEATURE_MASK_PKRU);
 
 	fpregs_mark_activate();
@@ -9655,7 +9655,7 @@ static void kvm_put_guest_fpu(struct kvm
 	if (vcpu->arch.guest_fpu)
 		kvm_save_current_fpu(vcpu->arch.guest_fpu);
 
-	copy_kernel_to_fpregs(&vcpu->arch.user_fpu->state);
+	restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state);
 
 	fpregs_mark_activate();
 	fpregs_unlock();
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -65,7 +65,7 @@ EXPORT_SYMBOL_GPL(ex_handler_fault);
 	WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.",
 		  (void *)instruction_pointer(regs));
 
-	__copy_kernel_to_fpregs(&init_fpstate, -1);
+	__restore_fpregs_from_fpstate(&init_fpstate, -1);
 	return true;
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 35/65] x86/fpu: Rename initstate copy functions
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (33 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 34/65] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_fpstate() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 36/65] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
                   ` (30 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Again this not a copy. It's restoring register state from kernel memory.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/core.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -303,7 +303,7 @@ void fpu__drop(struct fpu *fpu)
  * Clear FPU registers by setting them up from the init fpstate.
  * Caller must do fpregs_[un]lock() around it.
  */
-static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
+static inline void restore_fpregs_from_init_fpstate(u64 features_mask)
 {
 	if (use_xsave())
 		os_xrstor(&init_fpstate.xsave, features_mask);
@@ -338,9 +338,9 @@ static void fpu__clear(struct fpu *fpu,
 		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
 		    xfeatures_mask_supervisor())
 			os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
-		copy_init_fpstate_to_fpregs(xfeatures_mask_user());
+		restore_fpregs_from_init_fpstate(xfeatures_mask_user());
 	} else {
-		copy_init_fpstate_to_fpregs(xfeatures_mask_all);
+		restore_fpregs_from_init_fpstate(xfeatures_mask_all);
 	}
 
 	fpregs_mark_activate();


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 36/65] x86/fpu: Rename "dynamic" XSTATEs to "independent"
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (34 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 35/65] x86/fpu: Rename initstate copy functions Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 37/65] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
                   ` (29 subsequent siblings)
  65 siblings, 0 replies; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Andy Lutomirski <luto@kernel.org>

The salient feature of "dynamic" XSTATEs is that they are not part of the
main task XSTATE buffer.  The fact that they are dynamically allocated is
irrelevant and will become quite confusing when user math XSTATEs start
being dynamically allocated.  Rename them to "independent" because they
are independent of the main XSTATE code.

This is just a search-and-replace with some whitespace updates to keep
things aligned.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1eecb0e4f3e07828ebe5d737ec77dc3b708fad2d.1623388344.git.luto@kernel.org
---
V4: Fix the comments
---
 arch/x86/events/intel/lbr.c       |    6 +--
 arch/x86/include/asm/fpu/xstate.h |   22 ++++++-------
 arch/x86/kernel/fpu/xstate.c      |   62 +++++++++++++++++++-------------------
 3 files changed, 45 insertions(+), 45 deletions(-)

--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -491,7 +491,7 @@ static void intel_pmu_arch_lbr_xrstors(v
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_kernel_to_dynamic_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	copy_kernel_to_independent_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static __always_inline bool lbr_is_reset_in_cstate(void *ctx)
@@ -576,7 +576,7 @@ static void intel_pmu_arch_lbr_xsaves(vo
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_dynamic_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	copy_independent_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static void __intel_pmu_lbr_save(void *ctx)
@@ -992,7 +992,7 @@ static void intel_pmu_arch_lbr_read_xsav
 		intel_pmu_store_lbr(cpuc, NULL);
 		return;
 	}
-	copy_dynamic_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
+	copy_independent_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
 
 	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
 }
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -42,21 +42,21 @@
  * and its size may be huge. Saving/restoring such supervisor state components
  * at each context switch can cause high CPU and space overhead, which should
  * be avoided. Such supervisor state components should only be saved/restored
- * on demand. The on-demand dynamic supervisor features are set in this mask.
+ * on demand. The on-demand supervisor features are set in this mask.
  *
- * Unlike the existing supported supervisor features, a dynamic supervisor
+ * Unlike the existing supported supervisor features, a independent supervisor
  * feature does not allocate a buffer in task->fpu, and the corresponding
  * supervisor state component cannot be saved/restored at each context switch.
  *
- * To support a dynamic supervisor feature, a developer should follow the
+ * To support a independent supervisor feature, a developer should follow the
  * dos and don'ts as below:
  * - Do dynamically allocate a buffer for the supervisor state component.
  * - Do manually invoke the XSAVES/XRSTORS instruction to save/restore the
  *   state component to/from the buffer.
- * - Don't set the bit corresponding to the dynamic supervisor feature in
+ * - Don't set the bit corresponding to the independent supervisor feature in
  *   IA32_XSS at run time, since it has been set at boot time.
  */
-#define XFEATURE_MASK_DYNAMIC (XFEATURE_MASK_LBR)
+#define XFEATURE_MASK_INDEPENDENT (XFEATURE_MASK_LBR)
 
 /*
  * Unsupported supervisor features. When a supervisor feature in this mask is
@@ -66,7 +66,7 @@
 
 /* All supervisor states including supported and unsupported states. */
 #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
-				      XFEATURE_MASK_DYNAMIC | \
+				      XFEATURE_MASK_INDEPENDENT | \
 				      XFEATURE_MASK_SUPERVISOR_UNSUPPORTED)
 
 #ifdef CONFIG_X86_64
@@ -87,12 +87,12 @@ static inline u64 xfeatures_mask_user(vo
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
 
-static inline u64 xfeatures_mask_dynamic(void)
+static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
-		return XFEATURE_MASK_DYNAMIC & ~XFEATURE_MASK_LBR;
+		return XFEATURE_MASK_INDEPENDENT & ~XFEATURE_MASK_LBR;
 
-	return XFEATURE_MASK_DYNAMIC;
+	return XFEATURE_MASK_INDEPENDENT;
 }
 
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
@@ -104,8 +104,8 @@ void *get_xsave_addr(struct xregs_state
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
-void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
+void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
+void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask);
 
 enum xstate_copy_mode {
 	XSTATE_COPY_FP,
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -151,7 +151,7 @@ void fpu__init_cpu_xstate(void)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
 		wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor() |
-				     xfeatures_mask_dynamic());
+				     xfeatures_mask_independent());
 	}
 }
 
@@ -551,7 +551,7 @@ static void check_xstate_against_struct(
  * how large the XSAVE buffer needs to be.  We are recalculating
  * it to be safe.
  *
- * Dynamic XSAVE features allocate their own buffers and are not
+ * Independent XSAVE features allocate their own buffers and are not
  * covered by these checks. Only the size of the buffer for task->fpu
  * is checked here.
  */
@@ -617,18 +617,18 @@ static unsigned int __init get_xsaves_si
 }
 
 /*
- * Get the total size of the enabled xstates without the dynamic supervisor
+ * Get the total size of the enabled xstates without the independent supervisor
  * features.
  */
-static unsigned int __init get_xsaves_size_no_dynamic(void)
+static unsigned int __init get_xsaves_size_no_independent(void)
 {
-	u64 mask = xfeatures_mask_dynamic();
+	u64 mask = xfeatures_mask_independent();
 	unsigned int size;
 
 	if (!mask)
 		return get_xsaves_size();
 
-	/* Disable dynamic features. */
+	/* Disable independent features. */
 	wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor());
 
 	/*
@@ -637,7 +637,7 @@ static unsigned int __init get_xsaves_si
 	 */
 	size = get_xsaves_size();
 
-	/* Re-enable dynamic features so XSAVES will work on them again. */
+	/* Re-enable independent features so XSAVES will work on them again. */
 	wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor() | mask);
 
 	return size;
@@ -680,7 +680,7 @@ static int __init init_xstate_size(void)
 	xsave_size = get_xsave_size();
 
 	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		possible_xstate_size = get_xsaves_size_no_dynamic();
+		possible_xstate_size = get_xsaves_size_no_independent();
 	else
 		possible_xstate_size = xsave_size;
 
@@ -837,7 +837,7 @@ void fpu__resume_cpu(void)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
 		wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor()  |
-				     xfeatures_mask_dynamic());
+				     xfeatures_mask_independent());
 	}
 }
 
@@ -1163,34 +1163,34 @@ int copy_sigframe_from_user_to_xstate(st
 }
 
 /**
- * copy_dynamic_supervisor_to_kernel() - Save dynamic supervisor states to
- *                                       an xsave area
+ * copy_independent_supervisor_to_kernel() - Save independent supervisor states to
+ *                                           an xsave area
  * @xstate: A pointer to an xsave area
- * @mask: Represent the dynamic supervisor features saved into the xsave area
+ * @mask: Represent the independent supervisor features saved into the xsave area
  *
- * Only the dynamic supervisor states sets in the mask are saved into the xsave
- * area (See the comment in XFEATURE_MASK_DYNAMIC for the details of dynamic
- * supervisor feature). Besides the dynamic supervisor states, the legacy
+ * Only the independent supervisor states sets in the mask are saved into the xsave
+ * area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of independent
+ * supervisor feature). Besides the independent supervisor states, the legacy
  * region and XSAVE header are also saved into the xsave area. The supervisor
  * features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
  * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not saved.
  *
  * The xsave area must be 64-bytes aligned.
  */
-void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
+void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
 {
-	u64 dynamic_mask = xfeatures_mask_dynamic() & mask;
+	u64 independent_mask = xfeatures_mask_independent() & mask;
 	u32 lmask, hmask;
 	int err;
 
 	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
 		return;
 
-	if (WARN_ON_FPU(!dynamic_mask))
+	if (WARN_ON_FPU(!independent_mask))
 		return;
 
-	lmask = dynamic_mask;
-	hmask = dynamic_mask >> 32;
+	lmask = independent_mask;
+	hmask = independent_mask >> 32;
 
 	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
 
@@ -1199,34 +1199,34 @@ void copy_dynamic_supervisor_to_kernel(s
 }
 
 /**
- * copy_kernel_to_dynamic_supervisor() - Restore dynamic supervisor states from
- *                                       an xsave area
+ * copy_kernel_to_independent_supervisor() - Restore independent supervisor states from
+ *                                           an xsave area
  * @xstate: A pointer to an xsave area
- * @mask: Represent the dynamic supervisor features restored from the xsave area
+ * @mask: Represent the independent supervisor features restored from the xsave area
  *
- * Only the dynamic supervisor states sets in the mask are restored from the
- * xsave area (See the comment in XFEATURE_MASK_DYNAMIC for the details of
- * dynamic supervisor feature). Besides the dynamic supervisor states, the
+ * Only the independent supervisor states sets in the mask are restored from the
+ * xsave area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of
+ * independent supervisor feature). Besides the independent supervisor states, the
  * legacy region and XSAVE header are also restored from the xsave area. The
  * supervisor features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
  * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not restored.
  *
  * The xsave area must be 64-bytes aligned.
  */
-void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask)
+void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask)
 {
-	u64 dynamic_mask = xfeatures_mask_dynamic() & mask;
+	u64 independent_mask = xfeatures_mask_independent() & mask;
 	u32 lmask, hmask;
 	int err;
 
 	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
 		return;
 
-	if (WARN_ON_FPU(!dynamic_mask))
+	if (WARN_ON_FPU(!independent_mask))
 		return;
 
-	lmask = dynamic_mask;
-	hmask = dynamic_mask >> 32;
+	lmask = independent_mask;
+	hmask = independent_mask >> 32;
 
 	XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 37/65] x86/fpu/xstate: Sanitize handling of independent features
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (35 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 36/65] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 38/65] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
                   ` (28 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The copy functions for the independent features are horribly named and the
supervisor and independent part is just overengineered.

The point is that the supplied mask has either to be a subset of the
independent features or a subset of the task->fpu.xstate managed features.

Rewrite it so it checks for invalid overlaps of these areas in the caller
supplied feature mask. Rename it so it follows the new naming convention
for these operations. Mop up the function documentation.

This allows to use that function for other purposes as well.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
---
V4: Split out the duplicated validation code (Boris)
V3: Rename
---
 arch/x86/events/intel/lbr.c       |    6 +-
 arch/x86/include/asm/fpu/xstate.h |    5 +
 arch/x86/kernel/fpu/xstate.c      |   98 ++++++++++++++++++--------------------
 3 files changed, 54 insertions(+), 55 deletions(-)

--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -491,7 +491,7 @@ static void intel_pmu_arch_lbr_xrstors(v
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_kernel_to_independent_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	xrstors(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static __always_inline bool lbr_is_reset_in_cstate(void *ctx)
@@ -576,7 +576,7 @@ static void intel_pmu_arch_lbr_xsaves(vo
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_independent_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	xsaves(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static void __intel_pmu_lbr_save(void *ctx)
@@ -992,7 +992,7 @@ static void intel_pmu_arch_lbr_read_xsav
 		intel_pmu_store_lbr(cpuc, NULL);
 		return;
 	}
-	copy_independent_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
+	xsaves(&xsave->xsave, XFEATURE_MASK_LBR);
 
 	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
 }
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -104,8 +104,9 @@ void *get_xsave_addr(struct xregs_state
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
-void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask);
+
+void xsaves(struct xregs_state *xsave, u64 mask);
+void xrstors(struct xregs_state *xsave, u64 mask);
 
 enum xstate_copy_mode {
 	XSTATE_COPY_FP,
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1162,76 +1162,74 @@ int copy_sigframe_from_user_to_xstate(st
 	return copy_uabi_to_xstate(xsave, NULL, ubuf);
 }
 
+static bool validate_xsaves_xrstors(u64 mask)
+{
+	u64 xchk;
+
+	if (WARN_ON_FPU(!cpu_feature_enabled(X86_FEATURE_XSAVES)))
+		return false;
+	/*
+	 * Validate that this is either a task->fpstate related component
+	 * subset or an independent one.
+	 */
+	if (mask & xfeatures_mask_independent())
+		xchk = ~xfeatures_mask_independent();
+	else
+		xchk = ~xfeatures_mask_all;
+
+	if (WARN_ON_ONCE(!mask || mask & xchk))
+		return false;
+
+	return true;
+}
+
 /**
- * copy_independent_supervisor_to_kernel() - Save independent supervisor states to
- *                                           an xsave area
- * @xstate: A pointer to an xsave area
- * @mask: Represent the independent supervisor features saved into the xsave area
+ * xsaves - Save selected components to a kernel xstate buffer
+ * @xstate:	Pointer to the buffer
+ * @mask:	Feature mask to select the components to save
  *
- * Only the independent supervisor states sets in the mask are saved into the xsave
- * area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of independent
- * supervisor feature). Besides the independent supervisor states, the legacy
- * region and XSAVE header are also saved into the xsave area. The supervisor
- * features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
- * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not saved.
+ * The @xstate buffer must be 64 byte aligned and correctly initialized as
+ * XSAVES does not write the full xstate header. Before first use the
+ * buffer should be zeroed otherwise a consecutive XRSTORS from that buffer
+ * can #GP.
  *
- * The xsave area must be 64-bytes aligned.
+ * The feature mask must either be a subset of the independent features or
+ * a subset of the task->fpstate related features.
  */
-void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
+void xsaves(struct xregs_state *xstate, u64 mask)
 {
-	u64 independent_mask = xfeatures_mask_independent() & mask;
-	u32 lmask, hmask;
 	int err;
 
-	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
-		return;
-
-	if (WARN_ON_FPU(!independent_mask))
+	if (!validate_xsaves_xrstors(mask))
 		return;
 
-	lmask = independent_mask;
-	hmask = independent_mask >> 32;
-
-	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-
-	/* Should never fault when copying to a kernel buffer */
-	WARN_ON_FPU(err);
+	XSTATE_OP(XSAVES, xstate, (u32)mask, (u32)(mask >> 32), err);
+	WARN_ON_ONCE(err);
 }
 
 /**
- * copy_kernel_to_independent_supervisor() - Restore independent supervisor states from
- *                                           an xsave area
- * @xstate: A pointer to an xsave area
- * @mask: Represent the independent supervisor features restored from the xsave area
+ * xrstors - Restore selected components from a kernel xstate buffer
+ * @xstate:	Pointer to the buffer
+ * @mask:	Feature mask to select the components to restore
  *
- * Only the independent supervisor states sets in the mask are restored from the
- * xsave area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of
- * independent supervisor feature). Besides the independent supervisor states, the
- * legacy region and XSAVE header are also restored from the xsave area. The
- * supervisor features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
- * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not restored.
+ * The @xstate buffer must be 64 byte aligned and correctly initialized
+ * otherwise XRSTORS from that buffer can #GP.
  *
- * The xsave area must be 64-bytes aligned.
+ * Proper usage is to restore the state which was saved with
+ * xsaves() into @xstate.
+ *
+ * The feature mask must either be a subset of the independent features or
+ * a subset of the task->fpstate related features.
  */
-void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask)
+void xrstors(struct xregs_state *xstate, u64 mask)
 {
-	u64 independent_mask = xfeatures_mask_independent() & mask;
-	u32 lmask, hmask;
 	int err;
 
-	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
+	if (!validate_xsaves_xrstors(mask))
 		return;
 
-	if (WARN_ON_FPU(!independent_mask))
-		return;
-
-	lmask = independent_mask;
-	hmask = independent_mask >> 32;
-
-	XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
-
-	/* Should never fault when copying from a kernel buffer */
-	WARN_ON_FPU(err);
+	XSTATE_OP(XRSTORS, xstate, (u32)mask, (u32)(mask >> 32), err);
+	WARN_ON_ONCE(err);
 }
 
 #ifdef CONFIG_PROC_PID_ARCH_STATUS


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 38/65] x86/pkeys: Move read_pkru() and write_pkru()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (36 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 37/65] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
  2021-06-23 12:02 ` [patch V4 39/65] x86/fpu: Rename and sanitize fpu__save/copy() Thomas Gleixner
                   ` (27 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Dave Hansen <dave.hansen@linux.intel.com>

write_pkru() was originally used just to write to the PKRU register.  It
was mercifully short and sweet and was not out of place in pgtable.h with
some other pkey-related code.

But, later work included a requirement to also modify the task XSAVE
buffer when updating the register.  This really is more related to the
XSAVE architecture than to paging.

Move the read/write_pkru() to asm/pkru.h.  pgtable.h won't miss them.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/xstate.h |    1 
 arch/x86/include/asm/pgtable.h    |   57 -----------------------------------
 arch/x86/include/asm/pkru.h       |   61 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/process_64.c      |    1 
 arch/x86/kvm/svm/sev.c            |    1 
 arch/x86/kvm/x86.c                |    1 
 arch/x86/mm/pkeys.c               |    1 
 7 files changed, 67 insertions(+), 56 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -6,6 +6,7 @@
 #include <linux/types.h>
 
 #include <asm/processor.h>
+#include <asm/fpu/api.h>
 #include <asm/user.h>
 
 /* Bit 63 of XCR0 is reserved for future expansion */
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,7 +23,7 @@
 
 #ifndef __ASSEMBLY__
 #include <asm/x86_init.h>
-#include <asm/fpu/xstate.h>
+#include <asm/pkru.h>
 #include <asm/fpu/api.h>
 #include <asm-generic/pgtable_uffd.h>
 
@@ -126,35 +126,6 @@ static inline int pte_dirty(pte_t pte)
 	return pte_flags(pte) & _PAGE_DIRTY;
 }
 
-
-static inline u32 read_pkru(void)
-{
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		return rdpkru();
-	return 0;
-}
-
-static inline void write_pkru(u32 pkru)
-{
-	struct pkru_state *pk;
-
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
-		return;
-
-	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
-
-	/*
-	 * The PKRU value in xstate needs to be in sync with the value that is
-	 * written to the CPU. The FPU restore on return to userland would
-	 * otherwise load the previous value again.
-	 */
-	fpregs_lock();
-	if (pk)
-		pk->pkru = pkru;
-	__write_pkru(pkru);
-	fpregs_unlock();
-}
-
 static inline int pte_young(pte_t pte)
 {
 	return pte_flags(pte) & _PAGE_ACCESSED;
@@ -1360,32 +1331,6 @@ static inline pmd_t pmd_swp_clear_uffd_w
 }
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
-#define PKRU_AD_BIT 0x1
-#define PKRU_WD_BIT 0x2
-#define PKRU_BITS_PER_PKEY 2
-
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-extern u32 init_pkru_value;
-#else
-#define init_pkru_value	0
-#endif
-
-static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
-{
-	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
-}
-
-static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
-{
-	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-	/*
-	 * Access-disable disables writes too so we need to check
-	 * both bits here.
-	 */
-	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
-}
-
 static inline u16 pte_flags_pkey(unsigned long pte_flags)
 {
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
--- /dev/null
+++ b/arch/x86/include/asm/pkru.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_PKRU_H
+#define _ASM_X86_PKRU_H
+
+#include <asm/fpu/xstate.h>
+
+#define PKRU_AD_BIT 0x1
+#define PKRU_WD_BIT 0x2
+#define PKRU_BITS_PER_PKEY 2
+
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+extern u32 init_pkru_value;
+#else
+#define init_pkru_value	0
+#endif
+
+static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
+}
+
+static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+	/*
+	 * Access-disable disables writes too so we need to check
+	 * both bits here.
+	 */
+	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
+}
+
+static inline u32 read_pkru(void)
+{
+	if (boot_cpu_has(X86_FEATURE_OSPKE))
+		return rdpkru();
+	return 0;
+}
+
+static inline void write_pkru(u32 pkru)
+{
+	struct pkru_state *pk;
+
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return;
+
+	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
+
+	/*
+	 * The PKRU value in xstate needs to be in sync with the value that is
+	 * written to the CPU. The FPU restore on return to userland would
+	 * otherwise load the previous value again.
+	 */
+	fpregs_lock();
+	if (pk)
+		pk->pkru = pkru;
+	__write_pkru(pkru);
+	fpregs_unlock();
+}
+
+#endif
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -41,6 +41,7 @@
 #include <linux/syscalls.h>
 
 #include <asm/processor.h>
+#include <asm/pkru.h>
 #include <asm/fpu/internal.h>
 #include <asm/mmu_context.h>
 #include <asm/prctl.h>
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -19,6 +19,7 @@
 #include <linux/trace_events.h>
 #include <asm/fpu/internal.h>
 
+#include <asm/pkru.h>
 #include <asm/trapnr.h>
 
 #include "x86.h"
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -65,6 +65,7 @@
 #include <asm/msr.h>
 #include <asm/desc.h>
 #include <asm/mce.h>
+#include <asm/pkru.h>
 #include <linux/kernel_stat.h>
 #include <asm/fpu/internal.h> /* Ugh! */
 #include <asm/pvclock.h>
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,6 +10,7 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
+#include <asm/pkru.h>			/* read/write_pkru()		*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 39/65] x86/fpu: Rename and sanitize fpu__save/copy()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (37 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 38/65] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 40/65] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
                   ` (26 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Both functions are a misnomer.

fpu__save() is actually about synchronizing the hardware register state
into the task's memory state so that either coredump or a math exception
handler can inspect the state at the time where the problem happens.

The function guarantees to preserve the register state, while "save" is a
common terminology for saving the current state so it can be modified and
restored later. This is clearly not the case here.

Rename it to fpu_sync_fpstate().

fpu__copy() is used to clone the current task's FPU state when duplicating
task_struct. While the register state is a copy the rest of the FPU state
is not.

Name it accordingly and remove the really pointless @src argument along
with the warning which comes along with it.

Nothing can ever copy the FPU state of a non-current task. It's clearly
just a consequence of arch_dup_task_struct(), but it makes no sense to
proliferate that further.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: New patch
---
 arch/x86/include/asm/fpu/internal.h |    6 ++++--
 arch/x86/kernel/fpu/core.c          |   17 ++++++++---------
 arch/x86/kernel/fpu/regset.c        |    2 +-
 arch/x86/kernel/process.c           |    3 +--
 arch/x86/kernel/traps.c             |    5 +++--
 5 files changed, 17 insertions(+), 16 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -26,14 +26,16 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__save(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
-extern int  fpu__copy(struct task_struct *dst, struct task_struct *src);
 extern void fpu__clear_user_states(struct fpu *fpu);
 extern void fpu__clear_all(struct fpu *fpu);
 extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);
 
+extern void fpu_sync_fpstate(struct fpu *fpu);
+
+extern int  fpu_clone(struct task_struct *dst);
+
 /*
  * Boot time FPU initialization functions:
  */
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -159,11 +159,10 @@ void kernel_fpu_end(void)
 EXPORT_SYMBOL_GPL(kernel_fpu_end);
 
 /*
- * Save the FPU state (mark it for reload if necessary):
- *
- * This only ever gets called for the current task.
+ * Sync the FPU register state to current's memory register state when the
+ * current task owns the FPU. The hardware register state is preserved.
  */
-void fpu__save(struct fpu *fpu)
+void fpu_sync_fpstate(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
@@ -221,18 +220,18 @@ void fpstate_init(union fpregs_state *st
 }
 EXPORT_SYMBOL_GPL(fpstate_init);
 
-int fpu__copy(struct task_struct *dst, struct task_struct *src)
+/* Clone current's FPU state on fork */
+int fpu_clone(struct task_struct *dst)
 {
+	struct fpu *src_fpu = &current->thread.fpu;
 	struct fpu *dst_fpu = &dst->thread.fpu;
-	struct fpu *src_fpu = &src->thread.fpu;
 
+	/* The new task's FPU state cannot be valid in the hardware. */
 	dst_fpu->last_cpu = -1;
 
-	if (!static_cpu_has(X86_FEATURE_FPU))
+	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return 0;
 
-	WARN_ON_FPU(src_fpu != &current->thread.fpu);
-
 	/*
 	 * Don't let 'init optimized' areas of the XSAVE area
 	 * leak into the child task:
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -41,7 +41,7 @@ int regset_xregset_fpregs_active(struct
 static void sync_fpstate(struct fpu *fpu)
 {
 	if (fpu == &current->thread.fpu)
-		fpu__save(fpu);
+		fpu_sync_fpstate(fpu);
 }
 
 /*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -87,8 +87,7 @@ int arch_dup_task_struct(struct task_str
 #ifdef CONFIG_VM86
 	dst->thread.vm86 = NULL;
 #endif
-
-	return fpu__copy(dst, src);
+	return fpu_clone(dst);
 }
 
 /*
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -1046,9 +1046,10 @@ static void math_error(struct pt_regs *r
 	}
 
 	/*
-	 * Save the info for the exception handler and clear the error.
+	 * Synchronize the FPU register state to the memory register state
+	 * if necessary. This allows the exception handler to inspect it.
 	 */
-	fpu__save(fpu);
+	fpu_sync_fpstate(fpu);
 
 	task->thread.trap_nr	= trapnr;
 	task->thread.error_code = 0;


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 40/65] x86/cpu: Sanitize X86_FEATURE_OSPKE
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (38 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 39/65] x86/fpu: Rename and sanitize fpu__save/copy() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 41/65] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
                   ` (25 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

X86_FEATURE_OSPKE is enabled first on the boot CPU and the feature flag is
set. Secondary CPUs have to enable CR4.PKE as well and set their per CPU
feature flag. That's ineffective because all call sites have checks for
boot_cpu_data.

Make it smarter and force the feature flag when PKU is enabled on the boot
cpu which allows then to use cpu_feature_enabled(X86_FEATURE_OSPKE) all
over the place. That either compiles the code out when PKEY support is
disabled in Kconfig or uses a static_cpu_has() for the feature check which
makes a significant difference in hotpaths, e.g. context switch.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/pkeys.h |    8 ++++----
 arch/x86/include/asm/pkru.h  |    4 ++--
 arch/x86/kernel/cpu/common.c |   24 +++++++++++-------------
 arch/x86/kernel/fpu/core.c   |    2 +-
 arch/x86/kernel/fpu/xstate.c |    2 +-
 arch/x86/kernel/process_64.c |    2 +-
 arch/x86/mm/fault.c          |    2 +-
 7 files changed, 21 insertions(+), 23 deletions(-)

--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -9,14 +9,14 @@
  * will be necessary to ensure that the types that store key
  * numbers and masks have sufficient capacity.
  */
-#define arch_max_pkey() (boot_cpu_has(X86_FEATURE_OSPKE) ? 16 : 1)
+#define arch_max_pkey() (cpu_feature_enabled(X86_FEATURE_OSPKE) ? 16 : 1)
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
 
 static inline bool arch_pkeys_enabled(void)
 {
-	return boot_cpu_has(X86_FEATURE_OSPKE);
+	return cpu_feature_enabled(X86_FEATURE_OSPKE);
 }
 
 /*
@@ -26,7 +26,7 @@ static inline bool arch_pkeys_enabled(vo
 extern int __execute_only_pkey(struct mm_struct *mm);
 static inline int execute_only_pkey(struct mm_struct *mm)
 {
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return ARCH_DEFAULT_PKEY;
 
 	return __execute_only_pkey(mm);
@@ -37,7 +37,7 @@ extern int __arch_override_mprotect_pkey
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 		int prot, int pkey)
 {
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return 0;
 
 	return __arch_override_mprotect_pkey(vma, prot, pkey);
--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -32,7 +32,7 @@ static inline bool __pkru_allows_write(u
 
 static inline u32 read_pkru(void)
 {
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return rdpkru();
 	return 0;
 }
@@ -41,7 +41,7 @@ static inline void write_pkru(u32 pkru)
 {
 	struct pkru_state *pk;
 
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
 
 	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,22 +466,20 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
-	/* check the boot processor, plus compile options for PKU: */
-	if (!cpu_feature_enabled(X86_FEATURE_PKU))
-		return;
-	/* checks the actual processor's cpuid bits: */
-	if (!cpu_has(c, X86_FEATURE_PKU))
-		return;
-	if (pku_disabled)
+	if (c == &boot_cpu_data) {
+		if (pku_disabled || !cpu_feature_enabled(X86_FEATURE_PKU))
+			return;
+		/*
+		 * Setting CR4.PKE will cause the X86_FEATURE_OSPKE cpuid
+		 * bit to be set.  Enforce it.
+		 */
+		setup_force_cpu_cap(X86_FEATURE_OSPKE);
+
+	} else if (!cpu_feature_enabled(X86_FEATURE_OSPKE)) {
 		return;
+	}
 
 	cr4_set_bits(X86_CR4_PKE);
-	/*
-	 * Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
-	 * cpuid bit to be set.  We need to ensure that we
-	 * update that bit in this CPU's "cpu_info".
-	 */
-	set_cpu_cap(c, X86_FEATURE_OSPKE);
 }
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -311,7 +311,7 @@ static inline void restore_fpregs_from_i
 	else
 		frstor(&init_fpstate.fsave);
 
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -921,7 +921,7 @@ int arch_set_user_pkey_access(struct tas
 	 * This check implies XSAVE support.  OSPKE only gets
 	 * set if we enable XSAVE and we enable PKU in XCR0.
 	 */
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return -EINVAL;
 
 	/*
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -137,7 +137,7 @@ void __show_regs(struct pt_regs *regs, e
 		       log_lvl, d3, d6, d7);
 	}
 
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		printk("%sPKRU: %08x\n", log_lvl, read_pkru());
 }
 
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -875,7 +875,7 @@ static inline bool bad_area_access_from_
 	/* This code is always called on the current mm */
 	bool foreign = false;
 
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return false;
 	if (error_code & X86_PF_PK)
 		return true;


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 41/65] x86/pkru: Provide pkru_get_init_value()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (39 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 40/65] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 42/65] x86/pkru: Provide pkru_write_default() Thomas Gleixner
                   ` (24 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

When CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS is disabled then the following
code fails to compile:

     if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
     	u32 pkru = READ_ONCE(init_pkru_value);
	..
     }

because init_pkru_value is defined as '0' which makes READ_ONCE() upset.

Provide an accessor macro to avoid #ifdeffery all over the place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pkru.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -10,8 +10,10 @@
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
 extern u32 init_pkru_value;
+#define pkru_get_init_value()	READ_ONCE(init_pkru_value)
 #else
 #define init_pkru_value	0
+#define pkru_get_init_value()	0
 #endif
 
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 42/65] x86/pkru: Provide pkru_write_default()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (40 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 41/65] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 43/65] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
                   ` (23 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Provide a simple and trivial helper which just writes the PKRU default
value without trying to fiddle with the task's xsave buffer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/pkru.h |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -60,4 +60,12 @@ static inline void write_pkru(u32 pkru)
 	fpregs_unlock();
 }
 
+static inline void pkru_write_default(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	wrpkru(pkru_get_init_value());
+}
+
 #endif


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 43/65] x86/cpu: Write the default PKRU value when enabling PKE
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (41 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 42/65] x86/pkru: Provide pkru_write_default() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 44/65] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
                   ` (22 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

In preparation of making the PKRU management more independent from XSTATES,
write the default PKRU value into the hardware right after enabling PKRU in
CR4. This ensures that switch_to() and copy_thread() have the correct
setting for init task and the per CPU idle threads right away.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/common.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -480,6 +480,8 @@ static __always_inline void setup_pku(st
 	}
 
 	cr4_set_bits(X86_CR4_PKE);
+	/* Load the default PKRU value */
+	pkru_write_default();
 }
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 44/65] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (42 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 43/65] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 45/65] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
                   ` (21 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

There is no point in using copy_init_pkru_to_fpregs() which in turn calls
write_pkru(). write_pkru() tries to fiddle with the task's xstate buffer
for nothing because the XRSTOR[S](init_fpstate) just cleared the xfeature
flag in the xstate header which makes get_xsave_addr() fail.

It's a useless exercise anyway because the reinitialization activates the
FPU so before the task's xstate buffer can be used again a XRSTOR[S] must
happen which in turn dumps the PKRU value.

Get rid of the now unused copy_init_pkru_to_fpregs().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pkeys.h |    1 -
 arch/x86/kernel/fpu/core.c   |    3 +--
 arch/x86/mm/pkeys.c          |   17 -----------------
 include/linux/pkeys.h        |    4 ----
 4 files changed, 1 insertion(+), 24 deletions(-)

--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -124,7 +124,6 @@ extern int arch_set_user_pkey_access(str
 		unsigned long init_val);
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
-extern void copy_init_pkru_to_fpregs(void);
 
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -311,8 +311,7 @@ static inline void restore_fpregs_from_i
 	else
 		frstor(&init_fpstate.fsave);
 
-	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
-		copy_init_pkru_to_fpregs();
+	pkru_write_default();
 }
 
 /*
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,7 +10,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/pkru.h>			/* read/write_pkru()		*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -125,22 +124,6 @@ u32 init_pkru_value = PKRU_AD_KEY( 1) |
 		      PKRU_AD_KEY(10) | PKRU_AD_KEY(11) | PKRU_AD_KEY(12) |
 		      PKRU_AD_KEY(13) | PKRU_AD_KEY(14) | PKRU_AD_KEY(15);
 
-/*
- * Called from the FPU code when creating a fresh set of FPU
- * registers.  This is called from a very specific context where
- * we know the FPU registers are safe for use and we can use PKRU
- * directly.
- */
-void copy_init_pkru_to_fpregs(void)
-{
-	u32 init_pkru_value_snapshot = READ_ONCE(init_pkru_value);
-	/*
-	 * Override the PKRU state that came from 'init_fpstate'
-	 * with the baseline from the process.
-	 */
-	write_pkru(init_pkru_value_snapshot);
-}
-
 static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf,
 			     size_t count, loff_t *ppos)
 {
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -44,10 +44,6 @@ static inline bool arch_pkeys_enabled(vo
 	return false;
 }
 
-static inline void copy_init_pkru_to_fpregs(void)
-{
-}
-
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 #endif /* _LINUX_PKEYS_H */


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 45/65] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (43 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 44/65] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 46/65] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
                   ` (20 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Make it clear what the function is about.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    3 ++-
 arch/x86/kernel/fpu/core.c          |    4 ++--
 arch/x86/kernel/process.c           |    2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -29,12 +29,13 @@
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
 extern void fpu__clear_user_states(struct fpu *fpu);
-extern void fpu__clear_all(struct fpu *fpu);
 extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);
 
 extern void fpu_sync_fpstate(struct fpu *fpu);
 
+/* Clone and exit operations */
 extern int  fpu_clone(struct task_struct *dst);
+extern void fpu_flush_thread(void);
 
 /*
  * Boot time FPU initialization functions:
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -350,9 +350,9 @@ void fpu__clear_user_states(struct fpu *
 	fpu__clear(fpu, true);
 }
 
-void fpu__clear_all(struct fpu *fpu)
+void fpu_flush_thread(void)
 {
-	fpu__clear(fpu, false);
+	fpu__clear(&current->thread.fpu, false);
 }
 
 /*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -205,7 +205,7 @@ void flush_thread(void)
 	flush_ptrace_hw_breakpoint(tsk);
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
-	fpu__clear_all(&tsk->thread.fpu);
+	fpu_flush_thread();
 }
 
 void disable_TSC(void)


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 46/65] x86/fpu: Clean up the fpu__clear() variants
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (44 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 45/65] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
  2021-06-23 12:02 ` [patch V4 47/65] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
                   ` (19 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Andy Lutomirski <luto@kernel.org>

fpu__clear() currently resets both register state and kernel XSAVE buffer
state.  It has two modes: one for all state (supervisor and user) and
another for user state only.  fpu__clear_all() uses the "all state"
(user_only=0) mode, while a number of signal paths use the user_only=1
mode.

Make fpu__clear() work only for user state (user_only=1) and remove the
"all state" (user_only=0) code.  Rename it to match so it can be used by
the signal paths.

Replace the "all state" (user_only=0) fpu__clear() functionality.  Use the
TIF_NEED_FPU_LOAD functionality instead of making any actual hardware
registers changes in this path.

Instead of invoking fpu__initialize() just memcpy() init_fpstate into the
task's FPU state because that has already the correct format and in case of
PKRU also contains the default PKRU value. Move the actual PKRU write out
into flush_thread() where it belongs and where it will end up anyway when
PKRU and XSTATE have been untangled.

For bisectability a workaround is required which stores the PKRU value in
the xstate memory until PKRU is untangled from XSTATE for context
switching and return to user.

[ Dave Hansen: Polished changelog ]
[ tglx: Fixed the PKRU fallout ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/core.c |  113 ++++++++++++++++++++++++++++++---------------
 arch/x86/kernel/process.c  |   10 +++
 2 files changed, 86 insertions(+), 37 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -260,19 +260,6 @@ int fpu_clone(struct task_struct *dst)
 }
 
 /*
- * Activate the current task's in-memory FPU context,
- * if it has not been used before:
- */
-static void fpu__initialize(struct fpu *fpu)
-{
-	WARN_ON_FPU(fpu != &current->thread.fpu);
-
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-	fpstate_init(&fpu->state);
-	trace_x86_fpu_init_state(fpu);
-}
-
-/*
  * Drops current FPU state: deactivates the fpregs and
  * the fpstate. NOTE: it still leaves previous contents
  * in the fpregs in the eager-FPU case.
@@ -314,47 +301,99 @@ static inline void restore_fpregs_from_i
 	pkru_write_default();
 }
 
+static inline unsigned int init_fpstate_copy_size(void)
+{
+	if (!use_xsave())
+		return fpu_kernel_xstate_size;
+
+	/* XSAVE(S) just needs the legacy and the xstate header part */
+	return sizeof(init_fpstate.xsave);
+}
+
+/* Temporary workaround. Will be removed once PKRU and XSTATE are untangled. */
+static inline void pkru_set_default_in_xstate(struct xregs_state *xsave)
+{
+	struct pkru_state *pk;
+
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+	/*
+	 * Force XFEATURE_PKRU to be set in the header otherwise
+	 * get_xsave_addr() does not work and it also needs to be set to
+	 * make XRSTOR(S) load it.
+	 */
+	xsave->header.xfeatures |= XFEATURE_MASK_PKRU;
+	pk = get_xsave_addr(xsave, XFEATURE_PKRU);
+	pk->pkru = pkru_get_init_value();
+}
+
 /*
- * Clear the FPU state back to init state.
- *
- * Called by sys_execve(), by the signal handler code and by various
- * error paths.
+ * Reset current->fpu memory state to the init values.
  */
-static void fpu__clear(struct fpu *fpu, bool user_only)
+static void fpu_reset_fpstate(void)
+{
+	struct fpu *fpu = &current->thread.fpu;
+
+	fpregs_lock();
+	fpu__drop(fpu);
+	/*
+	 * This does not change the actual hardware registers. It just
+	 * resets the memory image and sets TIF_NEED_FPU_LOAD so a
+	 * subsequent return to usermode will reload the registers from the
+	 * task's memory image.
+	 *
+	 * Do not use fpstate_init() here. Just copy init_fpstate which has
+	 * the correct content already except for PKRU.
+	 */
+	memcpy(&fpu->state, &init_fpstate, init_fpstate_copy_size());
+	pkru_set_default_in_xstate(&fpu->state.xsave);
+	set_thread_flag(TIF_NEED_FPU_LOAD);
+	fpregs_unlock();
+}
+
+/*
+ * Reset current's user FPU states to the init states.  current's
+ * supervisor states, if any, are not modified by this function.  The
+ * caller guarantees that the XSTATE header in memory is intact.
+ */
+void fpu__clear_user_states(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
-	if (!static_cpu_has(X86_FEATURE_FPU)) {
-		fpu__drop(fpu);
-		fpu__initialize(fpu);
+	fpregs_lock();
+	if (!cpu_feature_enabled(X86_FEATURE_FPU)) {
+		fpu_reset_fpstate();
+		fpregs_unlock();
 		return;
 	}
 
-	fpregs_lock();
-
-	if (user_only) {
-		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
-		    xfeatures_mask_supervisor())
-			os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
-		restore_fpregs_from_init_fpstate(xfeatures_mask_user());
-	} else {
-		restore_fpregs_from_init_fpstate(xfeatures_mask_all);
+	/*
+	 * Ensure that current's supervisor states are loaded into their
+	 * corresponding registers.
+	 */
+	if (xfeatures_mask_supervisor() &&
+	    !fpregs_state_valid(fpu, smp_processor_id())) {
+		os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
 	}
 
+	/* Reset user states in registers. */
+	restore_fpregs_from_init_fpstate(xfeatures_mask_user());
+
+	/*
+	 * Now all FPU registers have their desired values.  Inform the FPU
+	 * state machine that current's FPU registers are in the hardware
+	 * registers. The memory image does not need to be updated because
+	 * any operation relying on it has to save the registers first when
+	 * current's FPU is marked active.
+	 */
 	fpregs_mark_activate();
 	fpregs_unlock();
 }
 
-void fpu__clear_user_states(struct fpu *fpu)
-{
-	fpu__clear(fpu, true);
-}
-
 void fpu_flush_thread(void)
 {
-	fpu__clear(&current->thread.fpu, false);
+	fpu_reset_fpstate();
 }
-
 /*
  * Load FPU context before returning to userspace.
  */
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -198,6 +198,15 @@ int copy_thread(unsigned long clone_flag
 	return ret;
 }
 
+static void pkru_flush_thread(void)
+{
+	/*
+	 * If PKRU is enabled the default PKRU value has to be loaded into
+	 * the hardware right here (similar to context switch).
+	 */
+	pkru_write_default();
+}
+
 void flush_thread(void)
 {
 	struct task_struct *tsk = current;
@@ -206,6 +215,7 @@ void flush_thread(void)
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
 	fpu_flush_thread();
+	pkru_flush_thread();
 }
 
 void disable_TSC(void)


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 47/65] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (45 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 46/65] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 48/65] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() Thomas Gleixner
                   ` (18 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Rename it so that it becomes entirely clear what this function is
about. It's purpose is to restore the FPU registers to the state which was
saved in the task's FPU memory state either at context switch or by an in
kernel FPU user.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    6 ++----
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |    2 +-
 3 files changed, 4 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -468,10 +468,8 @@ static inline void fpregs_activate(struc
 	trace_x86_fpu_regs_activated(fpu);
 }
 
-/*
- * Internal helper, do not use directly. Use switch_fpu_return() instead.
- */
-static inline void __fpregs_load_activate(void)
+/* Internal helper for switch_fpu_return() and signal frame setup */
+static inline void fpregs_restore_userregs(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 	int cpu = smp_processor_id();
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -402,7 +402,7 @@ void switch_fpu_return(void)
 	if (!static_cpu_has(X86_FEATURE_FPU))
 		return;
 
-	__fpregs_load_activate();
+	fpregs_restore_userregs();
 }
 EXPORT_SYMBOL_GPL(switch_fpu_return);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -188,7 +188,7 @@ int copy_fpstate_to_sigframe(void __user
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
-		__fpregs_load_activate();
+		fpregs_restore_userregs();
 
 	pagefault_disable();
 	ret = copy_fpregs_to_sigframe(buf_fx);


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 48/65] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (46 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 47/65] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 49/65] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
                   ` (17 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

copy_kernel_to_fpregs() restores all xfeatures but it is also the place
where the AMD FXSAVE_LEAK bug is handled.

That prevents fpregs_restore_userregs() to limit the restored features,
which is required to untangle PKRU and XSTATE handling and also for the
upcoming supervisor state management.

Move the FXSAVE_LEAK quirk into __copy_kernel_to_fpregs() and deinline that
function which has become rather fat.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |   25 +------------------------
 arch/x86/kernel/fpu/core.c          |   27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 24 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -379,33 +379,10 @@ static inline int os_xrstor_safe(struct
 	return err;
 }
 
-static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
-{
-	if (use_xsave()) {
-		os_xrstor(&fpstate->xsave, mask);
-	} else {
-		if (use_fxsr())
-			fxrstor(&fpstate->fxsave);
-		else
-			frstor(&fpstate->fsave);
-	}
-}
+extern void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask);
 
 static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
-	/*
-	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
-	 * pending. Clear the x87 state here by setting it to fixed values.
-	 * "m" is a random variable that should be in L1.
-	 */
-	if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) {
-		asm volatile(
-			"fnclex\n\t"
-			"emms\n\t"
-			"fildl %P[addr]"	/* set F?P to defined value */
-			: : [addr] "m" (fpstate));
-	}
-
 	__restore_fpregs_from_fpstate(fpstate, -1);
 }
 
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -124,6 +124,33 @@ void save_fpregs_to_fpstate(struct fpu *
 }
 EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
+void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
+{
+	/*
+	 * AMD K7/K8 and later CPUs up to Zen don't save/restore
+	 * FDP/FIP/FOP unless an exception is pending. Clear the x87 state
+	 * here by setting it to fixed values.  "m" is a random variable
+	 * that should be in L1.
+	 */
+	if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) {
+		asm volatile(
+			"fnclex\n\t"
+			"emms\n\t"
+			"fildl %P[addr]"	/* set F?P to defined value */
+			: : [addr] "m" (fpstate));
+	}
+
+	if (use_xsave()) {
+		os_xrstor(&fpstate->xsave, mask);
+	} else {
+		if (use_fxsr())
+			fxrstor(&fpstate->fxsave);
+		else
+			frstor(&fpstate->fsave);
+	}
+}
+EXPORT_SYMBOL_GPL(__restore_fpregs_from_fpstate);
+
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
 	preempt_disable();


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 49/65] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (47 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 48/65] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 50/65] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
                   ` (16 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Rename it so it's clear that this is about user ABI features which can
differ from the feature set which the kernel saves and restores because the
kernel handles e.g. PKRU differently. But the user ABI (ptrace, signal
frame) expects it to be there.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    7 ++++++-
 arch/x86/include/asm/fpu/xstate.h   |    6 +++++-
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |   10 +++++-----
 arch/x86/kernel/fpu/xstate.c        |   18 +++++++++---------
 5 files changed, 26 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -324,7 +324,12 @@ static inline void os_xrstor(struct xreg
  */
 static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
-	u64 mask = xfeatures_mask_user();
+	/*
+	 * Include the features which are not xsaved/rstored by the kernel
+	 * internally, e.g. PKRU. That's user space ABI and also required
+	 * to allow the signal handler to modify PKRU.
+	 */
+	u64 mask = xfeatures_mask_uabi();
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -83,7 +83,11 @@ static inline u64 xfeatures_mask_supervi
 	return xfeatures_mask_all & XFEATURE_MASK_SUPERVISOR_SUPPORTED;
 }
 
-static inline u64 xfeatures_mask_user(void)
+/*
+ * The xfeatures which are enabled in XCR0 and expected to be in ptrace
+ * buffers and signal frames.
+ */
+static inline u64 xfeatures_mask_uabi(void)
 {
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -404,7 +404,7 @@ void fpu__clear_user_states(struct fpu *
 	}
 
 	/* Reset user states in registers. */
-	restore_fpregs_from_init_fpstate(xfeatures_mask_user());
+	restore_fpregs_from_init_fpstate(xfeatures_mask_uabi());
 
 	/*
 	 * Now all FPU registers have their desired values.  Inform the FPU
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -257,14 +257,14 @@ static int copy_user_to_fpregs_zeroing(v
 
 	if (use_xsave()) {
 		if (fx_only) {
-			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
 
 			r = fxrstor_from_user_sigframe(buf);
 			if (!r)
 				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;
 		} else {
-			init_bv = xfeatures_mask_user() & ~xbv;
+			init_bv = xfeatures_mask_uabi() & ~xbv;
 
 			r = xrstor_from_user_sigframe(buf, xbv);
 			if (!r && unlikely(init_bv))
@@ -420,7 +420,7 @@ static int __fpu__restore_sig(void __use
 	fpregs_unlock();
 
 	if (use_xsave() && !fx_only) {
-		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
+		u64 init_bv = xfeatures_mask_uabi() & ~user_xfeatures;
 
 		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
@@ -454,7 +454,7 @@ static int __fpu__restore_sig(void __use
 		if (use_xsave()) {
 			u64 init_bv;
 
-			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
 			os_xrstor(&init_fpstate.xsave, init_bv);
 		}
 
@@ -549,7 +549,7 @@ void fpu__init_prepare_fx_sw_frame(void)
 
 	fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
 	fx_sw_reserved.extended_size = size;
-	fx_sw_reserved.xfeatures = xfeatures_mask_user();
+	fx_sw_reserved.xfeatures = xfeatures_mask_uabi();
 	fx_sw_reserved.xstate_size = fpu_user_xstate_size;
 
 	if (IS_ENABLED(CONFIG_IA32_EMULATION) ||
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -144,7 +144,7 @@ void fpu__init_cpu_xstate(void)
 	 * managed by XSAVE{C, OPT, S} and XRSTOR{S}.  Only XSAVE user
 	 * states can be set here.
 	 */
-	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_uabi());
 
 	/*
 	 * MSR_IA32_XSS sets supervisor states managed by XSAVES.
@@ -453,7 +453,7 @@ int xfeature_size(int xfeature_nr)
 static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or supervisor features may be set */
-	if (hdr->xfeatures & ~xfeatures_mask_user())
+	if (hdr->xfeatures & ~xfeatures_mask_uabi())
 		return -EINVAL;
 
 	/* Userspace must use the uncompacted format */
@@ -756,7 +756,7 @@ void __init fpu__init_system_xstate(void
 	cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
 	xfeatures_mask_all |= ecx + ((u64)edx << 32);
 
-	if ((xfeatures_mask_user() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
+	if ((xfeatures_mask_uabi() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
 		/*
 		 * This indicates that something really unexpected happened
 		 * with the enumeration.  Disable XSAVE and try to continue
@@ -791,7 +791,7 @@ void __init fpu__init_system_xstate(void
 	 * Update info used for ptrace frames; use standard-format size and no
 	 * supervisor xstates:
 	 */
-	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_user());
+	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_uabi());
 
 	fpu__init_prepare_fx_sw_frame();
 	setup_init_fpu_buf();
@@ -828,14 +828,14 @@ void fpu__resume_cpu(void)
 	/*
 	 * Restore XCR0 on xsave capable CPUs:
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE))
-		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+	if (cpu_feature_enabled(X86_FEATURE_XSAVE))
+		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_uabi());
 
 	/*
 	 * Restore IA32_XSS. The same CPUID bit enumerates support
 	 * of XSAVES and MSR_IA32_XSS.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
+	if (cpu_feature_enabled(X86_FEATURE_XSAVES)) {
 		wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor()  |
 				     xfeatures_mask_independent());
 	}
@@ -993,7 +993,7 @@ void copy_xstate_to_uabi_buf(struct memb
 		break;
 
 	case XSTATE_COPY_XSAVE:
-		header.xfeatures &= xfeatures_mask_user();
+		header.xfeatures &= xfeatures_mask_uabi();
 		break;
 	}
 
@@ -1038,7 +1038,7 @@ void copy_xstate_to_uabi_buf(struct memb
 		 * compacted init_fpstate. The gap tracking will zero this
 		 * later.
 		 */
-		if (!(xfeatures_mask_user() & BIT_ULL(i)))
+		if (!(xfeatures_mask_uabi() & BIT_ULL(i)))
 			continue;
 
 		/*


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 50/65] x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (48 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 49/65] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 51/65] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
                   ` (15 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

switch_to(), flush_thread() write the task's PKRU value eagerly so the PKRU
value of current is always valid in the hardware.

That means there is no point in restoring PKRU on exit to user or when
reactivating the task's FPU registers in the signal frame setup path.

This allows to remove all the xstate buffer updates with PKRU values once
the PKRU state is stored in thread struct while a task is scheduled out.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Restore supervisor state too. (Yu-Cheng)
---
 arch/x86/include/asm/fpu/internal.h |   16 +++++++++++++++-
 arch/x86/include/asm/fpu/xstate.h   |   19 +++++++++++++++++++
 arch/x86/kernel/fpu/core.c          |    2 +-
 3 files changed, 35 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -457,7 +457,21 @@ static inline void fpregs_restore_userre
 		return;
 
 	if (!fpregs_state_valid(fpu, cpu)) {
-		restore_fpregs_from_fpstate(&fpu->state);
+		u64 mask;
+
+		/*
+		 * This restores _all_ xstate which has not been
+		 * established yet.
+		 *
+		 * If PKRU is enabled, then the PKRU value is already
+		 * correct because it was either set in switch_to() or in
+		 * flush_thread(). So it is excluded because it might be
+		 * not up to date in current->thread.fpu.xsave state.
+		 */
+		mask = xfeatures_mask_restore_user() |
+			xfeatures_mask_supervisor();
+		__restore_fpregs_from_fpstate(&fpu->state, mask);
+
 		fpregs_activate(fpu);
 		fpu->last_cpu = cpu;
 	}
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -35,6 +35,14 @@
 				      XFEATURE_MASK_BNDREGS | \
 				      XFEATURE_MASK_BNDCSR)
 
+/*
+ * Features which are restored when returning to user space.
+ * PKRU is not restored on return to user space because PKRU
+ * is switched eagerly in switch_to() and flush_thread()
+ */
+#define XFEATURE_MASK_USER_RESTORE	\
+	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
 
@@ -92,6 +100,17 @@ static inline u64 xfeatures_mask_uabi(vo
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
 
+/*
+ * The xfeatures which are restored by the kernel when returning to user
+ * mode. This is not necessarily the same as xfeatures_mask_uabi() as the
+ * kernel does not manage all XCR0 enabled features via xsave/xrstor as
+ * some of them have to be switched eagerly on context switch and exec().
+ */
+static inline u64 xfeatures_mask_restore_user(void)
+{
+	return xfeatures_mask_all & XFEATURE_MASK_USER_RESTORE;
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -404,7 +404,7 @@ void fpu__clear_user_states(struct fpu *
 	}
 
 	/* Reset user states in registers. */
-	restore_fpregs_from_init_fpstate(xfeatures_mask_uabi());
+	restore_fpregs_from_init_fpstate(xfeatures_mask_restore_user());
 
 	/*
 	 * Now all FPU registers have their desired values.  Inform the FPU


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 51/65] x86/fpu: Add PKRU storage outside of task XSAVE buffer
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (49 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 50/65] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
  2021-06-23 12:02 ` [patch V4 52/65] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
                   ` (14 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Dave Hansen <dave.hansen@linux.intel.com>

PKRU is currently partly XSAVE-managed and partly not.  It has space in the
task XSAVE buffer and is context-switched by XSAVE/XRSTOR.  However, it is
switched more eagerly than FPU because there may be a need for PKRU to be
up-to-date for things like copy_to/from_user() since PKRU affects
user-permission memory accesses, not just accesses from userspace itself.

This leaves PKRU in a very odd position.  XSAVE brings very little value to
the table for how Linux uses PKRU except for signal related XSTATE
handling.

Prepare to move PKRU away from being XSAVE-managed.  Allocate space in the
thread_struct for it and save/restore it in the context-switch path
separately from the XSAVE-managed features. task->thread_struct.pkru is
only valid when the task is scheduled out. For the current task the
authoritative source is the hardware, i.e. it has to be retrieved via
rdpkru().

Leave the XSAVE code in place for now to ensure bisectability.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Fix the fallout on !PKRU enabled systems in copy_thread() - Intel testing via Dave
---
 arch/x86/include/asm/processor.h |    9 +++++++++
 arch/x86/kernel/process.c        |    7 +++++++
 arch/x86/kernel/process_64.c     |   25 +++++++++++++++++++++++++
 3 files changed, 41 insertions(+)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -518,6 +518,15 @@ struct thread_struct {
 
 	unsigned int		sig_on_uaccess_err:1;
 
+	/*
+	 * Protection Keys Register for Userspace.  Loaded immediately on
+	 * context switch. Store it in thread_struct to avoid a lookup in
+	 * the tasks's FPU xstate buffer. This value is only valid when a
+	 * task is scheduled out. For 'current' the authoritative source of
+	 * PKRU is the hardware itself.
+	 */
+	u32			pkru;
+
 	/* Floating point and extended processor state */
 	struct fpu		fpu;
 	/*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -156,11 +156,18 @@ int copy_thread(unsigned long clone_flag
 
 	/* Kernel thread ? */
 	if (unlikely(p->flags & PF_KTHREAD)) {
+		p->thread.pkru = pkru_get_init_value();
 		memset(childregs, 0, sizeof(struct pt_regs));
 		kthread_frame_init(frame, sp, arg);
 		return 0;
 	}
 
+	/*
+	 * Clone current's PKRU value from hardware. tsk->thread.pkru
+	 * is only valid when scheduled out.
+	 */
+	p->thread.pkru = read_pkru();
+
 	frame->bx = 0;
 	*childregs = *current_pt_regs();
 	childregs->ax = 0;
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -340,6 +340,29 @@ static __always_inline void load_seg_leg
 	}
 }
 
+/*
+ * Store prev's PKRU value and load next's PKRU value if they differ. PKRU
+ * is not XSTATE managed on context switch because that would require a
+ * lookup in the task's FPU xsave buffer and require to keep that updated
+ * in various places.
+ */
+static __always_inline void x86_pkru_load(struct thread_struct *prev,
+					  struct thread_struct *next)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	/* Stash the prev task's value: */
+	prev->pkru = rdpkru();
+
+	/*
+	 * PKRU writes are slightly expensive.  Avoid them when not
+	 * strictly necessary:
+	 */
+	if (prev->pkru != next->pkru)
+		wrpkru(next->pkru);
+}
+
 static __always_inline void x86_fsgsbase_load(struct thread_struct *prev,
 					      struct thread_struct *next)
 {
@@ -589,6 +612,8 @@ void compat_start_thread(struct pt_regs
 
 	x86_fsgsbase_load(prev, next);
 
+	x86_pkru_load(prev, next);
+
 	/*
 	 * Switch the PDA and FPU contexts.
 	 */


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 52/65] x86/fpu: Hook up PKRU into ptrace()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (50 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 51/65] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
  2021-06-23 12:02 ` [patch V4 53/65] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
                   ` (13 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

From: Dave Hansen <dave.hansen@linux.intel.com>

One nice thing about having PKRU be XSAVE-managed is that it gets naturally
exposed into the XSAVE-using ABIs.  Now that XSAVE will not be used to
manage PKRU, these ABIs need to be manually enabled to deal with PKRU.

ptrace() uses copy_uabi_xstate_to_kernel() to collect the tracee's
XSTATE. As PKRU is not in the task's XSTATE buffer, use task->thread.pkru
for filling in up the ptrace buffer.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/xstate.h |    2 +-
 arch/x86/kernel/fpu/regset.c      |   10 ++++------
 arch/x86/kernel/fpu/xstate.c      |   25 ++++++++++++++++++-------
 3 files changed, 23 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -139,7 +139,7 @@ enum xstate_copy_mode {
 };
 
 struct membuf;
-void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
+void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
 			     enum xstate_copy_mode mode);
 
 #endif
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -78,7 +78,7 @@ int xfpregs_get(struct task_struct *targ
 				    sizeof(fpu->state.fxsave));
 	}
 
-	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_FX);
+	copy_xstate_to_uabi_buf(to, target, XSTATE_COPY_FX);
 	return 0;
 }
 
@@ -127,14 +127,12 @@ int xfpregs_set(struct task_struct *targ
 int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
-	struct fpu *fpu = &target->thread.fpu;
-
 	if (!cpu_feature_enabled(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	sync_fpstate(fpu);
+	sync_fpstate(&target->thread.fpu);
 
-	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
+	copy_xstate_to_uabi_buf(to, target, XSTATE_COPY_XSAVE);
 	return 0;
 }
 
@@ -337,7 +335,7 @@ int fpregs_get(struct task_struct *targe
 		struct membuf mb = { .p = &fxsave, .left = sizeof(fxsave) };
 
 		/* Handle init state optimized xstate correctly */
-		copy_xstate_to_uabi_buf(mb, &fpu->state.xsave, XSTATE_COPY_FP);
+		copy_xstate_to_uabi_buf(mb, target, XSTATE_COPY_FP);
 		fx = &fxsave;
 	} else {
 		fx = &fpu->state.fxsave;
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -962,7 +962,7 @@ static void copy_feature(bool from_xstat
 /**
  * copy_xstate_to_uabi_buf - Copy kernel saved xstate to a UABI buffer
  * @to:		membuf descriptor
- * @xsave:	The kernel xstate buffer to copy from
+ * @tsk:	The task from which to copy the saved xstate
  * @copy_mode:	The requested copy mode
  *
  * Converts from kernel XSAVE or XSAVES compacted format to UABI conforming
@@ -971,10 +971,11 @@ static void copy_feature(bool from_xstat
  *
  * It supports partial copy but @to.pos always starts from zero.
  */
-void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
+void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
 			     enum xstate_copy_mode copy_mode)
 {
 	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
+	struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
 	struct xregs_state *xinit = &init_fpstate.xsave;
 	struct xstate_header header;
 	unsigned int zerofrom;
@@ -1048,11 +1049,21 @@ void copy_xstate_to_uabi_buf(struct memb
 		if (zerofrom < xstate_offsets[i])
 			membuf_zero(&to, xstate_offsets[i] - zerofrom);
 
-		copy_feature(header.xfeatures & BIT_ULL(i), &to,
-			     __raw_xsave_addr(xsave, i),
-			     __raw_xsave_addr(xinit, i),
-			     xstate_sizes[i]);
-
+		if (i == XFEATURE_PKRU) {
+			struct pkru_state pkru = {0};
+			/*
+			 * PKRU is not necessarily up to date in the
+			 * thread's XSAVE buffer.  Fill this part from the
+			 * per-thread storage.
+			 */
+			pkru.pkru = tsk->thread.pkru;
+			membuf_write(&to, &pkru, sizeof(pkru));
+		} else {
+			copy_feature(header.xfeatures & BIT_ULL(i), &to,
+				     __raw_xsave_addr(xsave, i),
+				     __raw_xsave_addr(xinit, i),
+				     xstate_sizes[i]);
+		}
 		/*
 		 * Keep track of the last copied state in the non-compacted
 		 * target buffer for gap zeroing.


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 53/65] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (51 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 52/65] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 54/65] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
                   ` (12 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

As the PKRU state is managed separately restoring it from the xstate buffer
would be counterproductive as it might either restore a stale value or
reinit the PKRU state to 0.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/include/asm/fpu/xstate.h   |   10 ++++++++++
 arch/x86/kernel/fpu/xstate.c        |    1 +
 arch/x86/mm/extable.c               |    2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -259,7 +259,7 @@ static inline void fxsave(struct fxregs_
  */
 static inline void os_xrstor_booting(struct xregs_state *xstate)
 {
-	u64 mask = -1;
+	u64 mask = xfeatures_mask_fpstate();
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
@@ -388,7 +388,7 @@ extern void __restore_fpregs_from_fpstat
 
 static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
-	__restore_fpregs_from_fpstate(fpstate, -1);
+	__restore_fpregs_from_fpstate(fpstate, xfeatures_mask_fpstate());
 }
 
 extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -111,6 +111,16 @@ static inline u64 xfeatures_mask_restore
 	return xfeatures_mask_all & XFEATURE_MASK_USER_RESTORE;
 }
 
+/*
+ * Like xfeatures_mask_restore_user() but additionally restors the
+ * supported supervisor states.
+ */
+static inline u64 xfeatures_mask_fpstate(void)
+{
+	return xfeatures_mask_all & \
+		(XFEATURE_MASK_USER_RESTORE | XFEATURE_MASK_SUPERVISOR_SUPPORTED);
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -60,6 +60,7 @@ static short xsave_cpuid_features[] __in
  * XSAVE buffer, both supervisor and user xstates.
  */
 u64 xfeatures_mask_all __ro_after_init;
+EXPORT_SYMBOL_GPL(xfeatures_mask_all);
 
 static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -65,7 +65,7 @@ EXPORT_SYMBOL_GPL(ex_handler_fault);
 	WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.",
 		  (void *)instruction_pointer(regs));
 
-	__restore_fpregs_from_fpstate(&init_fpstate, -1);
+	__restore_fpregs_from_fpstate(&init_fpstate, xfeatures_mask_fpstate());
 	return true;
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 54/65] x86/fpu: Remove PKRU handling from switch_fpu_finish()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (52 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 53/65] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 55/65] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
                   ` (11 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

PKRU is already updated and the xstate is not longer the proper source of
information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |   34 ++++------------------------------
 1 file changed, 4 insertions(+), 30 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -523,39 +523,13 @@ static inline void switch_fpu_prepare(st
  */
 
 /*
- * Load PKRU from the FPU context if available. Delay loading of the
- * complete FPU state until the return to userland.
+ * Delay loading of the complete FPU state until the return to userland.
+ * PKRU is handled separately.
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu)
 {
-	u32 pkru_val = init_pkru_value;
-	struct pkru_state *pk;
-
-	if (!static_cpu_has(X86_FEATURE_FPU))
-		return;
-
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-
-	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
-		return;
-
-	/*
-	 * PKRU state is switched eagerly because it needs to be valid before we
-	 * return to userland e.g. for a copy_to_user() operation.
-	 */
-	if (!(current->flags & PF_KTHREAD)) {
-		/*
-		 * If the PKRU bit in xsave.header.xfeatures is not set,
-		 * then the PKRU component was in init state, which means
-		 * XRSTOR will set PKRU to 0. If the bit is not set then
-		 * get_xsave_addr() will return NULL because the PKRU value
-		 * in memory is not valid. This means pkru_val has to be
-		 * set to 0 and not to init_pkru_value.
-		 */
-		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
-		pkru_val = pk ? pk->pkru : 0;
-	}
-	__write_pkru(pkru_val);
+	if (static_cpu_has(X86_FEATURE_FPU))
+		set_thread_flag(TIF_NEED_FPU_LOAD);
 }
 
 #endif /* _ASM_X86_FPU_INTERNAL_H */


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 55/65] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (53 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 54/65] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] x86/fpu: Don't " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 56/65] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
                   ` (10 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

PKRU for a task is stored in task->thread.pkru when the task is scheduled
out. For 'current' the authoritative source of PKRU is the hardware.

fpu_reset_fpstate() has two callers:

  1) fpu__clear_user_states() for !FPU systems. For those PKRU is irrelevant

  2) fpu_flush_thread() which is invoked from flush_thread(). flush_thread()
     resets the hardware to the kernel restrictive default value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/core.c |   22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -337,23 +337,6 @@ static inline unsigned int init_fpstate_
 	return sizeof(init_fpstate.xsave);
 }
 
-/* Temporary workaround. Will be removed once PKRU and XSTATE are untangled. */
-static inline void pkru_set_default_in_xstate(struct xregs_state *xsave)
-{
-	struct pkru_state *pk;
-
-	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
-		return;
-	/*
-	 * Force XFEATURE_PKRU to be set in the header otherwise
-	 * get_xsave_addr() does not work and it also needs to be set to
-	 * make XRSTOR(S) load it.
-	 */
-	xsave->header.xfeatures |= XFEATURE_MASK_PKRU;
-	pk = get_xsave_addr(xsave, XFEATURE_PKRU);
-	pk->pkru = pkru_get_init_value();
-}
-
 /*
  * Reset current->fpu memory state to the init values.
  */
@@ -371,9 +354,12 @@ static void fpu_reset_fpstate(void)
 	 *
 	 * Do not use fpstate_init() here. Just copy init_fpstate which has
 	 * the correct content already except for PKRU.
+	 *
+	 * PKRU handling does not rely on the xstate when restoring for
+	 * user space as PKRU is eagerly written in switch_to() and
+	 * flush_thread().
 	 */
 	memcpy(&fpu->state, &init_fpstate, init_fpstate_copy_size());
-	pkru_set_default_in_xstate(&fpu->state.xsave);
 	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpregs_unlock();
 }


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 56/65] x86/pkru: Remove xstate fiddling from write_pkru()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (54 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 55/65] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 57/65] x86/fpu: Mark init_fpstate __ro_after_init Thomas Gleixner
                   ` (9 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The PKRU value of a task is stored in task->thread.pkru when the task is
scheduled out. PKRU is restored on schedule in from there. So keeping the
XSAVE buffer up to date is a pointless exercise.

Remove the xstate fiddling and cleanup all related functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/pkru.h          |   17 ++++-------------
 arch/x86/include/asm/special_insns.h |   14 +-------------
 arch/x86/kvm/x86.c                   |    4 ++--
 3 files changed, 7 insertions(+), 28 deletions(-)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -41,23 +41,14 @@ static inline u32 read_pkru(void)
 
 static inline void write_pkru(u32 pkru)
 {
-	struct pkru_state *pk;
-
 	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
-
-	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
-
 	/*
-	 * The PKRU value in xstate needs to be in sync with the value that is
-	 * written to the CPU. The FPU restore on return to userland would
-	 * otherwise load the previous value again.
+	 * WRPKRU is relatively expensive compared to RDPKRU.
+	 * Avoid WRPKRU when it would not change the value.
 	 */
-	fpregs_lock();
-	if (pk)
-		pk->pkru = pkru;
-	__write_pkru(pkru);
-	fpregs_unlock();
+	if (pkru != rdpkru())
+		wrpkru(pkru);
 }
 
 static inline void pkru_write_default(void)
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -104,25 +104,13 @@ static inline void wrpkru(u32 pkru)
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
 
-static inline void __write_pkru(u32 pkru)
-{
-	/*
-	 * WRPKRU is relatively expensive compared to RDPKRU.
-	 * Avoid WRPKRU when it would not change the value.
-	 */
-	if (pkru == rdpkru())
-		return;
-
-	wrpkru(pkru);
-}
-
 #else
 static inline u32 rdpkru(void)
 {
 	return 0;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void wrpkru(u32 pkru)
 {
 }
 #endif
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -943,7 +943,7 @@ void kvm_load_guest_xsave_state(struct k
 	    (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
 	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
 	    vcpu->arch.pkru != vcpu->arch.host_pkru)
-		__write_pkru(vcpu->arch.pkru);
+		write_pkru(vcpu->arch.pkru);
 }
 EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
 
@@ -957,7 +957,7 @@ void kvm_load_host_xsave_state(struct kv
 	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
 		vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vcpu->arch.host_pkru)
-			__write_pkru(vcpu->arch.host_pkru);
+			write_pkru(vcpu->arch.host_pkru);
 	}
 
 	if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 57/65] x86/fpu: Mark init_fpstate __ro_after_init
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (55 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 56/65] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 58/65] x86/fpu/signal: Move initial checks into fpu__sig_restore() Thomas Gleixner
                   ` (8 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Nothing has to write into that state after init

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/kernel/fpu/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -23,7 +23,7 @@
  * Represents the initial FPU state. It's mostly (but not completely) zeroes,
  * depending on the FPU hardware format:
  */
-union fpregs_state init_fpstate __read_mostly;
+union fpregs_state init_fpstate __ro_after_init;
 
 /*
  * Track whether the kernel is using the FPU state


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 58/65] x86/fpu/signal: Move initial checks into fpu__sig_restore()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (56 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 57/65] x86/fpu: Mark init_fpstate __ro_after_init Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] x86/fpu/signal: Move initial checks into fpu__restore_sig() tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 59/65] x86/fpu/signal: Remove the legacy alignment check Thomas Gleixner
                   ` (7 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

__fpu_sig_restore() is convoluted and some of the basic checks can trivialy be done
in the calling function as well as the final error handling of clearing user state.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Catch error for math-emu as well. Use cpu_feature_enabled(). (Boris)
---
 arch/x86/kernel/fpu/signal.c |   76 +++++++++++++++++++++++--------------------
 1 file changed, 41 insertions(+), 35 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -277,11 +277,11 @@ static int copy_user_to_fpregs_zeroing(v
 		return frstor_from_user_sigframe(buf);
 }
 
-static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
+static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
+			     bool ia32_fxstate)
 {
 	struct user_i387_ia32_struct *envp = NULL;
 	int state_size = fpu_kernel_xstate_size;
-	int ia32_fxstate = (buf != buf_fx);
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
 	struct user_i387_ia32_struct env;
@@ -289,26 +289,6 @@ static int __fpu__restore_sig(void __use
 	int fx_only = 0;
 	int ret = 0;
 
-	ia32_fxstate &= (IS_ENABLED(CONFIG_X86_32) ||
-			 IS_ENABLED(CONFIG_IA32_EMULATION));
-
-	if (!buf) {
-		fpu__clear_user_states(fpu);
-		return 0;
-	}
-
-	if (!access_ok(buf, size)) {
-		ret = -EACCES;
-		goto out;
-	}
-
-	if (!static_cpu_has(X86_FEATURE_FPU)) {
-		ret = fpregs_soft_set(current, NULL, 0,
-				      sizeof(struct user_i387_ia32_struct),
-				      NULL, buf);
-		goto out;
-	}
-
 	if (use_xsave()) {
 		struct _fpx_sw_bytes fx_sw_user;
 		if (unlikely(check_for_xstate(buf_fx, buf_fx, &fx_sw_user))) {
@@ -391,7 +371,7 @@ static int __fpu__restore_sig(void __use
 		 */
 		ret = __copy_from_user(&env, buf, sizeof(env));
 		if (ret)
-			goto out;
+			return ret;
 		envp = &env;
 	}
 
@@ -424,7 +404,7 @@ static int __fpu__restore_sig(void __use
 
 		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
-			goto out;
+			return ret;
 
 		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
 					      fx_only);
@@ -442,10 +422,8 @@ static int __fpu__restore_sig(void __use
 
 	} else if (use_fxsr()) {
 		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
-		if (ret) {
-			ret = -EFAULT;
-			goto out;
-		}
+		if (ret)
+			return -EFAULT;
 
 		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
 					      fx_only);
@@ -462,7 +440,7 @@ static int __fpu__restore_sig(void __use
 	} else {
 		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)
-			goto out;
+			return ret;
 
 		fpregs_lock();
 		ret = frstor_safe(&fpu->state.fsave);
@@ -472,10 +450,6 @@ static int __fpu__restore_sig(void __use
 	else
 		fpregs_deactivate(fpu);
 	fpregs_unlock();
-
-out:
-	if (ret)
-		fpu__clear_user_states(fpu);
 	return ret;
 }
 
@@ -490,15 +464,47 @@ static inline int xstate_sigframe_size(v
  */
 int fpu__restore_sig(void __user *buf, int ia32_frame)
 {
+	unsigned int size = xstate_sigframe_size();
+	struct fpu *fpu = &current->thread.fpu;
 	void __user *buf_fx = buf;
-	int size = xstate_sigframe_size();
+	bool ia32_fxstate = false;
+	int ret;
 
+	if (unlikely(!buf)) {
+		fpu__clear_user_states(fpu);
+		return 0;
+	}
+
+	ia32_frame &= (IS_ENABLED(CONFIG_X86_32) ||
+		       IS_ENABLED(CONFIG_IA32_EMULATION));
+
+	/*
+	 * Only FXSR enabled systems need the FX state quirk.
+	 * FRSTOR does not need it and can use the fast path.
+	 */
 	if (ia32_frame && use_fxsr()) {
 		buf_fx = buf + sizeof(struct fregs_state);
 		size += sizeof(struct fregs_state);
+		ia32_fxstate = true;
 	}
 
-	return __fpu__restore_sig(buf, buf_fx, size);
+	if (!access_ok(buf, size)) {
+		ret = -EACCES;
+		goto out;
+	}
+
+	if (!IS_ENABLED(CONFIG_X86_64) && !cpu_feature_enabled(X86_FEATURE_FPU)) {
+		ret = fpregs_soft_set(current, NULL, 0,
+				      sizeof(struct user_i387_ia32_struct),
+				      NULL, buf);
+	} else {
+		ret = __fpu_restore_sig(buf, buf_fx, ia32_fxstate);
+	}
+
+out:
+	if (unlikely(ret))
+		fpu__clear_user_states(fpu);
+	return ret;
 }
 
 unsigned long


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 59/65] x86/fpu/signal: Remove the legacy alignment check
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (57 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 58/65] x86/fpu/signal: Move initial checks into fpu__sig_restore() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 60/65] x86/fpu/signal: Sanitize the xstate check on sigframe Thomas Gleixner
                   ` (6 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Checking for the XSTATE buffer being 64 byte aligned and if not deciding
just to restore the FXSR state is daft.

If user space provides an unaligned math frame and has the extended state
magic set in the FX software reserved bytes, then it really can keep the
pieces.

If the frame is unaligned and the FX software magic is not set, then
fx_only is already set and the restore will use fxrstor.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/signal.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -306,9 +306,6 @@ static int __fpu_restore_sig(void __user
 		}
 	}
 
-	if ((unsigned long)buf_fx % 64)
-		fx_only = 1;
-
 	if (!ia32_fxstate) {
 		/*
 		 * Attempt to restore the FPU registers directly from user


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 60/65] x86/fpu/signal: Sanitize the xstate check on sigframe
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (58 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 59/65] x86/fpu/signal: Remove the legacy alignment check Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 61/65] x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing() Thomas Gleixner
                   ` (5 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Utilize the check for the extended state magic in the FX software reserved
bytes and set the parameters for restoring fx_only in the relevant members
of fw_sw_user.

This allows further cleanups on top because the data is consistent.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V4:  Add the missing __ro_after_init (Andrew)
---
 arch/x86/kernel/fpu/signal.c |   70 ++++++++++++++++++++-----------------------
 1 file changed, 33 insertions(+), 37 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -15,29 +15,30 @@
 #include <asm/sigframe.h>
 #include <asm/trace/fpu.h>
 
-static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
+static struct _fpx_sw_bytes fx_sw_reserved __ro_after_init;
+static struct _fpx_sw_bytes fx_sw_reserved_ia32 __ro_after_init;
 
 /*
  * Check for the presence of extended state information in the
  * user fpstate pointer in the sigcontext.
  */
-static inline int check_for_xstate(struct fxregs_state __user *buf,
-				   void __user *fpstate,
-				   struct _fpx_sw_bytes *fx_sw)
+static inline int check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
+					   struct _fpx_sw_bytes *fx_sw)
 {
 	int min_xstate_size = sizeof(struct fxregs_state) +
 			      sizeof(struct xstate_header);
+	void __user *fpstate = fxbuf;
 	unsigned int magic2;
 
-	if (__copy_from_user(fx_sw, &buf->sw_reserved[0], sizeof(*fx_sw)))
-		return -1;
+	if (__copy_from_user(fx_sw, &fxbuf->sw_reserved[0], sizeof(*fx_sw)))
+		return -EFAULT;
 
 	/* Check for the first magic field and other error scenarios. */
 	if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
 	    fx_sw->xstate_size < min_xstate_size ||
 	    fx_sw->xstate_size > fpu_user_xstate_size ||
 	    fx_sw->xstate_size > fx_sw->extended_size)
-		return -1;
+		goto setfx;
 
 	/*
 	 * Check for the presence of second magic word at the end of memory
@@ -45,10 +46,18 @@ static inline int check_for_xstate(struc
 	 * fpstate layout with out copying the extended state information
 	 * in the memory layout.
 	 */
-	if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size))
-	    || magic2 != FP_XSTATE_MAGIC2)
-		return -1;
+	if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size)))
+		return -EFAULT;
 
+	if (likely(magic2 == FP_XSTATE_MAGIC2))
+		return 0;
+setfx:
+	trace_x86_fpu_xstate_check_failed(&current->thread.fpu);
+
+	/* Set the parameters for fx only state */
+	fx_sw->magic1 = 0;
+	fx_sw->xstate_size = sizeof(struct fxregs_state);
+	fx_sw->xfeatures = XFEATURE_MASK_FPSSE;
 	return 0;
 }
 
@@ -213,21 +222,15 @@ int copy_fpstate_to_sigframe(void __user
 
 static inline void
 sanitize_restored_user_xstate(union fpregs_state *state,
-			      struct user_i387_ia32_struct *ia32_env,
-			      u64 user_xfeatures, int fx_only)
+			      struct user_i387_ia32_struct *ia32_env, u64 mask)
 {
 	struct xregs_state *xsave = &state->xsave;
 	struct xstate_header *header = &xsave->header;
 
 	if (use_xsave()) {
 		/*
-		 * Clear all feature bits which are not set in
-		 * user_xfeatures and clear all extended features
-		 * for fx_only mode.
-		 */
-		u64 mask = fx_only ? XFEATURE_MASK_FPSSE : user_xfeatures;
-
-		/*
+		 * Clear all feature bits which are not set in mask.
+		 *
 		 * Supervisor state has to be preserved. The sigframe
 		 * restore can only modify user features, i.e. @mask
 		 * cannot contain them.
@@ -286,24 +289,19 @@ static int __fpu_restore_sig(void __user
 	struct fpu *fpu = &tsk->thread.fpu;
 	struct user_i387_ia32_struct env;
 	u64 user_xfeatures = 0;
-	int fx_only = 0;
+	bool fx_only = false;
 	int ret = 0;
 
 	if (use_xsave()) {
 		struct _fpx_sw_bytes fx_sw_user;
-		if (unlikely(check_for_xstate(buf_fx, buf_fx, &fx_sw_user))) {
-			/*
-			 * Couldn't find the extended state information in the
-			 * memory layout. Restore just the FP/SSE and init all
-			 * the other extended state.
-			 */
-			state_size = sizeof(struct fxregs_state);
-			fx_only = 1;
-			trace_x86_fpu_xstate_check_failed(fpu);
-		} else {
-			state_size = fx_sw_user.xstate_size;
-			user_xfeatures = fx_sw_user.xfeatures;
-		}
+
+		ret = check_xstate_in_sigframe(buf_fx, &fx_sw_user);
+		if (unlikely(ret))
+			return ret;
+
+		fx_only = !fx_sw_user.magic1;
+		state_size = fx_sw_user.xstate_size;
+		user_xfeatures = fx_sw_user.xfeatures;
 	}
 
 	if (!ia32_fxstate) {
@@ -403,8 +401,7 @@ static int __fpu_restore_sig(void __user
 		if (ret)
 			return ret;
 
-		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
-					      fx_only);
+		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures);
 
 		fpregs_lock();
 		if (unlikely(init_bv))
@@ -422,8 +419,7 @@ static int __fpu_restore_sig(void __user
 		if (ret)
 			return -EFAULT;
 
-		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
-					      fx_only);
+		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures);
 
 		fpregs_lock();
 		if (use_xsave()) {


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 61/65] x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing()
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (59 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 60/65] x86/fpu/signal: Sanitize the xstate check on sigframe Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 62/65] x86/fpu/signal: Split out the direct restore code Thomas Gleixner
                   ` (4 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Now that user_xfeatures is correctly set when xsave is enabled, remove the
duplicated initialization of components.

Rename the function while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/signal.c |   36 +++++++++++++++---------------------
 1 file changed, 15 insertions(+), 21 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -251,33 +251,27 @@ sanitize_restored_user_xstate(union fpre
 }
 
 /*
- * Restore the extended state if present. Otherwise, restore the FP/SSE state.
+ * Restore the FPU state directly from the userspace signal frame.
  */
-static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
+static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
 {
-	u64 init_bv;
-	int r;
-
 	if (use_xsave()) {
-		if (fx_only) {
-			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
+		u64 init_bv = xfeatures_mask_uabi() & ~xrestore;
+		int ret;
 
-			r = fxrstor_from_user_sigframe(buf);
-			if (!r)
-				os_xrstor(&init_fpstate.xsave, init_bv);
-			return r;
-		} else {
-			init_bv = xfeatures_mask_uabi() & ~xbv;
-
-			r = xrstor_from_user_sigframe(buf, xbv);
-			if (!r && unlikely(init_bv))
-				os_xrstor(&init_fpstate.xsave, init_bv);
-			return r;
-		}
+		if (likely(!fx_only))
+			ret = xrstor_from_user_sigframe(buf, xrestore);
+		else
+			ret = fxrstor_from_user_sigframe(buf);
+
+		if (!ret && unlikely(init_bv))
+			os_xrstor(&init_fpstate.xsave, init_bv);
+		return ret;
 	} else if (use_fxsr()) {
 		return fxrstor_from_user_sigframe(buf);
-	} else
+	} else {
 		return frstor_from_user_sigframe(buf);
+	}
 }
 
 static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
@@ -314,7 +308,7 @@ static int __fpu_restore_sig(void __user
 		 */
 		fpregs_lock();
 		pagefault_disable();
-		ret = copy_user_to_fpregs_zeroing(buf_fx, user_xfeatures, fx_only);
+		ret = restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only);
 		pagefault_enable();
 		if (!ret) {
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 62/65] x86/fpu/signal: Split out the direct restore code
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (60 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 61/65] x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing() Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 63/65] x86/fpu: Return proper error codes from user access functions Thomas Gleixner
                   ` (3 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

Prepare for smarter failure handling of the direct restore.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Rename function, fix whitespace and comments (Boris)
---
 arch/x86/kernel/fpu/signal.c |  112 ++++++++++++++++++++++---------------------
 1 file changed, 58 insertions(+), 54 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -250,10 +250,8 @@ sanitize_restored_user_xstate(union fpre
 	}
 }
 
-/*
- * Restore the FPU state directly from the userspace signal frame.
- */
-static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
+static int __restore_fpregs_from_user(void __user *buf, u64 xrestore,
+				      bool fx_only)
 {
 	if (use_xsave()) {
 		u64 init_bv = xfeatures_mask_uabi() & ~xrestore;
@@ -274,6 +272,57 @@ static int restore_fpregs_from_user(void
 	}
 }
 
+static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
+{
+	struct fpu *fpu = &current->thread.fpu;
+	int ret;
+
+	fpregs_lock();
+	pagefault_disable();
+	ret = __restore_fpregs_from_user(buf, xrestore, fx_only);
+	pagefault_enable();
+
+	if (unlikely(ret)) {
+		/*
+		 * The above did an FPU restore operation, restricted to
+		 * the user portion of the registers, and failed, but the
+		 * microcode might have modified the FPU registers
+		 * nevertheless.
+		 *
+		 * If the FPU registers do not belong to current, then
+		 * invalidate the FPU register state otherwise the task
+		 * might preempt current and return to user space with
+		 * corrupted FPU registers.
+		 *
+		 * In case current owns the FPU registers then no further
+		 * action is required. The fixup in the slow path will
+		 * handle it correctly.
+		 */
+		if (test_thread_flag(TIF_NEED_FPU_LOAD))
+			__cpu_invalidate_fpregs_state();
+		fpregs_unlock();
+		return ret;
+	}
+
+	/*
+	 * Restore supervisor states: previous context switch etc has done
+	 * XSAVES and saved the supervisor states in the kernel buffer from
+	 * which they can be restored now.
+	 *
+	 * It would be optimal to handle this with a single XRSTORS, but
+	 * this does not work because the rest of the FPU registers have
+	 * been restored from a user buffer directly. The single XRSTORS
+	 * happens below, when the user buffer has been copied to the
+	 * kernel one.
+	 */
+	if (test_thread_flag(TIF_NEED_FPU_LOAD) && xfeatures_mask_supervisor())
+		os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
+
+	fpregs_mark_activate();
+	fpregs_unlock();
+	return 0;
+}
+
 static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 			     bool ia32_fxstate)
 {
@@ -298,61 +347,16 @@ static int __fpu_restore_sig(void __user
 		user_xfeatures = fx_sw_user.xfeatures;
 	}
 
-	if (!ia32_fxstate) {
+	if (likely(!ia32_fxstate)) {
 		/*
 		 * Attempt to restore the FPU registers directly from user
-		 * memory. For that to succeed, the user access cannot cause
-		 * page faults. If it does, fall back to the slow path below,
-		 * going through the kernel buffer with the enabled pagefault
-		 * handler.
+		 * memory. For that to succeed, the user access cannot cause page
+		 * faults. If it does, fall back to the slow path below, going
+		 * through the kernel buffer with the enabled pagefault handler.
 		 */
-		fpregs_lock();
-		pagefault_disable();
 		ret = restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only);
-		pagefault_enable();
-		if (!ret) {
-
-			/*
-			 * Restore supervisor states: previous context switch
-			 * etc has done XSAVES and saved the supervisor states
-			 * in the kernel buffer from which they can be restored
-			 * now.
-			 *
-			 * We cannot do a single XRSTORS here - which would
-			 * be nice - because the rest of the FPU registers are
-			 * being restored from a user buffer directly. The
-			 * single XRSTORS happens below, when the user buffer
-			 * has been copied to the kernel one.
-			 */
-			if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
-			    xfeatures_mask_supervisor()) {
-				os_xrstor(&fpu->state.xsave,
-					  xfeatures_mask_supervisor());
-			}
-			fpregs_mark_activate();
-			fpregs_unlock();
+		if (likely(!ret))
 			return 0;
-		}
-
-		/*
-		 * The above did an FPU restore operation, restricted to
-		 * the user portion of the registers, and failed, but the
-		 * microcode might have modified the FPU registers
-		 * nevertheless.
-		 *
-		 * If the FPU registers do not belong to current, then
-		 * invalidate the FPU register state otherwise the task might
-		 * preempt current and return to user space with corrupted
-		 * FPU registers.
-		 *
-		 * In case current owns the FPU registers then no further
-		 * action is required. The fixup below will handle it
-		 * correctly.
-		 */
-		if (test_thread_flag(TIF_NEED_FPU_LOAD))
-			__cpu_invalidate_fpregs_state();
-
-		fpregs_unlock();
 	} else {
 		/*
 		 * For 32-bit frames with fxstate, copy the fxstate so it can


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 63/65] x86/fpu: Return proper error codes from user access functions
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (61 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 62/65] x86/fpu/signal: Split out the direct restore code Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 64/65] x86/fpu/signal: Handle #PF in the direct restore path Thomas Gleixner
                   ` (2 subsequent siblings)
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

When *RSTOR from user memory raises an exception there is no way to
differentiate them. That's bad because it forces the slow path even when
the failure was not a fault. If the operation raised eg. #GP then going
through the slow path is pointless.

Use _ASM_EXTABLE_FAULT() which stores the trap number and let the exception
fixup return the negated trap number as error.

This allows to separate the fast path and let it handle faults directly and
avoid the slow path for all other exceptions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Fixup XSTATE_OP() as well (Boris)
---
 arch/x86/include/asm/fpu/internal.h |   19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -88,6 +88,7 @@ static inline void fpstate_init_soft(str
 #endif
 extern void save_fpregs_to_fpstate(struct fpu *fpu);
 
+/* Returns 0 or the negated trap number, which results in -EFAULT for #PF */
 #define user_insn(insn, output, input...)				\
 ({									\
 	int err;							\
@@ -95,14 +96,14 @@ extern void save_fpregs_to_fpstate(struc
 	might_fault();							\
 									\
 	asm volatile(ASM_STAC "\n"					\
-		     "1:" #insn "\n\t"					\
+		     "1: " #insn "\n"					\
 		     "2: " ASM_CLAC "\n"				\
 		     ".section .fixup,\"ax\"\n"				\
-		     "3:  movl $-1,%[err]\n"				\
+		     "3:  negl %%eax\n"					\
 		     "    jmp  2b\n"					\
 		     ".previous\n"					\
-		     _ASM_EXTABLE(1b, 3b)				\
-		     : [err] "=r" (err), output				\
+		     _ASM_EXTABLE_FAULT(1b, 3b)				\
+		     : [err] "=a" (err), output				\
 		     : "0"(0), input);					\
 	err;								\
 })
@@ -196,16 +197,20 @@ static inline void fxsave(struct fxregs_
 #define XRSTOR		".byte " REX_PREFIX "0x0f,0xae,0x2f"
 #define XRSTORS		".byte " REX_PREFIX "0x0f,0xc7,0x1f"
 
+/*
+ * After this @err contains 0 on success or the negated trap number when
+ * the operation raises an exception. For faults this results in -EFAULT.
+ */
 #define XSTATE_OP(op, st, lmask, hmask, err)				\
 	asm volatile("1:" op "\n\t"					\
 		     "xor %[err], %[err]\n"				\
 		     "2:\n\t"						\
 		     ".pushsection .fixup,\"ax\"\n\t"			\
-		     "3: movl $-2,%[err]\n\t"				\
+		     "3: negl %%eax\n\t"				\
 		     "jmp 2b\n\t"					\
 		     ".popsection\n\t"					\
-		     _ASM_EXTABLE(1b, 3b)				\
-		     : [err] "=r" (err)					\
+		     _ASM_EXTABLE_FAULT(1b, 3b)				\
+		     : [err] "=a" (err)					\
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")
 


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 64/65] x86/fpu/signal: Handle #PF in the direct restore path
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (62 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 63/65] x86/fpu: Return proper error codes from user access functions Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-23 12:02 ` [patch V4 65/65] x86/fpu/signal: Let xrstor handle the features to init Thomas Gleixner
  2021-06-25 14:50 ` [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Oliver Sang
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

If *RSTOR raises an exception, then the slow path is taken. That's wrong
because if the reason was not #PF then going through the slow path is waste
of time because that will end up with the same conclusion that the data is
invalid.

Now that the wrapper around *RSTOR return an negative error code, which is
the negated trap number, it's possible to differentiate.

If the *RSTOR raised #PF then handle it directly in the fast path and if it
was some other exception, e.g. #GP, then give up and do not try the fast
path.

This removes the legacy frame FRSTOR code from the slow path because FRSTOR
is not a ia32_fxstate frame and is therefore handled in the fast path.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V4: Made the #PF retry path a bit more readable.
---
 arch/x86/kernel/fpu/signal.c |   67 +++++++++++++++++++++----------------------
 1 file changed, 33 insertions(+), 34 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -272,11 +272,17 @@ static int __restore_fpregs_from_user(vo
 	}
 }
 
-static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
+/*
+ * Attempt to restore the FPU registers directly from user memory.
+ * Pagefaults are handled and any errors returned are fatal.
+ */
+static int restore_fpregs_from_user(void __user *buf, u64 xrestore,
+				    bool fx_only, unsigned int size)
 {
 	struct fpu *fpu = &current->thread.fpu;
 	int ret;
 
+retry:
 	fpregs_lock();
 	pagefault_disable();
 	ret = __restore_fpregs_from_user(buf, xrestore, fx_only);
@@ -293,14 +299,18 @@ static int restore_fpregs_from_user(void
 		 * invalidate the FPU register state otherwise the task
 		 * might preempt current and return to user space with
 		 * corrupted FPU registers.
-		 *
-		 * In case current owns the FPU registers then no further
-		 * action is required. The fixup in the slow path will
-		 * handle it correctly.
 		 */
 		if (test_thread_flag(TIF_NEED_FPU_LOAD))
 			__cpu_invalidate_fpregs_state();
 		fpregs_unlock();
+
+		/* Try to handle #PF, but anything else is fatal. */
+		if (ret != -EFAULT)
+			return -EINVAL;
+
+		ret = fault_in_pages_readable(buf, size);
+		if (!ret)
+			goto retry;
 		return ret;
 	}
 
@@ -311,9 +321,7 @@ static int restore_fpregs_from_user(void
 	 *
 	 * It would be optimal to handle this with a single XRSTORS, but
 	 * this does not work because the rest of the FPU registers have
-	 * been restored from a user buffer directly. The single XRSTORS
-	 * happens below, when the user buffer has been copied to the
-	 * kernel one.
+	 * been restored from a user buffer directly.
 	 */
 	if (test_thread_flag(TIF_NEED_FPU_LOAD) && xfeatures_mask_supervisor())
 		os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
@@ -326,14 +334,13 @@ static int restore_fpregs_from_user(void
 static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 			     bool ia32_fxstate)
 {
-	struct user_i387_ia32_struct *envp = NULL;
 	int state_size = fpu_kernel_xstate_size;
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
 	struct user_i387_ia32_struct env;
 	u64 user_xfeatures = 0;
 	bool fx_only = false;
-	int ret = 0;
+	int ret;
 
 	if (use_xsave()) {
 		struct _fpx_sw_bytes fx_sw_user;
@@ -354,21 +361,20 @@ static int __fpu_restore_sig(void __user
 		 * faults. If it does, fall back to the slow path below, going
 		 * through the kernel buffer with the enabled pagefault handler.
 		 */
-		ret = restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only);
-		if (likely(!ret))
-			return 0;
-	} else {
-		/*
-		 * For 32-bit frames with fxstate, copy the fxstate so it can
-		 * be reconstructed later.
-		 */
-		ret = __copy_from_user(&env, buf, sizeof(env));
-		if (ret)
-			return ret;
-		envp = &env;
+		return restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only,
+						state_size);
 	}
 
 	/*
+	 * Copy the legacy state because the FP portion of the FX frame has
+	 * to be ignored for histerical raisins. The legacy state is folded
+	 * in once the larger state has been copied.
+	 */
+	ret = __copy_from_user(&env, buf, sizeof(env));
+	if (ret)
+		return ret;
+
+	/*
 	 * By setting TIF_NEED_FPU_LOAD it is ensured that our xstate is
 	 * not modified on context switch and that the xstate is considered
 	 * to be loaded again on return to userland (overriding last_cpu avoids
@@ -382,8 +388,7 @@ static int __fpu_restore_sig(void __user
 		 * supervisor state is preserved. Save the full state for
 		 * simplicity. There is no point in optimizing this by only
 		 * saving the supervisor states and then shuffle them to
-		 * the right place in memory. This is the slow path and the
-		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
+		 * the right place in memory. It's ia32 mode. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
 			os_xsave(&fpu->state.xsave);
@@ -399,7 +404,7 @@ static int __fpu_restore_sig(void __user
 		if (ret)
 			return ret;
 
-		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures);
+		sanitize_restored_user_xstate(&fpu->state, &env, user_xfeatures);
 
 		fpregs_lock();
 		if (unlikely(init_bv))
@@ -412,12 +417,12 @@ static int __fpu_restore_sig(void __user
 		ret = os_xrstor_safe(&fpu->state.xsave,
 				     user_xfeatures | xfeatures_mask_supervisor());
 
-	} else if (use_fxsr()) {
+	} else {
 		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
 		if (ret)
 			return -EFAULT;
 
-		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures);
+		sanitize_restored_user_xstate(&fpu->state, &env, user_xfeatures);
 
 		fpregs_lock();
 		if (use_xsave()) {
@@ -428,14 +433,8 @@ static int __fpu_restore_sig(void __user
 		}
 
 		ret = fxrstor_safe(&fpu->state.fxsave);
-	} else {
-		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
-		if (ret)
-			return ret;
-
-		fpregs_lock();
-		ret = frstor_safe(&fpu->state.fsave);
 	}
+
 	if (!ret)
 		fpregs_mark_activate();
 	else


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [patch V4 65/65] x86/fpu/signal: Let xrstor handle the features to init
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (63 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 64/65] x86/fpu/signal: Handle #PF in the direct restore path Thomas Gleixner
@ 2021-06-23 12:02 ` Thomas Gleixner
  2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  2021-06-25 14:50 ` [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Oliver Sang
  65 siblings, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-23 12:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

There is no reason to do an extra XRSTOR from initfp_state for feature bits
which have been cleared by user space in the FX magic xfeatures storage.

Just clear them in the task's XSTATE header and do a full restore which
will put these cleared features into init state.

There is no real difference in performance because the current code already
does a full restore when the xfeatures bits are preserved as the signal
frame setup has stored them, which is the full UABI feature set.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/signal.c |   89 ++++++++++++++-----------------------------
 1 file changed, 31 insertions(+), 58 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -220,36 +220,6 @@ int copy_fpstate_to_sigframe(void __user
 	return 0;
 }
 
-static inline void
-sanitize_restored_user_xstate(union fpregs_state *state,
-			      struct user_i387_ia32_struct *ia32_env, u64 mask)
-{
-	struct xregs_state *xsave = &state->xsave;
-	struct xstate_header *header = &xsave->header;
-
-	if (use_xsave()) {
-		/*
-		 * Clear all feature bits which are not set in mask.
-		 *
-		 * Supervisor state has to be preserved. The sigframe
-		 * restore can only modify user features, i.e. @mask
-		 * cannot contain them.
-		 */
-		header->xfeatures &= mask | xfeatures_mask_supervisor();
-	}
-
-	if (use_fxsr()) {
-		/*
-		 * mscsr reserved bits must be masked to zero for security
-		 * reasons.
-		 */
-		xsave->i387.mxcsr &= mxcsr_feature_mask;
-
-		if (ia32_env)
-			convert_to_fxsr(&state->fxsave, ia32_env);
-	}
-}
-
 static int __restore_fpregs_from_user(void __user *buf, u64 xrestore,
 				      bool fx_only)
 {
@@ -352,6 +322,8 @@ static int __fpu_restore_sig(void __user
 		fx_only = !fx_sw_user.magic1;
 		state_size = fx_sw_user.xstate_size;
 		user_xfeatures = fx_sw_user.xfeatures;
+	} else {
+		user_xfeatures = XFEATURE_MASK_FPSSE;
 	}
 
 	if (likely(!ia32_fxstate)) {
@@ -395,54 +367,55 @@ static int __fpu_restore_sig(void __user
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
+	__cpu_invalidate_fpregs_state();
 	fpregs_unlock();
 
 	if (use_xsave() && !fx_only) {
-		u64 init_bv = xfeatures_mask_uabi() & ~user_xfeatures;
-
 		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
 			return ret;
+	} else {
+		if (__copy_from_user(&fpu->state.fxsave, buf_fx,
+				     sizeof(fpu->state.fxsave)))
+			return -EFAULT;
 
-		sanitize_restored_user_xstate(&fpu->state, &env, user_xfeatures);
+		/* Reject invalid MXCSR values. */
+		if (fpu->state.fxsave.mxcsr & mxcsr_feature_mask)
+			return -EINVAL;
 
-		fpregs_lock();
-		if (unlikely(init_bv))
-			os_xrstor(&init_fpstate.xsave, init_bv);
+		/* Enforce XFEATURE_MASK_FPSSE when XSAVE is enabled */
+		if (use_xsave())
+			fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FPSSE;
+	}
+
+	/* Fold the legacy FP storage */
+	convert_to_fxsr(&fpu->state.fxsave, &env);
 
+	fpregs_lock();
+	if (use_xsave()) {
 		/*
-		 * Restore previously saved supervisor xstates along with
-		 * copied-in user xstates.
+		 * Remove all UABI feature bits not set in user_xfeatures
+		 * from the memory xstate header which makes the full
+		 * restore below bring them into init state. This works for
+		 * fx_only mode as well because that has only FP and SSE
+		 * set in user_xfeatures.
+		 *
+		 * Preserve supervisor states!
 		 */
-		ret = os_xrstor_safe(&fpu->state.xsave,
-				     user_xfeatures | xfeatures_mask_supervisor());
+		u64 mask = user_xfeatures | xfeatures_mask_supervisor();
 
+		fpu->state.xsave.header.xfeatures &= mask;
+		ret = os_xrstor_safe(&fpu->state.xsave, xfeatures_mask_all);
 	} else {
-		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
-		if (ret)
-			return -EFAULT;
-
-		sanitize_restored_user_xstate(&fpu->state, &env, user_xfeatures);
-
-		fpregs_lock();
-		if (use_xsave()) {
-			u64 init_bv;
-
-			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
-			os_xrstor(&init_fpstate.xsave, init_bv);
-		}
-
 		ret = fxrstor_safe(&fpu->state.fxsave);
 	}
 
-	if (!ret)
+	if (likely(!ret))
 		fpregs_mark_activate();
-	else
-		fpregs_deactivate(fpu);
+
 	fpregs_unlock();
 	return ret;
 }
-
 static inline int xstate_sigframe_size(void)
 {
 	return use_xsave() ? fpu_user_xstate_size + FP_XSTATE_MAGIC2_SIZE :


^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [patch V4 07/65] x86/fpu: Move inlines where they belong
  2021-06-23 12:01 ` [patch V4 07/65] x86/fpu: Move inlines where they belong Thomas Gleixner
@ 2021-06-23 18:43   ` Bae, Chang Seok
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 142+ messages in thread
From: Bae, Chang Seok @ 2021-06-23 18:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Yu, Fenghua, Luck, Tony, Yu,
	Yu-cheng, Sebastian Andrzej Siewior, Borislav Petkov,
	Peter Zijlstra, Kan Liang, Megha Dey, Sang, Oliver

On Jun 23, 2021, at 05:01, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> They are only used in fpstate_init() and there is no point to have them in
> a header just to make reading the code harder.

<snip>

> -static inline void fpstate_init_xstate(struct xregs_state *xsave)
> -{
> -	/*
> -	 * XRSTORS requires these bits set in xcomp_bv, or it will
> -	 * trigger #GP:
> -	 */
> -	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
> -}

static void __init setup_init_fpu_buf(void)
{
...

	if (boot_cpu_has(X86_FEATURE_XSAVES))
		init_fpstate.xsave.header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT |
						     xfeatures_mask_all;


But this line [1] can be exactly replaced with the inline function above. At
least, I don’t see consistency on the mainline.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/fpu/xstate.c#n460

Thanks,
Chang


^ permalink raw reply	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/signal: Let xrstor handle the features to init
  2021-06-23 12:02 ` [patch V4 65/65] x86/fpu/signal: Let xrstor handle the features to init Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     6f9866a166cd1ad3ebb2dcdb3874aa8fee8dea2f
Gitweb:        https://git.kernel.org/tip/6f9866a166cd1ad3ebb2dcdb3874aa8fee8dea2f
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:32 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 23:45:31 +02:00

x86/fpu/signal: Let xrstor handle the features to init

There is no reason to do an extra XRSTOR from init_fpstate for feature
bits which have been cleared by user space in the FX magic xfeatures
storage.

Just clear them in the task's XSTATE header and do a full restore which
will put these cleared features into init state.

There is no real difference in performance because the current code
already does a full restore when the xfeatures bits are preserved as the
signal frame setup has stored them, which is the full UABI feature set.

 [ bp: Use the negated mxcsr_feature_mask in the MXCSR check. ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.804115017@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 89 ++++++++++++-----------------------
 1 file changed, 31 insertions(+), 58 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 4c252d0..445c57c 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -220,36 +220,6 @@ retry:
 	return 0;
 }
 
-static inline void
-sanitize_restored_user_xstate(union fpregs_state *state,
-			      struct user_i387_ia32_struct *ia32_env, u64 mask)
-{
-	struct xregs_state *xsave = &state->xsave;
-	struct xstate_header *header = &xsave->header;
-
-	if (use_xsave()) {
-		/*
-		 * Clear all feature bits which are not set in mask.
-		 *
-		 * Supervisor state has to be preserved. The sigframe
-		 * restore can only modify user features, i.e. @mask
-		 * cannot contain them.
-		 */
-		header->xfeatures &= mask | xfeatures_mask_supervisor();
-	}
-
-	if (use_fxsr()) {
-		/*
-		 * mscsr reserved bits must be masked to zero for security
-		 * reasons.
-		 */
-		xsave->i387.mxcsr &= mxcsr_feature_mask;
-
-		if (ia32_env)
-			convert_to_fxsr(&state->fxsave, ia32_env);
-	}
-}
-
 static int __restore_fpregs_from_user(void __user *buf, u64 xrestore,
 				      bool fx_only)
 {
@@ -352,6 +322,8 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		fx_only = !fx_sw_user.magic1;
 		state_size = fx_sw_user.xstate_size;
 		user_xfeatures = fx_sw_user.xfeatures;
+	} else {
+		user_xfeatures = XFEATURE_MASK_FPSSE;
 	}
 
 	if (likely(!ia32_fxstate)) {
@@ -395,54 +367,55 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
+	__cpu_invalidate_fpregs_state();
 	fpregs_unlock();
 
 	if (use_xsave() && !fx_only) {
-		u64 init_bv = xfeatures_mask_uabi() & ~user_xfeatures;
-
 		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
 			return ret;
+	} else {
+		if (__copy_from_user(&fpu->state.fxsave, buf_fx,
+				     sizeof(fpu->state.fxsave)))
+			return -EFAULT;
 
-		sanitize_restored_user_xstate(&fpu->state, &env, user_xfeatures);
+		/* Reject invalid MXCSR values. */
+		if (fpu->state.fxsave.mxcsr & ~mxcsr_feature_mask)
+			return -EINVAL;
 
-		fpregs_lock();
-		if (unlikely(init_bv))
-			os_xrstor(&init_fpstate.xsave, init_bv);
+		/* Enforce XFEATURE_MASK_FPSSE when XSAVE is enabled */
+		if (use_xsave())
+			fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FPSSE;
+	}
+
+	/* Fold the legacy FP storage */
+	convert_to_fxsr(&fpu->state.fxsave, &env);
 
+	fpregs_lock();
+	if (use_xsave()) {
 		/*
-		 * Restore previously saved supervisor xstates along with
-		 * copied-in user xstates.
+		 * Remove all UABI feature bits not set in user_xfeatures
+		 * from the memory xstate header which makes the full
+		 * restore below bring them into init state. This works for
+		 * fx_only mode as well because that has only FP and SSE
+		 * set in user_xfeatures.
+		 *
+		 * Preserve supervisor states!
 		 */
-		ret = os_xrstor_safe(&fpu->state.xsave,
-				     user_xfeatures | xfeatures_mask_supervisor());
+		u64 mask = user_xfeatures | xfeatures_mask_supervisor();
 
+		fpu->state.xsave.header.xfeatures &= mask;
+		ret = os_xrstor_safe(&fpu->state.xsave, xfeatures_mask_all);
 	} else {
-		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
-		if (ret)
-			return -EFAULT;
-
-		sanitize_restored_user_xstate(&fpu->state, &env, user_xfeatures);
-
-		fpregs_lock();
-		if (use_xsave()) {
-			u64 init_bv;
-
-			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
-			os_xrstor(&init_fpstate.xsave, init_bv);
-		}
-
 		ret = fxrstor_safe(&fpu->state.fxsave);
 	}
 
-	if (!ret)
+	if (likely(!ret))
 		fpregs_mark_activate();
-	else
-		fpregs_deactivate(fpu);
+
 	fpregs_unlock();
 	return ret;
 }
-
 static inline int xstate_sigframe_size(void)
 {
 	return use_xsave() ? fpu_user_xstate_size + FP_XSTATE_MAGIC2_SIZE :

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/signal: Handle #PF in the direct restore path
  2021-06-23 12:02 ` [patch V4 64/65] x86/fpu/signal: Handle #PF in the direct restore path Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     fcb3635f5018e53024c6be3c3213737f469f74ff
Gitweb:        https://git.kernel.org/tip/fcb3635f5018e53024c6be3c3213737f469f74ff
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:31 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 20:05:33 +02:00

x86/fpu/signal: Handle #PF in the direct restore path

If *RSTOR raises an exception, then the slow path is taken. That's wrong
because if the reason was not #PF then going through the slow path is waste
of time because that will end up with the same conclusion that the data is
invalid.

Now that the wrapper around *RSTOR return an negative error code, which is
the negated trap number, it's possible to differentiate.

If the *RSTOR raised #PF then handle it directly in the fast path and if it
was some other exception, e.g. #GP, then give up and do not try the fast
path.

This removes the legacy frame FRSTOR code from the slow path because FRSTOR
is not a ia32_fxstate frame and is therefore handled in the fast path.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.696022863@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 67 +++++++++++++++++------------------
 1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index aa268d9..4c252d0 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -272,11 +272,17 @@ static int __restore_fpregs_from_user(void __user *buf, u64 xrestore,
 	}
 }
 
-static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
+/*
+ * Attempt to restore the FPU registers directly from user memory.
+ * Pagefaults are handled and any errors returned are fatal.
+ */
+static int restore_fpregs_from_user(void __user *buf, u64 xrestore,
+				    bool fx_only, unsigned int size)
 {
 	struct fpu *fpu = &current->thread.fpu;
 	int ret;
 
+retry:
 	fpregs_lock();
 	pagefault_disable();
 	ret = __restore_fpregs_from_user(buf, xrestore, fx_only);
@@ -293,14 +299,18 @@ static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only
 		 * invalidate the FPU register state otherwise the task
 		 * might preempt current and return to user space with
 		 * corrupted FPU registers.
-		 *
-		 * In case current owns the FPU registers then no further
-		 * action is required. The fixup in the slow path will
-		 * handle it correctly.
 		 */
 		if (test_thread_flag(TIF_NEED_FPU_LOAD))
 			__cpu_invalidate_fpregs_state();
 		fpregs_unlock();
+
+		/* Try to handle #PF, but anything else is fatal. */
+		if (ret != -EFAULT)
+			return -EINVAL;
+
+		ret = fault_in_pages_readable(buf, size);
+		if (!ret)
+			goto retry;
 		return ret;
 	}
 
@@ -311,9 +321,7 @@ static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only
 	 *
 	 * It would be optimal to handle this with a single XRSTORS, but
 	 * this does not work because the rest of the FPU registers have
-	 * been restored from a user buffer directly. The single XRSTORS
-	 * happens below, when the user buffer has been copied to the
-	 * kernel one.
+	 * been restored from a user buffer directly.
 	 */
 	if (test_thread_flag(TIF_NEED_FPU_LOAD) && xfeatures_mask_supervisor())
 		os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
@@ -326,14 +334,13 @@ static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only
 static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 			     bool ia32_fxstate)
 {
-	struct user_i387_ia32_struct *envp = NULL;
 	int state_size = fpu_kernel_xstate_size;
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
 	struct user_i387_ia32_struct env;
 	u64 user_xfeatures = 0;
 	bool fx_only = false;
-	int ret = 0;
+	int ret;
 
 	if (use_xsave()) {
 		struct _fpx_sw_bytes fx_sw_user;
@@ -354,21 +361,20 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		 * faults. If it does, fall back to the slow path below, going
 		 * through the kernel buffer with the enabled pagefault handler.
 		 */
-		ret = restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only);
-		if (likely(!ret))
-			return 0;
-	} else {
-		/*
-		 * For 32-bit frames with fxstate, copy the fxstate so it can
-		 * be reconstructed later.
-		 */
-		ret = __copy_from_user(&env, buf, sizeof(env));
-		if (ret)
-			return ret;
-		envp = &env;
+		return restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only,
+						state_size);
 	}
 
 	/*
+	 * Copy the legacy state because the FP portion of the FX frame has
+	 * to be ignored for histerical raisins. The legacy state is folded
+	 * in once the larger state has been copied.
+	 */
+	ret = __copy_from_user(&env, buf, sizeof(env));
+	if (ret)
+		return ret;
+
+	/*
 	 * By setting TIF_NEED_FPU_LOAD it is ensured that our xstate is
 	 * not modified on context switch and that the xstate is considered
 	 * to be loaded again on return to userland (overriding last_cpu avoids
@@ -382,8 +388,7 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		 * supervisor state is preserved. Save the full state for
 		 * simplicity. There is no point in optimizing this by only
 		 * saving the supervisor states and then shuffle them to
-		 * the right place in memory. This is the slow path and the
-		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
+		 * the right place in memory. It's ia32 mode. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
 			os_xsave(&fpu->state.xsave);
@@ -399,7 +404,7 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		if (ret)
 			return ret;
 
-		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures);
+		sanitize_restored_user_xstate(&fpu->state, &env, user_xfeatures);
 
 		fpregs_lock();
 		if (unlikely(init_bv))
@@ -412,12 +417,12 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		ret = os_xrstor_safe(&fpu->state.xsave,
 				     user_xfeatures | xfeatures_mask_supervisor());
 
-	} else if (use_fxsr()) {
+	} else {
 		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
 		if (ret)
 			return -EFAULT;
 
-		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures);
+		sanitize_restored_user_xstate(&fpu->state, &env, user_xfeatures);
 
 		fpregs_lock();
 		if (use_xsave()) {
@@ -428,14 +433,8 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		}
 
 		ret = fxrstor_safe(&fpu->state.fxsave);
-	} else {
-		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
-		if (ret)
-			return ret;
-
-		fpregs_lock();
-		ret = frstor_safe(&fpu->state.fsave);
 	}
+
 	if (!ret)
 		fpregs_mark_activate();
 	else

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Return proper error codes from user access functions
  2021-06-23 12:02 ` [patch V4 63/65] x86/fpu: Return proper error codes from user access functions Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     aee8c67a4faa40a8df4e79316dbfc92d123989c1
Gitweb:        https://git.kernel.org/tip/aee8c67a4faa40a8df4e79316dbfc92d123989c1
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:30 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 20:04:58 +02:00

x86/fpu: Return proper error codes from user access functions

When *RSTOR from user memory raises an exception, there is no way to
differentiate them. That's bad because it forces the slow path even when
the failure was not a fault. If the operation raised eg. #GP then going
through the slow path is pointless.

Use _ASM_EXTABLE_FAULT() which stores the trap number and let the exception
fixup return the negated trap number as error.

This allows to separate the fast path and let it handle faults directly and
avoid the slow path for all other exceptions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.601480369@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 528a868..5a18694 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -88,6 +88,7 @@ static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 extern void save_fpregs_to_fpstate(struct fpu *fpu);
 
+/* Returns 0 or the negated trap number, which results in -EFAULT for #PF */
 #define user_insn(insn, output, input...)				\
 ({									\
 	int err;							\
@@ -95,14 +96,14 @@ extern void save_fpregs_to_fpstate(struct fpu *fpu);
 	might_fault();							\
 									\
 	asm volatile(ASM_STAC "\n"					\
-		     "1:" #insn "\n\t"					\
+		     "1: " #insn "\n"					\
 		     "2: " ASM_CLAC "\n"				\
 		     ".section .fixup,\"ax\"\n"				\
-		     "3:  movl $-1,%[err]\n"				\
+		     "3:  negl %%eax\n"					\
 		     "    jmp  2b\n"					\
 		     ".previous\n"					\
-		     _ASM_EXTABLE(1b, 3b)				\
-		     : [err] "=r" (err), output				\
+		     _ASM_EXTABLE_FAULT(1b, 3b)				\
+		     : [err] "=a" (err), output				\
 		     : "0"(0), input);					\
 	err;								\
 })
@@ -196,16 +197,20 @@ static inline void fxsave(struct fxregs_state *fx)
 #define XRSTOR		".byte " REX_PREFIX "0x0f,0xae,0x2f"
 #define XRSTORS		".byte " REX_PREFIX "0x0f,0xc7,0x1f"
 
+/*
+ * After this @err contains 0 on success or the negated trap number when
+ * the operation raises an exception. For faults this results in -EFAULT.
+ */
 #define XSTATE_OP(op, st, lmask, hmask, err)				\
 	asm volatile("1:" op "\n\t"					\
 		     "xor %[err], %[err]\n"				\
 		     "2:\n\t"						\
 		     ".pushsection .fixup,\"ax\"\n\t"			\
-		     "3: movl $-2,%[err]\n\t"				\
+		     "3: negl %%eax\n\t"				\
 		     "jmp 2b\n\t"					\
 		     ".popsection\n\t"					\
-		     _ASM_EXTABLE(1b, 3b)				\
-		     : [err] "=r" (err)					\
+		     _ASM_EXTABLE_FAULT(1b, 3b)				\
+		     : [err] "=a" (err)					\
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")
 

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/signal: Split out the direct restore code
  2021-06-23 12:02 ` [patch V4 62/65] x86/fpu/signal: Split out the direct restore code Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     0a6c2e9ec91c96bde1e8ce063180ac6e05e680f7
Gitweb:        https://git.kernel.org/tip/0a6c2e9ec91c96bde1e8ce063180ac6e05e680f7
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:29 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 20:03:44 +02:00

x86/fpu/signal: Split out the direct restore code

Prepare for smarter failure handling of the direct restore.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.493455414@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 112 +++++++++++++++++-----------------
 1 file changed, 58 insertions(+), 54 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a1a7013..aa268d9 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -250,10 +250,8 @@ sanitize_restored_user_xstate(union fpregs_state *state,
 	}
 }
 
-/*
- * Restore the FPU state directly from the userspace signal frame.
- */
-static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
+static int __restore_fpregs_from_user(void __user *buf, u64 xrestore,
+				      bool fx_only)
 {
 	if (use_xsave()) {
 		u64 init_bv = xfeatures_mask_uabi() & ~xrestore;
@@ -274,6 +272,57 @@ static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only
 	}
 }
 
+static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
+{
+	struct fpu *fpu = &current->thread.fpu;
+	int ret;
+
+	fpregs_lock();
+	pagefault_disable();
+	ret = __restore_fpregs_from_user(buf, xrestore, fx_only);
+	pagefault_enable();
+
+	if (unlikely(ret)) {
+		/*
+		 * The above did an FPU restore operation, restricted to
+		 * the user portion of the registers, and failed, but the
+		 * microcode might have modified the FPU registers
+		 * nevertheless.
+		 *
+		 * If the FPU registers do not belong to current, then
+		 * invalidate the FPU register state otherwise the task
+		 * might preempt current and return to user space with
+		 * corrupted FPU registers.
+		 *
+		 * In case current owns the FPU registers then no further
+		 * action is required. The fixup in the slow path will
+		 * handle it correctly.
+		 */
+		if (test_thread_flag(TIF_NEED_FPU_LOAD))
+			__cpu_invalidate_fpregs_state();
+		fpregs_unlock();
+		return ret;
+	}
+
+	/*
+	 * Restore supervisor states: previous context switch etc has done
+	 * XSAVES and saved the supervisor states in the kernel buffer from
+	 * which they can be restored now.
+	 *
+	 * It would be optimal to handle this with a single XRSTORS, but
+	 * this does not work because the rest of the FPU registers have
+	 * been restored from a user buffer directly. The single XRSTORS
+	 * happens below, when the user buffer has been copied to the
+	 * kernel one.
+	 */
+	if (test_thread_flag(TIF_NEED_FPU_LOAD) && xfeatures_mask_supervisor())
+		os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
+
+	fpregs_mark_activate();
+	fpregs_unlock();
+	return 0;
+}
+
 static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 			     bool ia32_fxstate)
 {
@@ -298,61 +347,16 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		user_xfeatures = fx_sw_user.xfeatures;
 	}
 
-	if (!ia32_fxstate) {
+	if (likely(!ia32_fxstate)) {
 		/*
 		 * Attempt to restore the FPU registers directly from user
-		 * memory. For that to succeed, the user access cannot cause
-		 * page faults. If it does, fall back to the slow path below,
-		 * going through the kernel buffer with the enabled pagefault
-		 * handler.
+		 * memory. For that to succeed, the user access cannot cause page
+		 * faults. If it does, fall back to the slow path below, going
+		 * through the kernel buffer with the enabled pagefault handler.
 		 */
-		fpregs_lock();
-		pagefault_disable();
 		ret = restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only);
-		pagefault_enable();
-		if (!ret) {
-
-			/*
-			 * Restore supervisor states: previous context switch
-			 * etc has done XSAVES and saved the supervisor states
-			 * in the kernel buffer from which they can be restored
-			 * now.
-			 *
-			 * We cannot do a single XRSTORS here - which would
-			 * be nice - because the rest of the FPU registers are
-			 * being restored from a user buffer directly. The
-			 * single XRSTORS happens below, when the user buffer
-			 * has been copied to the kernel one.
-			 */
-			if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
-			    xfeatures_mask_supervisor()) {
-				os_xrstor(&fpu->state.xsave,
-					  xfeatures_mask_supervisor());
-			}
-			fpregs_mark_activate();
-			fpregs_unlock();
+		if (likely(!ret))
 			return 0;
-		}
-
-		/*
-		 * The above did an FPU restore operation, restricted to
-		 * the user portion of the registers, and failed, but the
-		 * microcode might have modified the FPU registers
-		 * nevertheless.
-		 *
-		 * If the FPU registers do not belong to current, then
-		 * invalidate the FPU register state otherwise the task might
-		 * preempt current and return to user space with corrupted
-		 * FPU registers.
-		 *
-		 * In case current owns the FPU registers then no further
-		 * action is required. The fixup below will handle it
-		 * correctly.
-		 */
-		if (test_thread_flag(TIF_NEED_FPU_LOAD))
-			__cpu_invalidate_fpregs_state();
-
-		fpregs_unlock();
 	} else {
 		/*
 		 * For 32-bit frames with fxstate, copy the fxstate so it can

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing()
  2021-06-23 12:02 ` [patch V4 61/65] x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing() Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     cdcec1b77001e7f2cd10dccfc6d9b6d5d3f1f3ea
Gitweb:        https://git.kernel.org/tip/cdcec1b77001e7f2cd10dccfc6d9b6d5d3f1f3ea
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:28 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 20:03:15 +02:00

x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing()

Now that user_xfeatures is correctly set when xsave is enabled, remove
the duplicated initialization of components.

Rename the function while at it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.377341297@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 36 ++++++++++++++---------------------
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index d552410..a1a7013 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -251,33 +251,27 @@ sanitize_restored_user_xstate(union fpregs_state *state,
 }
 
 /*
- * Restore the extended state if present. Otherwise, restore the FP/SSE state.
+ * Restore the FPU state directly from the userspace signal frame.
  */
-static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
+static int restore_fpregs_from_user(void __user *buf, u64 xrestore, bool fx_only)
 {
-	u64 init_bv;
-	int r;
-
 	if (use_xsave()) {
-		if (fx_only) {
-			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
+		u64 init_bv = xfeatures_mask_uabi() & ~xrestore;
+		int ret;
 
-			r = fxrstor_from_user_sigframe(buf);
-			if (!r)
-				os_xrstor(&init_fpstate.xsave, init_bv);
-			return r;
-		} else {
-			init_bv = xfeatures_mask_uabi() & ~xbv;
-
-			r = xrstor_from_user_sigframe(buf, xbv);
-			if (!r && unlikely(init_bv))
-				os_xrstor(&init_fpstate.xsave, init_bv);
-			return r;
-		}
+		if (likely(!fx_only))
+			ret = xrstor_from_user_sigframe(buf, xrestore);
+		else
+			ret = fxrstor_from_user_sigframe(buf);
+
+		if (!ret && unlikely(init_bv))
+			os_xrstor(&init_fpstate.xsave, init_bv);
+		return ret;
 	} else if (use_fxsr()) {
 		return fxrstor_from_user_sigframe(buf);
-	} else
+	} else {
 		return frstor_from_user_sigframe(buf);
+	}
 }
 
 static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
@@ -314,7 +308,7 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		 */
 		fpregs_lock();
 		pagefault_disable();
-		ret = copy_user_to_fpregs_zeroing(buf_fx, user_xfeatures, fx_only);
+		ret = restore_fpregs_from_user(buf_fx, user_xfeatures, fx_only);
 		pagefault_enable();
 		if (!ret) {
 

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/signal: Remove the legacy alignment check
  2021-06-23 12:02 ` [patch V4 59/65] x86/fpu/signal: Remove the legacy alignment check Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     9ba589f9cdbd8906465b108bc7ec0fc1519a06d3
Gitweb:        https://git.kernel.org/tip/9ba589f9cdbd8906465b108bc7ec0fc1519a06d3
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:26 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 20:01:55 +02:00

x86/fpu/signal: Remove the legacy alignment check

Checking for the XSTATE buffer being 64-byte aligned, and if not,
deciding just to restore the FXSR state is daft.

If user space provides an unaligned math frame and has the extended state
magic set in the FX software reserved bytes, then it really can keep the
pieces.

If the frame is unaligned and the FX software magic is not set, then
fx_only is already set and the restore will use fxrstor.

Remove it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.184149902@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 42e85c3..8a327c0 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -306,9 +306,6 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		}
 	}
 
-	if ((unsigned long)buf_fx % 64)
-		fx_only = 1;
-
 	if (!ia32_fxstate) {
 		/*
 		 * Attempt to restore the FPU registers directly from user

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/signal: Sanitize the xstate check on sigframe
  2021-06-23 12:02 ` [patch V4 60/65] x86/fpu/signal: Sanitize the xstate check on sigframe Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     1258a8c896044564514c1b53795ba3033b1e9fd6
Gitweb:        https://git.kernel.org/tip/1258a8c896044564514c1b53795ba3033b1e9fd6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:27 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 20:02:46 +02:00

x86/fpu/signal: Sanitize the xstate check on sigframe

Utilize the check for the extended state magic in the FX software reserved
bytes and set the parameters for restoring fx_only in the relevant members
of fw_sw_user.

This allows further cleanups on top because the data is consistent.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.277738268@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 70 ++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 8a327c0..d552410 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -15,29 +15,30 @@
 #include <asm/sigframe.h>
 #include <asm/trace/fpu.h>
 
-static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
+static struct _fpx_sw_bytes fx_sw_reserved __ro_after_init;
+static struct _fpx_sw_bytes fx_sw_reserved_ia32 __ro_after_init;
 
 /*
  * Check for the presence of extended state information in the
  * user fpstate pointer in the sigcontext.
  */
-static inline int check_for_xstate(struct fxregs_state __user *buf,
-				   void __user *fpstate,
-				   struct _fpx_sw_bytes *fx_sw)
+static inline int check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
+					   struct _fpx_sw_bytes *fx_sw)
 {
 	int min_xstate_size = sizeof(struct fxregs_state) +
 			      sizeof(struct xstate_header);
+	void __user *fpstate = fxbuf;
 	unsigned int magic2;
 
-	if (__copy_from_user(fx_sw, &buf->sw_reserved[0], sizeof(*fx_sw)))
-		return -1;
+	if (__copy_from_user(fx_sw, &fxbuf->sw_reserved[0], sizeof(*fx_sw)))
+		return -EFAULT;
 
 	/* Check for the first magic field and other error scenarios. */
 	if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
 	    fx_sw->xstate_size < min_xstate_size ||
 	    fx_sw->xstate_size > fpu_user_xstate_size ||
 	    fx_sw->xstate_size > fx_sw->extended_size)
-		return -1;
+		goto setfx;
 
 	/*
 	 * Check for the presence of second magic word at the end of memory
@@ -45,10 +46,18 @@ static inline int check_for_xstate(struct fxregs_state __user *buf,
 	 * fpstate layout with out copying the extended state information
 	 * in the memory layout.
 	 */
-	if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size))
-	    || magic2 != FP_XSTATE_MAGIC2)
-		return -1;
+	if (__get_user(magic2, (__u32 __user *)(fpstate + fx_sw->xstate_size)))
+		return -EFAULT;
 
+	if (likely(magic2 == FP_XSTATE_MAGIC2))
+		return 0;
+setfx:
+	trace_x86_fpu_xstate_check_failed(&current->thread.fpu);
+
+	/* Set the parameters for fx only state */
+	fx_sw->magic1 = 0;
+	fx_sw->xstate_size = sizeof(struct fxregs_state);
+	fx_sw->xfeatures = XFEATURE_MASK_FPSSE;
 	return 0;
 }
 
@@ -213,21 +222,15 @@ retry:
 
 static inline void
 sanitize_restored_user_xstate(union fpregs_state *state,
-			      struct user_i387_ia32_struct *ia32_env,
-			      u64 user_xfeatures, int fx_only)
+			      struct user_i387_ia32_struct *ia32_env, u64 mask)
 {
 	struct xregs_state *xsave = &state->xsave;
 	struct xstate_header *header = &xsave->header;
 
 	if (use_xsave()) {
 		/*
-		 * Clear all feature bits which are not set in
-		 * user_xfeatures and clear all extended features
-		 * for fx_only mode.
-		 */
-		u64 mask = fx_only ? XFEATURE_MASK_FPSSE : user_xfeatures;
-
-		/*
+		 * Clear all feature bits which are not set in mask.
+		 *
 		 * Supervisor state has to be preserved. The sigframe
 		 * restore can only modify user features, i.e. @mask
 		 * cannot contain them.
@@ -286,24 +289,19 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 	struct fpu *fpu = &tsk->thread.fpu;
 	struct user_i387_ia32_struct env;
 	u64 user_xfeatures = 0;
-	int fx_only = 0;
+	bool fx_only = false;
 	int ret = 0;
 
 	if (use_xsave()) {
 		struct _fpx_sw_bytes fx_sw_user;
-		if (unlikely(check_for_xstate(buf_fx, buf_fx, &fx_sw_user))) {
-			/*
-			 * Couldn't find the extended state information in the
-			 * memory layout. Restore just the FP/SSE and init all
-			 * the other extended state.
-			 */
-			state_size = sizeof(struct fxregs_state);
-			fx_only = 1;
-			trace_x86_fpu_xstate_check_failed(fpu);
-		} else {
-			state_size = fx_sw_user.xstate_size;
-			user_xfeatures = fx_sw_user.xfeatures;
-		}
+
+		ret = check_xstate_in_sigframe(buf_fx, &fx_sw_user);
+		if (unlikely(ret))
+			return ret;
+
+		fx_only = !fx_sw_user.magic1;
+		state_size = fx_sw_user.xstate_size;
+		user_xfeatures = fx_sw_user.xfeatures;
 	}
 
 	if (!ia32_fxstate) {
@@ -403,8 +401,7 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		if (ret)
 			return ret;
 
-		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
-					      fx_only);
+		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures);
 
 		fpregs_lock();
 		if (unlikely(init_bv))
@@ -422,8 +419,7 @@ static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
 		if (ret)
 			return -EFAULT;
 
-		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
-					      fx_only);
+		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures);
 
 		fpregs_lock();
 		if (use_xsave()) {

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/signal: Move initial checks into fpu__restore_sig()
  2021-06-23 12:02 ` [patch V4 58/65] x86/fpu/signal: Move initial checks into fpu__sig_restore() Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     99a5901951b70251965b0d1542d4a8c616842a99
Gitweb:        https://git.kernel.org/tip/99a5901951b70251965b0d1542d4a8c616842a99
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:25 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:59:57 +02:00

x86/fpu/signal: Move initial checks into fpu__restore_sig()

__fpu__restore_sig() is convoluted and some of the basic checks can
trivially be done in the calling function as well as the final error
handling of clearing user state.

 [ bp: Fixup typos. ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121457.086336154@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 76 ++++++++++++++++++-----------------
 1 file changed, 41 insertions(+), 35 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a42bc9d..42e85c3 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -277,11 +277,11 @@ static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 		return frstor_from_user_sigframe(buf);
 }
 
-static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
+static int __fpu_restore_sig(void __user *buf, void __user *buf_fx,
+			     bool ia32_fxstate)
 {
 	struct user_i387_ia32_struct *envp = NULL;
 	int state_size = fpu_kernel_xstate_size;
-	int ia32_fxstate = (buf != buf_fx);
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
 	struct user_i387_ia32_struct env;
@@ -289,26 +289,6 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	int fx_only = 0;
 	int ret = 0;
 
-	ia32_fxstate &= (IS_ENABLED(CONFIG_X86_32) ||
-			 IS_ENABLED(CONFIG_IA32_EMULATION));
-
-	if (!buf) {
-		fpu__clear_user_states(fpu);
-		return 0;
-	}
-
-	if (!access_ok(buf, size)) {
-		ret = -EACCES;
-		goto out;
-	}
-
-	if (!static_cpu_has(X86_FEATURE_FPU)) {
-		ret = fpregs_soft_set(current, NULL, 0,
-				      sizeof(struct user_i387_ia32_struct),
-				      NULL, buf);
-		goto out;
-	}
-
 	if (use_xsave()) {
 		struct _fpx_sw_bytes fx_sw_user;
 		if (unlikely(check_for_xstate(buf_fx, buf_fx, &fx_sw_user))) {
@@ -391,7 +371,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		 */
 		ret = __copy_from_user(&env, buf, sizeof(env));
 		if (ret)
-			goto out;
+			return ret;
 		envp = &env;
 	}
 
@@ -424,7 +404,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 
 		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
-			goto out;
+			return ret;
 
 		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
 					      fx_only);
@@ -442,10 +422,8 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 
 	} else if (use_fxsr()) {
 		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
-		if (ret) {
-			ret = -EFAULT;
-			goto out;
-		}
+		if (ret)
+			return -EFAULT;
 
 		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
 					      fx_only);
@@ -462,7 +440,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	} else {
 		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)
-			goto out;
+			return ret;
 
 		fpregs_lock();
 		ret = frstor_safe(&fpu->state.fsave);
@@ -472,10 +450,6 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	else
 		fpregs_deactivate(fpu);
 	fpregs_unlock();
-
-out:
-	if (ret)
-		fpu__clear_user_states(fpu);
 	return ret;
 }
 
@@ -490,15 +464,47 @@ static inline int xstate_sigframe_size(void)
  */
 int fpu__restore_sig(void __user *buf, int ia32_frame)
 {
+	unsigned int size = xstate_sigframe_size();
+	struct fpu *fpu = &current->thread.fpu;
 	void __user *buf_fx = buf;
-	int size = xstate_sigframe_size();
+	bool ia32_fxstate = false;
+	int ret;
+
+	if (unlikely(!buf)) {
+		fpu__clear_user_states(fpu);
+		return 0;
+	}
 
+	ia32_frame &= (IS_ENABLED(CONFIG_X86_32) ||
+		       IS_ENABLED(CONFIG_IA32_EMULATION));
+
+	/*
+	 * Only FXSR enabled systems need the FX state quirk.
+	 * FRSTOR does not need it and can use the fast path.
+	 */
 	if (ia32_frame && use_fxsr()) {
 		buf_fx = buf + sizeof(struct fregs_state);
 		size += sizeof(struct fregs_state);
+		ia32_fxstate = true;
 	}
 
-	return __fpu__restore_sig(buf, buf_fx, size);
+	if (!access_ok(buf, size)) {
+		ret = -EACCES;
+		goto out;
+	}
+
+	if (!IS_ENABLED(CONFIG_X86_64) && !cpu_feature_enabled(X86_FEATURE_FPU)) {
+		ret = fpregs_soft_set(current, NULL, 0,
+				      sizeof(struct user_i387_ia32_struct),
+				      NULL, buf);
+	} else {
+		ret = __fpu_restore_sig(buf, buf_fx, ia32_fxstate);
+	}
+
+out:
+	if (unlikely(ret))
+		fpu__clear_user_states(fpu);
+	return ret;
 }
 
 unsigned long

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Mark init_fpstate __ro_after_init
  2021-06-23 12:02 ` [patch V4 57/65] x86/fpu: Mark init_fpstate __ro_after_init Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     bf68a7d98922e1665019b8bf0c4791500837c857
Gitweb:        https://git.kernel.org/tip/bf68a7d98922e1665019b8bf0c4791500837c857
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:24 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:58:45 +02:00

x86/fpu: Mark init_fpstate __ro_after_init

Nothing has to write into that state after init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.992342060@linutronix.de
---
 arch/x86/kernel/fpu/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 5295cba..7ada7bd 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -23,7 +23,7 @@
  * Represents the initial FPU state. It's mostly (but not completely) zeroes,
  * depending on the FPU hardware format:
  */
-union fpregs_state init_fpstate __read_mostly;
+union fpregs_state init_fpstate __ro_after_init;
 
 /*
  * Track whether the kernel is using the FPU state

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/pkru: Remove xstate fiddling from write_pkru()
  2021-06-23 12:02 ` [patch V4 56/65] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     72a6c08c44e4460e39315ca828f60b8d5afd6b19
Gitweb:        https://git.kernel.org/tip/72a6c08c44e4460e39315ca828f60b8d5afd6b19
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:23 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:55:51 +02:00

x86/pkru: Remove xstate fiddling from write_pkru()

The PKRU value of a task is stored in task->thread.pkru when the task is
scheduled out. PKRU is restored on schedule in from there. So keeping the
XSAVE buffer up to date is a pointless exercise.

Remove the xstate fiddling and cleanup all related functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.897372712@linutronix.de
---
 arch/x86/include/asm/pkru.h          | 17 ++++-------------
 arch/x86/include/asm/special_insns.h | 14 +-------------
 arch/x86/kvm/x86.c                   |  4 ++--
 3 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/pkru.h b/arch/x86/include/asm/pkru.h
index 7e45509..ccc539f 100644
--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -41,23 +41,14 @@ static inline u32 read_pkru(void)
 
 static inline void write_pkru(u32 pkru)
 {
-	struct pkru_state *pk;
-
 	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
-
-	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
-
 	/*
-	 * The PKRU value in xstate needs to be in sync with the value that is
-	 * written to the CPU. The FPU restore on return to userland would
-	 * otherwise load the previous value again.
+	 * WRPKRU is relatively expensive compared to RDPKRU.
+	 * Avoid WRPKRU when it would not change the value.
 	 */
-	fpregs_lock();
-	if (pk)
-		pk->pkru = pkru;
-	__write_pkru(pkru);
-	fpregs_unlock();
+	if (pkru != rdpkru())
+		wrpkru(pkru);
 }
 
 static inline void pkru_write_default(void)
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 2acd6cb..f3fbb84 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -104,25 +104,13 @@ static inline void wrpkru(u32 pkru)
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
 
-static inline void __write_pkru(u32 pkru)
-{
-	/*
-	 * WRPKRU is relatively expensive compared to RDPKRU.
-	 * Avoid WRPKRU when it would not change the value.
-	 */
-	if (pkru == rdpkru())
-		return;
-
-	wrpkru(pkru);
-}
-
 #else
 static inline u32 rdpkru(void)
 {
 	return 0;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void wrpkru(u32 pkru)
 {
 }
 #endif
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 07f7888..8ee7add 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -943,7 +943,7 @@ void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu)
 	    (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
 	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
 	    vcpu->arch.pkru != vcpu->arch.host_pkru)
-		__write_pkru(vcpu->arch.pkru);
+		write_pkru(vcpu->arch.pkru);
 }
 EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
 
@@ -957,7 +957,7 @@ void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
 	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
 		vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vcpu->arch.host_pkru)
-			__write_pkru(vcpu->arch.host_pkru);
+			write_pkru(vcpu->arch.host_pkru);
 	}
 
 	if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Don't store PKRU in xstate in fpu_reset_fpstate()
  2021-06-23 12:02 ` [patch V4 55/65] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     0e8c54f6b2c8b1037cef9276e451522ee90ed969
Gitweb:        https://git.kernel.org/tip/0e8c54f6b2c8b1037cef9276e451522ee90ed969
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:22 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:55:16 +02:00

x86/fpu: Don't store PKRU in xstate in fpu_reset_fpstate()

PKRU for a task is stored in task->thread.pkru when the task is scheduled
out. For 'current' the authoritative source of PKRU is the hardware.

fpu_reset_fpstate() has two callers:

  1) fpu__clear_user_states() for !FPU systems. For those PKRU is irrelevant

  2) fpu_flush_thread() which is invoked from flush_thread(). flush_thread()
     resets the hardware to the kernel restrictive default value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.802850233@linutronix.de
---
 arch/x86/kernel/fpu/core.c | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 470576c..5295cba 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -337,23 +337,6 @@ static inline unsigned int init_fpstate_copy_size(void)
 	return sizeof(init_fpstate.xsave);
 }
 
-/* Temporary workaround. Will be removed once PKRU and XSTATE are untangled. */
-static inline void pkru_set_default_in_xstate(struct xregs_state *xsave)
-{
-	struct pkru_state *pk;
-
-	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
-		return;
-	/*
-	 * Force XFEATURE_PKRU to be set in the header otherwise
-	 * get_xsave_addr() does not work and it also needs to be set to
-	 * make XRSTOR(S) load it.
-	 */
-	xsave->header.xfeatures |= XFEATURE_MASK_PKRU;
-	pk = get_xsave_addr(xsave, XFEATURE_PKRU);
-	pk->pkru = pkru_get_init_value();
-}
-
 /*
  * Reset current->fpu memory state to the init values.
  */
@@ -371,9 +354,12 @@ static void fpu_reset_fpstate(void)
 	 *
 	 * Do not use fpstate_init() here. Just copy init_fpstate which has
 	 * the correct content already except for PKRU.
+	 *
+	 * PKRU handling does not rely on the xstate when restoring for
+	 * user space as PKRU is eagerly written in switch_to() and
+	 * flush_thread().
 	 */
 	memcpy(&fpu->state, &init_fpstate, init_fpstate_copy_size());
-	pkru_set_default_in_xstate(&fpu->state.xsave);
 	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpregs_unlock();
 }

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Remove PKRU handling from switch_fpu_finish()
  2021-06-23 12:02 ` [patch V4 54/65] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     954436989cc550dd91aab98363240c9c0a4b7e23
Gitweb:        https://git.kernel.org/tip/954436989cc550dd91aab98363240c9c0a4b7e23
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:21 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:54:49 +02:00

x86/fpu: Remove PKRU handling from switch_fpu_finish()

PKRU is already updated and the xstate is not longer the proper source
of information.

 [ bp: Use cpu_feature_enabled() ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.708180184@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 34 +++-------------------------
 1 file changed, 4 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 2a484f5..528a868 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -523,39 +523,13 @@ static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 
 /*
- * Load PKRU from the FPU context if available. Delay loading of the
- * complete FPU state until the return to userland.
+ * Delay loading of the complete FPU state until the return to userland.
+ * PKRU is handled separately.
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu)
 {
-	u32 pkru_val = init_pkru_value;
-	struct pkru_state *pk;
-
-	if (!static_cpu_has(X86_FEATURE_FPU))
-		return;
-
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-
-	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
-		return;
-
-	/*
-	 * PKRU state is switched eagerly because it needs to be valid before we
-	 * return to userland e.g. for a copy_to_user() operation.
-	 */
-	if (!(current->flags & PF_KTHREAD)) {
-		/*
-		 * If the PKRU bit in xsave.header.xfeatures is not set,
-		 * then the PKRU component was in init state, which means
-		 * XRSTOR will set PKRU to 0. If the bit is not set then
-		 * get_xsave_addr() will return NULL because the PKRU value
-		 * in memory is not valid. This means pkru_val has to be
-		 * set to 0 and not to init_pkru_value.
-		 */
-		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
-		pkru_val = pk ? pk->pkru : 0;
-	}
-	__write_pkru(pkru_val);
+	if (cpu_feature_enabled(X86_FEATURE_FPU))
+		set_thread_flag(TIF_NEED_FPU_LOAD);
 }
 
 #endif /* _ASM_X86_FPU_INTERNAL_H */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations
  2021-06-23 12:02 ` [patch V4 53/65] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     30a304a138738d71a09c730ca8044e9662de0dbf
Gitweb:        https://git.kernel.org/tip/30a304a138738d71a09c730ca8044e9662de0dbf
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:20 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:47:35 +02:00

x86/fpu: Mask PKRU from kernel XRSTOR[S] operations

As the PKRU state is managed separately restoring it from the xstate
buffer would be counterproductive as it might either restore a stale
value or reinit the PKRU state to 0.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.606745195@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h |  4 ++--
 arch/x86/include/asm/fpu/xstate.h   | 10 ++++++++++
 arch/x86/kernel/fpu/xstate.c        |  1 +
 arch/x86/mm/extable.c               |  2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 5217743..2a484f5 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -259,7 +259,7 @@ static inline void fxsave(struct fxregs_state *fx)
  */
 static inline void os_xrstor_booting(struct xregs_state *xstate)
 {
-	u64 mask = -1;
+	u64 mask = xfeatures_mask_fpstate();
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
@@ -388,7 +388,7 @@ extern void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
 
 static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
-	__restore_fpregs_from_fpstate(fpstate, -1);
+	__restore_fpregs_from_fpstate(fpstate, xfeatures_mask_fpstate());
 }
 
 extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 4ff4a00..109dfcc 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -111,6 +111,16 @@ static inline u64 xfeatures_mask_restore_user(void)
 	return xfeatures_mask_all & XFEATURE_MASK_USER_RESTORE;
 }
 
+/*
+ * Like xfeatures_mask_restore_user() but additionally restors the
+ * supported supervisor states.
+ */
+static inline u64 xfeatures_mask_fpstate(void)
+{
+	return xfeatures_mask_all & \
+		(XFEATURE_MASK_USER_RESTORE | XFEATURE_MASK_SUPERVISOR_SUPPORTED);
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 9fd124a..21a10a6 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -60,6 +60,7 @@ static short xsave_cpuid_features[] __initdata = {
  * XSAVE buffer, both supervisor and user xstates.
  */
 u64 xfeatures_mask_all __ro_after_init;
+EXPORT_SYMBOL_GPL(xfeatures_mask_all);
 
 static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 2c5cccd..e1664e9 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -65,7 +65,7 @@ __visible bool ex_handler_fprestore(const struct exception_table_entry *fixup,
 	WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.",
 		  (void *)instruction_pointer(regs));
 
-	__restore_fpregs_from_fpstate(&init_fpstate, -1);
+	__restore_fpregs_from_fpstate(&init_fpstate, xfeatures_mask_fpstate());
 	return true;
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Hook up PKRU into ptrace()
  2021-06-23 12:02 ` [patch V4 52/65] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
@ 2021-06-23 22:08   ` tip-bot2 for Dave Hansen
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Dave Hansen @ 2021-06-23 22:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Hansen, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     e84ba47e313dbc097bf859bb6e4f9219883d5f78
Gitweb:        https://git.kernel.org/tip/e84ba47e313dbc097bf859bb6e4f9219883d5f78
Author:        Dave Hansen <dave.hansen@linux.intel.com>
AuthorDate:    Wed, 23 Jun 2021 14:02:19 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:44:24 +02:00

x86/fpu: Hook up PKRU into ptrace()

One nice thing about having PKRU be XSAVE-managed is that it gets naturally
exposed into the XSAVE-using ABIs.  Now that XSAVE will not be used to
manage PKRU, these ABIs need to be manually enabled to deal with PKRU.

ptrace() uses copy_uabi_xstate_to_kernel() to collect the tracee's
XSTATE. As PKRU is not in the task's XSTATE buffer, use task->thread.pkru
for filling in up the ptrace buffer.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.508770763@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h |  2 +-
 arch/x86/kernel/fpu/regset.c      | 10 ++++------
 arch/x86/kernel/fpu/xstate.c      | 25 ++++++++++++++++++-------
 3 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 6a0aaaf..4ff4a00 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -139,7 +139,7 @@ enum xstate_copy_mode {
 };
 
 struct membuf;
-void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
+void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
 			     enum xstate_copy_mode mode);
 
 #endif
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 4575796..66ed317 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -78,7 +78,7 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 				    sizeof(fpu->state.fxsave));
 	}
 
-	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_FX);
+	copy_xstate_to_uabi_buf(to, target, XSTATE_COPY_FX);
 	return 0;
 }
 
@@ -126,14 +126,12 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
 int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
-	struct fpu *fpu = &target->thread.fpu;
-
 	if (!cpu_feature_enabled(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	sync_fpstate(fpu);
+	sync_fpstate(&target->thread.fpu);
 
-	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
+	copy_xstate_to_uabi_buf(to, target, XSTATE_COPY_XSAVE);
 	return 0;
 }
 
@@ -336,7 +334,7 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf mb = { .p = &fxsave, .left = sizeof(fxsave) };
 
 		/* Handle init state optimized xstate correctly */
-		copy_xstate_to_uabi_buf(mb, &fpu->state.xsave, XSTATE_COPY_FP);
+		copy_xstate_to_uabi_buf(mb, target, XSTATE_COPY_FP);
 		fx = &fxsave;
 	} else {
 		fx = &fpu->state.fxsave;
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c513596..9fd124a 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -962,7 +962,7 @@ static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
 /**
  * copy_xstate_to_uabi_buf - Copy kernel saved xstate to a UABI buffer
  * @to:		membuf descriptor
- * @xsave:	The kernel xstate buffer to copy from
+ * @tsk:	The task from which to copy the saved xstate
  * @copy_mode:	The requested copy mode
  *
  * Converts from kernel XSAVE or XSAVES compacted format to UABI conforming
@@ -971,10 +971,11 @@ static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
  *
  * It supports partial copy but @to.pos always starts from zero.
  */
-void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
+void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
 			     enum xstate_copy_mode copy_mode)
 {
 	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
+	struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
 	struct xregs_state *xinit = &init_fpstate.xsave;
 	struct xstate_header header;
 	unsigned int zerofrom;
@@ -1048,11 +1049,21 @@ void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
 		if (zerofrom < xstate_offsets[i])
 			membuf_zero(&to, xstate_offsets[i] - zerofrom);
 
-		copy_feature(header.xfeatures & BIT_ULL(i), &to,
-			     __raw_xsave_addr(xsave, i),
-			     __raw_xsave_addr(xinit, i),
-			     xstate_sizes[i]);
-
+		if (i == XFEATURE_PKRU) {
+			struct pkru_state pkru = {0};
+			/*
+			 * PKRU is not necessarily up to date in the
+			 * thread's XSAVE buffer.  Fill this part from the
+			 * per-thread storage.
+			 */
+			pkru.pkru = tsk->thread.pkru;
+			membuf_write(&to, &pkru, sizeof(pkru));
+		} else {
+			copy_feature(header.xfeatures & BIT_ULL(i), &to,
+				     __raw_xsave_addr(xsave, i),
+				     __raw_xsave_addr(xinit, i),
+				     xstate_sizes[i]);
+		}
 		/*
 		 * Keep track of the last copied state in the non-compacted
 		 * target buffer for gap zeroing.

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Add PKRU storage outside of task XSAVE buffer
  2021-06-23 12:02 ` [patch V4 51/65] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Dave Hansen
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Dave Hansen @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Hansen, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     9782a712eb971ce483442076e79eb1d8d608646e
Gitweb:        https://git.kernel.org/tip/9782a712eb971ce483442076e79eb1d8d608646e
Author:        Dave Hansen <dave.hansen@linux.intel.com>
AuthorDate:    Wed, 23 Jun 2021 14:02:18 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:37:45 +02:00

x86/fpu: Add PKRU storage outside of task XSAVE buffer

PKRU is currently partly XSAVE-managed and partly not. It has space
in the task XSAVE buffer and is context-switched by XSAVE/XRSTOR.
However, it is switched more eagerly than FPU because there may be a
need for PKRU to be up-to-date for things like copy_to/from_user() since
PKRU affects user-permission memory accesses, not just accesses from
userspace itself.

This leaves PKRU in a very odd position. XSAVE brings very little value
to the table for how Linux uses PKRU except for signal related XSTATE
handling.

Prepare to move PKRU away from being XSAVE-managed. Allocate space in
the thread_struct for it and save/restore it in the context-switch path
separately from the XSAVE-managed features. task->thread_struct.pkru
is only valid when the task is scheduled out. For the current task the
authoritative source is the hardware, i.e. it has to be retrieved via
rdpkru().

Leave the XSAVE code in place for now to ensure bisectability.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.399107624@linutronix.de
---
 arch/x86/include/asm/processor.h |  9 +++++++++
 arch/x86/kernel/process.c        |  7 +++++++
 arch/x86/kernel/process_64.c     | 25 +++++++++++++++++++++++++
 3 files changed, 41 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 556b2b1..91946fc 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -518,6 +518,15 @@ struct thread_struct {
 
 	unsigned int		sig_on_uaccess_err:1;
 
+	/*
+	 * Protection Keys Register for Userspace.  Loaded immediately on
+	 * context switch. Store it in thread_struct to avoid a lookup in
+	 * the tasks's FPU xstate buffer. This value is only valid when a
+	 * task is scheduled out. For 'current' the authoritative source of
+	 * PKRU is the hardware itself.
+	 */
+	u32			pkru;
+
 	/* Floating point and extended processor state */
 	struct fpu		fpu;
 	/*
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index de942b0..fa6c8fa 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -156,11 +156,18 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
 
 	/* Kernel thread ? */
 	if (unlikely(p->flags & PF_KTHREAD)) {
+		p->thread.pkru = pkru_get_init_value();
 		memset(childregs, 0, sizeof(struct pt_regs));
 		kthread_frame_init(frame, sp, arg);
 		return 0;
 	}
 
+	/*
+	 * Clone current's PKRU value from hardware. tsk->thread.pkru
+	 * is only valid when scheduled out.
+	 */
+	p->thread.pkru = read_pkru();
+
 	frame->bx = 0;
 	*childregs = *current_pt_regs();
 	childregs->ax = 0;
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 40a9638..ec0d836 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -340,6 +340,29 @@ static __always_inline void load_seg_legacy(unsigned short prev_index,
 	}
 }
 
+/*
+ * Store prev's PKRU value and load next's PKRU value if they differ. PKRU
+ * is not XSTATE managed on context switch because that would require a
+ * lookup in the task's FPU xsave buffer and require to keep that updated
+ * in various places.
+ */
+static __always_inline void x86_pkru_load(struct thread_struct *prev,
+					  struct thread_struct *next)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	/* Stash the prev task's value: */
+	prev->pkru = rdpkru();
+
+	/*
+	 * PKRU writes are slightly expensive.  Avoid them when not
+	 * strictly necessary:
+	 */
+	if (prev->pkru != next->pkru)
+		wrpkru(next->pkru);
+}
+
 static __always_inline void x86_fsgsbase_load(struct thread_struct *prev,
 					      struct thread_struct *next)
 {
@@ -589,6 +612,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	x86_fsgsbase_load(prev, next);
 
+	x86_pkru_load(prev, next);
+
 	/*
 	 * Switch the PDA and FPU contexts.
 	 */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
  2021-06-23 12:02 ` [patch V4 50/65] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     2ebe81c6d800576e1213f9d7cf0068017ae610c1
Gitweb:        https://git.kernel.org/tip/2ebe81c6d800576e1213f9d7cf0068017ae610c1
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:17 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:33:32 +02:00

x86/fpu: Dont restore PKRU in fpregs_restore_userspace()

switch_to() and flush_thread() write the task's PKRU value eagerly so
the PKRU value of current is always valid in the hardware.

That means there is no point in restoring PKRU on exit to user or when
reactivating the task's FPU registers in the signal frame setup path.

This allows to remove all the xstate buffer updates with PKRU values once
the PKRU state is stored in thread struct while a task is scheduled out.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.303919033@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 16 +++++++++++++++-
 arch/x86/include/asm/fpu/xstate.h   | 19 +++++++++++++++++++
 arch/x86/kernel/fpu/core.c          |  2 +-
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index bcfb636..5217743 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -457,7 +457,21 @@ static inline void fpregs_restore_userregs(void)
 		return;
 
 	if (!fpregs_state_valid(fpu, cpu)) {
-		restore_fpregs_from_fpstate(&fpu->state);
+		u64 mask;
+
+		/*
+		 * This restores _all_ xstate which has not been
+		 * established yet.
+		 *
+		 * If PKRU is enabled, then the PKRU value is already
+		 * correct because it was either set in switch_to() or in
+		 * flush_thread(). So it is excluded because it might be
+		 * not up to date in current->thread.fpu.xsave state.
+		 */
+		mask = xfeatures_mask_restore_user() |
+			xfeatures_mask_supervisor();
+		__restore_fpregs_from_fpstate(&fpu->state, mask);
+
 		fpregs_activate(fpu);
 		fpu->last_cpu = cpu;
 	}
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index af9ea13..6a0aaaf 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -35,6 +35,14 @@
 				      XFEATURE_MASK_BNDREGS | \
 				      XFEATURE_MASK_BNDCSR)
 
+/*
+ * Features which are restored when returning to user space.
+ * PKRU is not restored on return to user space because PKRU
+ * is switched eagerly in switch_to() and flush_thread()
+ */
+#define XFEATURE_MASK_USER_RESTORE	\
+	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
 
@@ -92,6 +100,17 @@ static inline u64 xfeatures_mask_uabi(void)
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
 
+/*
+ * The xfeatures which are restored by the kernel when returning to user
+ * mode. This is not necessarily the same as xfeatures_mask_uabi() as the
+ * kernel does not manage all XCR0 enabled features via xsave/xrstor as
+ * some of them have to be switched eagerly on context switch and exec().
+ */
+static inline u64 xfeatures_mask_restore_user(void)
+{
+	return xfeatures_mask_all & XFEATURE_MASK_USER_RESTORE;
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1243738..470576c 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -404,7 +404,7 @@ void fpu__clear_user_states(struct fpu *fpu)
 	}
 
 	/* Reset user states in registers. */
-	restore_fpregs_from_init_fpstate(xfeatures_mask_uabi());
+	restore_fpregs_from_init_fpstate(xfeatures_mask_restore_user());
 
 	/*
 	 * Now all FPU registers have their desired values.  Inform the FPU

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()
  2021-06-23 12:02 ` [patch V4 49/65] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     65e952102122bf89f0e4f1bebf8664e32587aaed
Gitweb:        https://git.kernel.org/tip/65e952102122bf89f0e4f1bebf8664e32587aaed
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:16 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:29:52 +02:00

x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()

Rename it so it's clear that this is about user ABI features which can
differ from the feature set which the kernel saves and restores because the
kernel handles e.g. PKRU differently. But the user ABI (ptrace, signal
frame) expects it to be there.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.211585137@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h |  7 ++++++-
 arch/x86/include/asm/fpu/xstate.h   |  6 +++++-
 arch/x86/kernel/fpu/core.c          |  2 +-
 arch/x86/kernel/fpu/signal.c        | 10 +++++-----
 arch/x86/kernel/fpu/xstate.c        | 18 +++++++++---------
 5 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 63d9796..bcfb636 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -324,7 +324,12 @@ static inline void os_xrstor(struct xregs_state *xstate, u64 mask)
  */
 static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
-	u64 mask = xfeatures_mask_user();
+	/*
+	 * Include the features which are not xsaved/rstored by the kernel
+	 * internally, e.g. PKRU. That's user space ABI and also required
+	 * to allow the signal handler to modify PKRU.
+	 */
+	u64 mask = xfeatures_mask_uabi();
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 5764cbe..af9ea13 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -83,7 +83,11 @@ static inline u64 xfeatures_mask_supervisor(void)
 	return xfeatures_mask_all & XFEATURE_MASK_SUPERVISOR_SUPPORTED;
 }
 
-static inline u64 xfeatures_mask_user(void)
+/*
+ * The xfeatures which are enabled in XCR0 and expected to be in ptrace
+ * buffers and signal frames.
+ */
+static inline u64 xfeatures_mask_uabi(void)
 {
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index afd0dee..1243738 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -404,7 +404,7 @@ void fpu__clear_user_states(struct fpu *fpu)
 	}
 
 	/* Reset user states in registers. */
-	restore_fpregs_from_init_fpstate(xfeatures_mask_user());
+	restore_fpregs_from_init_fpstate(xfeatures_mask_uabi());
 
 	/*
 	 * Now all FPU registers have their desired values.  Inform the FPU
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index b12665c..a42bc9d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -257,14 +257,14 @@ static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 
 	if (use_xsave()) {
 		if (fx_only) {
-			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
 
 			r = fxrstor_from_user_sigframe(buf);
 			if (!r)
 				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;
 		} else {
-			init_bv = xfeatures_mask_user() & ~xbv;
+			init_bv = xfeatures_mask_uabi() & ~xbv;
 
 			r = xrstor_from_user_sigframe(buf, xbv);
 			if (!r && unlikely(init_bv))
@@ -420,7 +420,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	fpregs_unlock();
 
 	if (use_xsave() && !fx_only) {
-		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
+		u64 init_bv = xfeatures_mask_uabi() & ~user_xfeatures;
 
 		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
@@ -454,7 +454,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		if (use_xsave()) {
 			u64 init_bv;
 
-			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
 			os_xrstor(&init_fpstate.xsave, init_bv);
 		}
 
@@ -549,7 +549,7 @@ void fpu__init_prepare_fx_sw_frame(void)
 
 	fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
 	fx_sw_reserved.extended_size = size;
-	fx_sw_reserved.xfeatures = xfeatures_mask_user();
+	fx_sw_reserved.xfeatures = xfeatures_mask_uabi();
 	fx_sw_reserved.xstate_size = fpu_user_xstate_size;
 
 	if (IS_ENABLED(CONFIG_IA32_EMULATION) ||
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index bc71609..c513596 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -144,7 +144,7 @@ void fpu__init_cpu_xstate(void)
 	 * managed by XSAVE{C, OPT, S} and XRSTOR{S}.  Only XSAVE user
 	 * states can be set here.
 	 */
-	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_uabi());
 
 	/*
 	 * MSR_IA32_XSS sets supervisor states managed by XSAVES.
@@ -453,7 +453,7 @@ int xfeature_size(int xfeature_nr)
 static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or supervisor features may be set */
-	if (hdr->xfeatures & ~xfeatures_mask_user())
+	if (hdr->xfeatures & ~xfeatures_mask_uabi())
 		return -EINVAL;
 
 	/* Userspace must use the uncompacted format */
@@ -756,7 +756,7 @@ void __init fpu__init_system_xstate(void)
 	cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
 	xfeatures_mask_all |= ecx + ((u64)edx << 32);
 
-	if ((xfeatures_mask_user() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
+	if ((xfeatures_mask_uabi() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
 		/*
 		 * This indicates that something really unexpected happened
 		 * with the enumeration.  Disable XSAVE and try to continue
@@ -791,7 +791,7 @@ void __init fpu__init_system_xstate(void)
 	 * Update info used for ptrace frames; use standard-format size and no
 	 * supervisor xstates:
 	 */
-	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_user());
+	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_uabi());
 
 	fpu__init_prepare_fx_sw_frame();
 	setup_init_fpu_buf();
@@ -828,14 +828,14 @@ void fpu__resume_cpu(void)
 	/*
 	 * Restore XCR0 on xsave capable CPUs:
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE))
-		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+	if (cpu_feature_enabled(X86_FEATURE_XSAVE))
+		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_uabi());
 
 	/*
 	 * Restore IA32_XSS. The same CPUID bit enumerates support
 	 * of XSAVES and MSR_IA32_XSS.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
+	if (cpu_feature_enabled(X86_FEATURE_XSAVES)) {
 		wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor()  |
 				     xfeatures_mask_independent());
 	}
@@ -993,7 +993,7 @@ void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
 		break;
 
 	case XSTATE_COPY_XSAVE:
-		header.xfeatures &= xfeatures_mask_user();
+		header.xfeatures &= xfeatures_mask_uabi();
 		break;
 	}
 
@@ -1038,7 +1038,7 @@ void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
 		 * compacted init_fpstate. The gap tracking will zero this
 		 * later.
 		 */
-		if (!(xfeatures_mask_user() & BIT_ULL(i)))
+		if (!(xfeatures_mask_uabi() & BIT_ULL(i)))
 			continue;
 
 		/*

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs()
  2021-06-23 12:02 ` [patch V4 48/65] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     1d9bffab116fadfe1594f5fea2b50ab280d81d30
Gitweb:        https://git.kernel.org/tip/1d9bffab116fadfe1594f5fea2b50ab280d81d30
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:15 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:26:37 +02:00

x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs()

copy_kernel_to_fpregs() restores all xfeatures but it is also the place
where the AMD FXSAVE_LEAK bug is handled.

That prevents fpregs_restore_userregs() to limit the restored features,
which is required to untangle PKRU and XSTATE handling and also for the
upcoming supervisor state management.

Move the FXSAVE_LEAK quirk into __copy_kernel_to_fpregs() and deinline that
function which has become rather fat.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.114271278@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 25 +------------------------
 arch/x86/kernel/fpu/core.c          | 27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index f7688f6..63d9796 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -379,33 +379,10 @@ static inline int os_xrstor_safe(struct xregs_state *xstate, u64 mask)
 	return err;
 }
 
-static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
-{
-	if (use_xsave()) {
-		os_xrstor(&fpstate->xsave, mask);
-	} else {
-		if (use_fxsr())
-			fxrstor(&fpstate->fxsave);
-		else
-			frstor(&fpstate->fsave);
-	}
-}
+extern void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask);
 
 static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
-	/*
-	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
-	 * pending. Clear the x87 state here by setting it to fixed values.
-	 * "m" is a random variable that should be in L1.
-	 */
-	if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) {
-		asm volatile(
-			"fnclex\n\t"
-			"emms\n\t"
-			"fildl %P[addr]"	/* set F?P to defined value */
-			: : [addr] "m" (fpstate));
-	}
-
 	__restore_fpregs_from_fpstate(fpstate, -1);
 }
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 6babf18..afd0dee 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -124,6 +124,33 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
 }
 EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
+void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
+{
+	/*
+	 * AMD K7/K8 and later CPUs up to Zen don't save/restore
+	 * FDP/FIP/FOP unless an exception is pending. Clear the x87 state
+	 * here by setting it to fixed values.  "m" is a random variable
+	 * that should be in L1.
+	 */
+	if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) {
+		asm volatile(
+			"fnclex\n\t"
+			"emms\n\t"
+			"fildl %P[addr]"	/* set F?P to defined value */
+			: : [addr] "m" (fpstate));
+	}
+
+	if (use_xsave()) {
+		os_xrstor(&fpstate->xsave, mask);
+	} else {
+		if (use_fxsr())
+			fxrstor(&fpstate->fxsave);
+		else
+			frstor(&fpstate->fsave);
+	}
+}
+EXPORT_SYMBOL_GPL(__restore_fpregs_from_fpstate);
+
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
 	preempt_disable();

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()
  2021-06-23 12:02 ` [patch V4 47/65] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     727d01100e15b18c67f05fb697779ad2a6c99b63
Gitweb:        https://git.kernel.org/tip/727d01100e15b18c67f05fb697779ad2a6c99b63
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:14 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:23:40 +02:00

x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()

Rename it so that it becomes entirely clear what this function is
about. It's purpose is to restore the FPU registers to the state which was
saved in the task's FPU memory state either at context switch or by an in
kernel FPU user.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121456.018867925@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 6 ++----
 arch/x86/kernel/fpu/core.c          | 2 +-
 arch/x86/kernel/fpu/signal.c        | 2 +-
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index dabbb70..f7688f6 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -465,10 +465,8 @@ static inline void fpregs_activate(struct fpu *fpu)
 	trace_x86_fpu_regs_activated(fpu);
 }
 
-/*
- * Internal helper, do not use directly. Use switch_fpu_return() instead.
- */
-static inline void __fpregs_load_activate(void)
+/* Internal helper for switch_fpu_return() and signal frame setup */
+static inline void fpregs_restore_userregs(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 	int cpu = smp_processor_id();
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index aa7e808..6babf18 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -402,7 +402,7 @@ void switch_fpu_return(void)
 	if (!static_cpu_has(X86_FEATURE_FPU))
 		return;
 
-	__fpregs_load_activate();
+	fpregs_restore_userregs();
 }
 EXPORT_SYMBOL_GPL(switch_fpu_return);
 
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index fd4b58d..b12665c 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -188,7 +188,7 @@ retry:
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
-		__fpregs_load_activate();
+		fpregs_restore_userregs();
 
 	pagefault_disable();
 	ret = copy_fpregs_to_sigframe(buf_fx);

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Clean up the fpu__clear() variants
  2021-06-23 12:02 ` [patch V4 46/65] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Andy Lutomirski
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Andy Lutomirski @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     33344368cb08f8d6bf55a32aa052318d3a69ea84
Gitweb:        https://git.kernel.org/tip/33344368cb08f8d6bf55a32aa052318d3a69ea84
Author:        Andy Lutomirski <luto@kernel.org>
AuthorDate:    Wed, 23 Jun 2021 14:02:13 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:23:07 +02:00

x86/fpu: Clean up the fpu__clear() variants

fpu__clear() currently resets both register state and kernel XSAVE buffer
state.  It has two modes: one for all state (supervisor and user) and
another for user state only.  fpu__clear_all() uses the "all state"
(user_only=0) mode, while a number of signal paths use the user_only=1
mode.

Make fpu__clear() work only for user state (user_only=1) and remove the
"all state" (user_only=0) code.  Rename it to match so it can be used by
the signal paths.

Replace the "all state" (user_only=0) fpu__clear() functionality.  Use the
TIF_NEED_FPU_LOAD functionality instead of making any actual hardware
registers changes in this path.

Instead of invoking fpu__initialize() just memcpy() init_fpstate into the
task's FPU state because that has already the correct format and in case of
PKRU also contains the default PKRU value. Move the actual PKRU write out
into flush_thread() where it belongs and where it will end up anyway when
PKRU and XSTATE have been untangled.

For bisectability a workaround is required which stores the PKRU value in
the xstate memory until PKRU is untangled from XSTATE for context
switching and return to user.

[ Dave Hansen: Polished changelog ]
[ tglx: Fixed the PKRU fallout ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.922729522@linutronix.de
---
 arch/x86/kernel/fpu/core.c | 113 ++++++++++++++++++++++++------------
 arch/x86/kernel/process.c  |  10 +++-
 2 files changed, 86 insertions(+), 37 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 4b69be9..aa7e808 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -260,19 +260,6 @@ int fpu_clone(struct task_struct *dst)
 }
 
 /*
- * Activate the current task's in-memory FPU context,
- * if it has not been used before:
- */
-static void fpu__initialize(struct fpu *fpu)
-{
-	WARN_ON_FPU(fpu != &current->thread.fpu);
-
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-	fpstate_init(&fpu->state);
-	trace_x86_fpu_init_state(fpu);
-}
-
-/*
  * Drops current FPU state: deactivates the fpregs and
  * the fpstate. NOTE: it still leaves previous contents
  * in the fpregs in the eager-FPU case.
@@ -314,47 +301,99 @@ static inline void restore_fpregs_from_init_fpstate(u64 features_mask)
 	pkru_write_default();
 }
 
+static inline unsigned int init_fpstate_copy_size(void)
+{
+	if (!use_xsave())
+		return fpu_kernel_xstate_size;
+
+	/* XSAVE(S) just needs the legacy and the xstate header part */
+	return sizeof(init_fpstate.xsave);
+}
+
+/* Temporary workaround. Will be removed once PKRU and XSTATE are untangled. */
+static inline void pkru_set_default_in_xstate(struct xregs_state *xsave)
+{
+	struct pkru_state *pk;
+
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+	/*
+	 * Force XFEATURE_PKRU to be set in the header otherwise
+	 * get_xsave_addr() does not work and it also needs to be set to
+	 * make XRSTOR(S) load it.
+	 */
+	xsave->header.xfeatures |= XFEATURE_MASK_PKRU;
+	pk = get_xsave_addr(xsave, XFEATURE_PKRU);
+	pk->pkru = pkru_get_init_value();
+}
+
 /*
- * Clear the FPU state back to init state.
- *
- * Called by sys_execve(), by the signal handler code and by various
- * error paths.
+ * Reset current->fpu memory state to the init values.
+ */
+static void fpu_reset_fpstate(void)
+{
+	struct fpu *fpu = &current->thread.fpu;
+
+	fpregs_lock();
+	fpu__drop(fpu);
+	/*
+	 * This does not change the actual hardware registers. It just
+	 * resets the memory image and sets TIF_NEED_FPU_LOAD so a
+	 * subsequent return to usermode will reload the registers from the
+	 * task's memory image.
+	 *
+	 * Do not use fpstate_init() here. Just copy init_fpstate which has
+	 * the correct content already except for PKRU.
+	 */
+	memcpy(&fpu->state, &init_fpstate, init_fpstate_copy_size());
+	pkru_set_default_in_xstate(&fpu->state.xsave);
+	set_thread_flag(TIF_NEED_FPU_LOAD);
+	fpregs_unlock();
+}
+
+/*
+ * Reset current's user FPU states to the init states.  current's
+ * supervisor states, if any, are not modified by this function.  The
+ * caller guarantees that the XSTATE header in memory is intact.
  */
-static void fpu__clear(struct fpu *fpu, bool user_only)
+void fpu__clear_user_states(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
-	if (!static_cpu_has(X86_FEATURE_FPU)) {
-		fpu__drop(fpu);
-		fpu__initialize(fpu);
+	fpregs_lock();
+	if (!cpu_feature_enabled(X86_FEATURE_FPU)) {
+		fpu_reset_fpstate();
+		fpregs_unlock();
 		return;
 	}
 
-	fpregs_lock();
-
-	if (user_only) {
-		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
-		    xfeatures_mask_supervisor())
-			os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
-		restore_fpregs_from_init_fpstate(xfeatures_mask_user());
-	} else {
-		restore_fpregs_from_init_fpstate(xfeatures_mask_all);
+	/*
+	 * Ensure that current's supervisor states are loaded into their
+	 * corresponding registers.
+	 */
+	if (xfeatures_mask_supervisor() &&
+	    !fpregs_state_valid(fpu, smp_processor_id())) {
+		os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
 	}
 
+	/* Reset user states in registers. */
+	restore_fpregs_from_init_fpstate(xfeatures_mask_user());
+
+	/*
+	 * Now all FPU registers have their desired values.  Inform the FPU
+	 * state machine that current's FPU registers are in the hardware
+	 * registers. The memory image does not need to be updated because
+	 * any operation relying on it has to save the registers first when
+	 * current's FPU is marked active.
+	 */
 	fpregs_mark_activate();
 	fpregs_unlock();
 }
 
-void fpu__clear_user_states(struct fpu *fpu)
-{
-	fpu__clear(fpu, true);
-}
-
 void fpu_flush_thread(void)
 {
-	fpu__clear(&current->thread.fpu, false);
+	fpu_reset_fpstate();
 }
-
 /*
  * Load FPU context before returning to userspace.
  */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 19d05d3..de942b0 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -198,6 +198,15 @@ int copy_thread(unsigned long clone_flags, unsigned long sp, unsigned long arg,
 	return ret;
 }
 
+static void pkru_flush_thread(void)
+{
+	/*
+	 * If PKRU is enabled the default PKRU value has to be loaded into
+	 * the hardware right here (similar to context switch).
+	 */
+	pkru_write_default();
+}
+
 void flush_thread(void)
 {
 	struct task_struct *tsk = current;
@@ -206,6 +215,7 @@ void flush_thread(void)
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
 	fpu_flush_thread();
+	pkru_flush_thread();
 }
 
 void disable_TSC(void)

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread()
  2021-06-23 12:02 ` [patch V4 45/65] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     e7ecad17c84d0f6bef635c20d02bbe4096eea700
Gitweb:        https://git.kernel.org/tip/e7ecad17c84d0f6bef635c20d02bbe4096eea700
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:12 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:20:10 +02:00

x86/fpu: Rename fpu__clear_all() to fpu_flush_thread()

Make it clear what the function is about.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.827979263@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 3 ++-
 arch/x86/kernel/fpu/core.c          | 4 ++--
 arch/x86/kernel/process.c           | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index f5da2e9..dabbb70 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -29,12 +29,13 @@
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
 extern void fpu__clear_user_states(struct fpu *fpu);
-extern void fpu__clear_all(struct fpu *fpu);
 extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);
 
 extern void fpu_sync_fpstate(struct fpu *fpu);
 
+/* Clone and exit operations */
 extern int  fpu_clone(struct task_struct *dst);
+extern void fpu_flush_thread(void);
 
 /*
  * Boot time FPU initialization functions:
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index fedadcb..4b69be9 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -350,9 +350,9 @@ void fpu__clear_user_states(struct fpu *fpu)
 	fpu__clear(fpu, true);
 }
 
-void fpu__clear_all(struct fpu *fpu)
+void fpu_flush_thread(void)
 {
-	fpu__clear(fpu, false);
+	fpu__clear(&current->thread.fpu, false);
 }
 
 /*
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index af3db53..19d05d3 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -205,7 +205,7 @@ void flush_thread(void)
 	flush_ptrace_hw_breakpoint(tsk);
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
-	fpu__clear_all(&tsk->thread.fpu);
+	fpu_flush_thread();
 }
 
 void disable_TSC(void)

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs()
  2021-06-23 12:02 ` [patch V4 44/65] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     371071131cd1032c1e9172c51234a2a324841cab
Gitweb:        https://git.kernel.org/tip/371071131cd1032c1e9172c51234a2a324841cab
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:11 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:15:16 +02:00

x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs()

There is no point in using copy_init_pkru_to_fpregs() which in turn calls
write_pkru(). write_pkru() tries to fiddle with the task's xstate buffer
for nothing because the XRSTOR[S](init_fpstate) just cleared the xfeature
flag in the xstate header which makes get_xsave_addr() fail.

It's a useless exercise anyway because the reinitialization activates the
FPU so before the task's xstate buffer can be used again a XRSTOR[S] must
happen which in turn dumps the PKRU value.

Get rid of the now unused copy_init_pkru_to_fpregs().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.732508792@linutronix.de
---
 arch/x86/include/asm/pkeys.h |  1 -
 arch/x86/kernel/fpu/core.c   |  3 +--
 arch/x86/mm/pkeys.c          | 17 -----------------
 include/linux/pkeys.h        |  4 ----
 4 files changed, 1 insertion(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index 4128f64..5c7bcaa 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -124,7 +124,6 @@ extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
-extern void copy_init_pkru_to_fpregs(void);
 
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 3866954..fedadcb 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -311,8 +311,7 @@ static inline void restore_fpregs_from_init_fpstate(u64 features_mask)
 	else
 		frstor(&init_fpstate.fsave);
 
-	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
-		copy_init_pkru_to_fpregs();
+	pkru_write_default();
 }
 
 /*
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index a02cfcf..fb171a5 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,7 +10,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/pkru.h>			/* read/write_pkru()		*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -125,22 +124,6 @@ u32 init_pkru_value = PKRU_AD_KEY( 1) | PKRU_AD_KEY( 2) | PKRU_AD_KEY( 3) |
 		      PKRU_AD_KEY(10) | PKRU_AD_KEY(11) | PKRU_AD_KEY(12) |
 		      PKRU_AD_KEY(13) | PKRU_AD_KEY(14) | PKRU_AD_KEY(15);
 
-/*
- * Called from the FPU code when creating a fresh set of FPU
- * registers.  This is called from a very specific context where
- * we know the FPU registers are safe for use and we can use PKRU
- * directly.
- */
-void copy_init_pkru_to_fpregs(void)
-{
-	u32 init_pkru_value_snapshot = READ_ONCE(init_pkru_value);
-	/*
-	 * Override the PKRU state that came from 'init_fpstate'
-	 * with the baseline from the process.
-	 */
-	write_pkru(init_pkru_value_snapshot);
-}
-
 static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf,
 			     size_t count, loff_t *ppos)
 {
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 2955ba9..6beb26b 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -44,10 +44,6 @@ static inline bool arch_pkeys_enabled(void)
 	return false;
 }
 
-static inline void copy_init_pkru_to_fpregs(void)
-{
-}
-
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 #endif /* _LINUX_PKEYS_H */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/cpu: Write the default PKRU value when enabling PKE
  2021-06-23 12:02 ` [patch V4 43/65] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     fa8c84b77a54bf3cf351c8b4b26a5aca27a14013
Gitweb:        https://git.kernel.org/tip/fa8c84b77a54bf3cf351c8b4b26a5aca27a14013
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:10 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:14:54 +02:00

x86/cpu: Write the default PKRU value when enabling PKE

In preparation of making the PKRU management more independent from XSTATES,
write the default PKRU value into the hardware right after enabling PKRU in
CR4. This ensures that switch_to() and copy_thread() have the correct
setting for init task and the per CPU idle threads right away.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.622983906@linutronix.de
---
 arch/x86/kernel/cpu/common.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index dbfb335..ca668ef 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -480,6 +480,8 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 	}
 
 	cr4_set_bits(X86_CR4_PKE);
+	/* Load the default PKRU value */
+	pkru_write_default();
 }
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/pkru: Provide pkru_write_default()
  2021-06-23 12:02 ` [patch V4 42/65] x86/pkru: Provide pkru_write_default() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     ff7ebff47c595e747aa1bb10d8a30b2acb7d425b
Gitweb:        https://git.kernel.org/tip/ff7ebff47c595e747aa1bb10d8a30b2acb7d425b
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:09 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:09:53 +02:00

x86/pkru: Provide pkru_write_default()

Provide a simple and trivial helper which just writes the PKRU default
value without trying to fiddle with the task's xsave buffer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.513729794@linutronix.de
---
 arch/x86/include/asm/pkru.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/pkru.h b/arch/x86/include/asm/pkru.h
index 19d3d7b..7e45509 100644
--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -60,4 +60,12 @@ static inline void write_pkru(u32 pkru)
 	fpregs_unlock();
 }
 
+static inline void pkru_write_default(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	wrpkru(pkru_get_init_value());
+}
+
 #endif

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/pkru: Provide pkru_get_init_value()
  2021-06-23 12:02 ` [patch V4 41/65] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     739e2eec0f4849eb411567407d61491f923db405
Gitweb:        https://git.kernel.org/tip/739e2eec0f4849eb411567407d61491f923db405
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:08 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 19:02:49 +02:00

x86/pkru: Provide pkru_get_init_value()

When CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS is disabled then the following
code fails to compile:

     if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
     	u32 pkru = READ_ONCE(init_pkru_value);
	..
     }

because init_pkru_value is defined as '0' which makes READ_ONCE() upset.

Provide an accessor macro to avoid #ifdeffery all over the place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.404880646@linutronix.de
---
 arch/x86/include/asm/pkru.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/pkru.h b/arch/x86/include/asm/pkru.h
index ec8dd28..19d3d7b 100644
--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -10,8 +10,10 @@
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
 extern u32 init_pkru_value;
+#define pkru_get_init_value()	READ_ONCE(init_pkru_value)
 #else
 #define init_pkru_value	0
+#define pkru_get_init_value()	0
 #endif
 
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/cpu: Sanitize X86_FEATURE_OSPKE
  2021-06-23 12:02 ` [patch V4 40/65] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     8a1dc55a3f3ef0a723c3c117a567e7b5dd2c1793
Gitweb:        https://git.kernel.org/tip/8a1dc55a3f3ef0a723c3c117a567e7b5dd2c1793
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:07 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:59:44 +02:00

x86/cpu: Sanitize X86_FEATURE_OSPKE

X86_FEATURE_OSPKE is enabled first on the boot CPU and the feature flag is
set. Secondary CPUs have to enable CR4.PKE as well and set their per CPU
feature flag. That's ineffective because all call sites have checks for
boot_cpu_data.

Make it smarter and force the feature flag when PKU is enabled on the boot
cpu which allows then to use cpu_feature_enabled(X86_FEATURE_OSPKE) all
over the place. That either compiles the code out when PKEY support is
disabled in Kconfig or uses a static_cpu_has() for the feature check which
makes a significant difference in hotpaths, e.g. context switch.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.305113644@linutronix.de
---
 arch/x86/include/asm/pkeys.h |  8 ++++----
 arch/x86/include/asm/pkru.h  |  4 ++--
 arch/x86/kernel/cpu/common.c | 24 +++++++++++-------------
 arch/x86/kernel/fpu/core.c   |  2 +-
 arch/x86/kernel/fpu/xstate.c |  2 +-
 arch/x86/kernel/process_64.c |  2 +-
 arch/x86/mm/fault.c          |  2 +-
 7 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index 2ff9b98..4128f64 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -9,14 +9,14 @@
  * will be necessary to ensure that the types that store key
  * numbers and masks have sufficient capacity.
  */
-#define arch_max_pkey() (boot_cpu_has(X86_FEATURE_OSPKE) ? 16 : 1)
+#define arch_max_pkey() (cpu_feature_enabled(X86_FEATURE_OSPKE) ? 16 : 1)
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
 
 static inline bool arch_pkeys_enabled(void)
 {
-	return boot_cpu_has(X86_FEATURE_OSPKE);
+	return cpu_feature_enabled(X86_FEATURE_OSPKE);
 }
 
 /*
@@ -26,7 +26,7 @@ static inline bool arch_pkeys_enabled(void)
 extern int __execute_only_pkey(struct mm_struct *mm);
 static inline int execute_only_pkey(struct mm_struct *mm)
 {
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return ARCH_DEFAULT_PKEY;
 
 	return __execute_only_pkey(mm);
@@ -37,7 +37,7 @@ extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 		int prot, int pkey)
 {
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return 0;
 
 	return __arch_override_mprotect_pkey(vma, prot, pkey);
diff --git a/arch/x86/include/asm/pkru.h b/arch/x86/include/asm/pkru.h
index 35ffcfd..ec8dd28 100644
--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -32,7 +32,7 @@ static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
 
 static inline u32 read_pkru(void)
 {
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return rdpkru();
 	return 0;
 }
@@ -41,7 +41,7 @@ static inline void write_pkru(u32 pkru)
 {
 	struct pkru_state *pk;
 
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
 
 	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f875dea..dbfb335 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,22 +466,20 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
-	/* check the boot processor, plus compile options for PKU: */
-	if (!cpu_feature_enabled(X86_FEATURE_PKU))
-		return;
-	/* checks the actual processor's cpuid bits: */
-	if (!cpu_has(c, X86_FEATURE_PKU))
-		return;
-	if (pku_disabled)
+	if (c == &boot_cpu_data) {
+		if (pku_disabled || !cpu_feature_enabled(X86_FEATURE_PKU))
+			return;
+		/*
+		 * Setting CR4.PKE will cause the X86_FEATURE_OSPKE cpuid
+		 * bit to be set.  Enforce it.
+		 */
+		setup_force_cpu_cap(X86_FEATURE_OSPKE);
+
+	} else if (!cpu_feature_enabled(X86_FEATURE_OSPKE)) {
 		return;
+	}
 
 	cr4_set_bits(X86_CR4_PKE);
-	/*
-	 * Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
-	 * cpuid bit to be set.  We need to ensure that we
-	 * update that bit in this CPU's "cpu_info".
-	 */
-	set_cpu_cap(c, X86_FEATURE_OSPKE);
 }
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8762b1a..3866954 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -311,7 +311,7 @@ static inline void restore_fpregs_from_init_fpstate(u64 features_mask)
 	else
 		frstor(&init_fpstate.fsave);
 
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
 }
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 735d44c..bc71609 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -921,7 +921,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	 * This check implies XSAVE support.  OSPKE only gets
 	 * set if we enable XSAVE and we enable PKU in XCR0.
 	 */
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return -EINVAL;
 
 	/*
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4651ab0..40a9638 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -137,7 +137,7 @@ void __show_regs(struct pt_regs *regs, enum show_regs_mode mode,
 		       log_lvl, d3, d6, d7);
 	}
 
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		printk("%sPKRU: %08x\n", log_lvl, read_pkru());
 }
 
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 6bda7f6..f33a61a 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -875,7 +875,7 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
 	/* This code is always called on the current mm */
 	bool foreign = false;
 
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return false;
 	if (error_code & X86_PF_PK)
 		return true;

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename and sanitize fpu__save/copy()
  2021-06-23 12:02 ` [patch V4 39/65] x86/fpu: Rename and sanitize fpu__save/copy() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     b2681e791dbcee6acb1dca7a5076a0285109ac4c
Gitweb:        https://git.kernel.org/tip/b2681e791dbcee6acb1dca7a5076a0285109ac4c
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:06 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:55:56 +02:00

x86/fpu: Rename and sanitize fpu__save/copy()

Both function names are a misnomer.

fpu__save() is actually about synchronizing the hardware register state
into the task's memory state so that either coredump or a math exception
handler can inspect the state at the time where the problem happens.

The function guarantees to preserve the register state, while "save" is a
common terminology for saving the current state so it can be modified and
restored later. This is clearly not the case here.

Rename it to fpu_sync_fpstate().

fpu__copy() is used to clone the current task's FPU state when duplicating
task_struct. While the register state is a copy the rest of the FPU state
is not.

Name it accordingly and remove the really pointless @src argument along
with the warning which comes along with it.

Nothing can ever copy the FPU state of a non-current task. It's clearly
just a consequence of arch_dup_task_struct(), but it makes no sense to
proliferate that further.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.196727450@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h |  6 ++++--
 arch/x86/kernel/fpu/core.c          | 17 ++++++++---------
 arch/x86/kernel/fpu/regset.c        |  2 +-
 arch/x86/kernel/process.c           |  3 +--
 arch/x86/kernel/traps.c             |  5 +++--
 5 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 3558cd0..f5da2e9 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -26,14 +26,16 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__save(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
-extern int  fpu__copy(struct task_struct *dst, struct task_struct *src);
 extern void fpu__clear_user_states(struct fpu *fpu);
 extern void fpu__clear_all(struct fpu *fpu);
 extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);
 
+extern void fpu_sync_fpstate(struct fpu *fpu);
+
+extern int  fpu_clone(struct task_struct *dst);
+
 /*
  * Boot time FPU initialization functions:
  */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 4a59e0f..8762b1a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -159,11 +159,10 @@ void kernel_fpu_end(void)
 EXPORT_SYMBOL_GPL(kernel_fpu_end);
 
 /*
- * Save the FPU state (mark it for reload if necessary):
- *
- * This only ever gets called for the current task.
+ * Sync the FPU register state to current's memory register state when the
+ * current task owns the FPU. The hardware register state is preserved.
  */
-void fpu__save(struct fpu *fpu)
+void fpu_sync_fpstate(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
@@ -221,18 +220,18 @@ void fpstate_init(union fpregs_state *state)
 }
 EXPORT_SYMBOL_GPL(fpstate_init);
 
-int fpu__copy(struct task_struct *dst, struct task_struct *src)
+/* Clone current's FPU state on fork */
+int fpu_clone(struct task_struct *dst)
 {
+	struct fpu *src_fpu = &current->thread.fpu;
 	struct fpu *dst_fpu = &dst->thread.fpu;
-	struct fpu *src_fpu = &src->thread.fpu;
 
+	/* The new task's FPU state cannot be valid in the hardware. */
 	dst_fpu->last_cpu = -1;
 
-	if (!static_cpu_has(X86_FEATURE_FPU))
+	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return 0;
 
-	WARN_ON_FPU(src_fpu != &current->thread.fpu);
-
 	/*
 	 * Don't let 'init optimized' areas of the XSAVE area
 	 * leak into the child task:
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 892aec1..4575796 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -41,7 +41,7 @@ int regset_xregset_fpregs_active(struct task_struct *target, const struct user_r
 static void sync_fpstate(struct fpu *fpu)
 {
 	if (fpu == &current->thread.fpu)
-		fpu__save(fpu);
+		fpu_sync_fpstate(fpu);
 }
 
 /*
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 5e1f381..af3db53 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -87,8 +87,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 #ifdef CONFIG_VM86
 	dst->thread.vm86 = NULL;
 #endif
-
-	return fpu__copy(dst, src);
+	return fpu_clone(dst);
 }
 
 /*
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 853ea7a..4c9c4aa 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -1046,9 +1046,10 @@ static void math_error(struct pt_regs *regs, int trapnr)
 	}
 
 	/*
-	 * Save the info for the exception handler and clear the error.
+	 * Synchronize the FPU register state to the memory register state
+	 * if necessary. This allows the exception handler to inspect it.
 	 */
-	fpu__save(fpu);
+	fpu_sync_fpstate(fpu);
 
 	task->thread.trap_nr	= trapnr;
 	task->thread.error_code = 0;

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/pkeys: Move read_pkru() and write_pkru()
  2021-06-23 12:02 ` [patch V4 38/65] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Dave Hansen
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Dave Hansen @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Hansen, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     784a46618f634973a17535b7d3d03cd4ebc0ccbd
Gitweb:        https://git.kernel.org/tip/784a46618f634973a17535b7d3d03cd4ebc0ccbd
Author:        Dave Hansen <dave.hansen@linux.intel.com>
AuthorDate:    Wed, 23 Jun 2021 14:02:05 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:52:57 +02:00

x86/pkeys: Move read_pkru() and write_pkru()

write_pkru() was originally used just to write to the PKRU register.  It
was mercifully short and sweet and was not out of place in pgtable.h with
some other pkey-related code.

But, later work included a requirement to also modify the task XSAVE
buffer when updating the register.  This really is more related to the
XSAVE architecture than to paging.

Move the read/write_pkru() to asm/pkru.h.  pgtable.h won't miss them.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121455.102647114@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h |  1 +-
 arch/x86/include/asm/pgtable.h    | 57 +----------------------------
 arch/x86/include/asm/pkru.h       | 61 ++++++++++++++++++++++++++++++-
 arch/x86/kernel/process_64.c      |  1 +-
 arch/x86/kvm/svm/sev.c            |  1 +-
 arch/x86/kvm/x86.c                |  1 +-
 arch/x86/mm/pkeys.c               |  1 +-
 7 files changed, 67 insertions(+), 56 deletions(-)
 create mode 100644 arch/x86/include/asm/pkru.h

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 7de2384..5764cbe 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -6,6 +6,7 @@
 #include <linux/types.h>
 
 #include <asm/processor.h>
+#include <asm/fpu/api.h>
 #include <asm/user.h>
 
 /* Bit 63 of XCR0 is reserved for future expansion */
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index b1099f2..ec0d8e1 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,7 +23,7 @@
 
 #ifndef __ASSEMBLY__
 #include <asm/x86_init.h>
-#include <asm/fpu/xstate.h>
+#include <asm/pkru.h>
 #include <asm/fpu/api.h>
 #include <asm-generic/pgtable_uffd.h>
 
@@ -126,35 +126,6 @@ static inline int pte_dirty(pte_t pte)
 	return pte_flags(pte) & _PAGE_DIRTY;
 }
 
-
-static inline u32 read_pkru(void)
-{
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		return rdpkru();
-	return 0;
-}
-
-static inline void write_pkru(u32 pkru)
-{
-	struct pkru_state *pk;
-
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
-		return;
-
-	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
-
-	/*
-	 * The PKRU value in xstate needs to be in sync with the value that is
-	 * written to the CPU. The FPU restore on return to userland would
-	 * otherwise load the previous value again.
-	 */
-	fpregs_lock();
-	if (pk)
-		pk->pkru = pkru;
-	__write_pkru(pkru);
-	fpregs_unlock();
-}
-
 static inline int pte_young(pte_t pte)
 {
 	return pte_flags(pte) & _PAGE_ACCESSED;
@@ -1360,32 +1331,6 @@ static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
 }
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
-#define PKRU_AD_BIT 0x1
-#define PKRU_WD_BIT 0x2
-#define PKRU_BITS_PER_PKEY 2
-
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-extern u32 init_pkru_value;
-#else
-#define init_pkru_value	0
-#endif
-
-static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
-{
-	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
-}
-
-static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
-{
-	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-	/*
-	 * Access-disable disables writes too so we need to check
-	 * both bits here.
-	 */
-	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
-}
-
 static inline u16 pte_flags_pkey(unsigned long pte_flags)
 {
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
diff --git a/arch/x86/include/asm/pkru.h b/arch/x86/include/asm/pkru.h
new file mode 100644
index 0000000..35ffcfd
--- /dev/null
+++ b/arch/x86/include/asm/pkru.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_PKRU_H
+#define _ASM_X86_PKRU_H
+
+#include <asm/fpu/xstate.h>
+
+#define PKRU_AD_BIT 0x1
+#define PKRU_WD_BIT 0x2
+#define PKRU_BITS_PER_PKEY 2
+
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+extern u32 init_pkru_value;
+#else
+#define init_pkru_value	0
+#endif
+
+static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
+}
+
+static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+	/*
+	 * Access-disable disables writes too so we need to check
+	 * both bits here.
+	 */
+	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
+}
+
+static inline u32 read_pkru(void)
+{
+	if (boot_cpu_has(X86_FEATURE_OSPKE))
+		return rdpkru();
+	return 0;
+}
+
+static inline void write_pkru(u32 pkru)
+{
+	struct pkru_state *pk;
+
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return;
+
+	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
+
+	/*
+	 * The PKRU value in xstate needs to be in sync with the value that is
+	 * written to the CPU. The FPU restore on return to userland would
+	 * otherwise load the previous value again.
+	 */
+	fpregs_lock();
+	if (pk)
+		pk->pkru = pkru;
+	__write_pkru(pkru);
+	fpregs_unlock();
+}
+
+#endif
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d08307d..4651ab0 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -41,6 +41,7 @@
 #include <linux/syscalls.h>
 
 #include <asm/processor.h>
+#include <asm/pkru.h>
 #include <asm/fpu/internal.h>
 #include <asm/mmu_context.h>
 #include <asm/prctl.h>
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 8d36f0c..62926f1 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -19,6 +19,7 @@
 #include <linux/trace_events.h>
 #include <asm/fpu/internal.h>
 
+#include <asm/pkru.h>
 #include <asm/trapnr.h>
 
 #include "x86.h"
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6f651b9..07f7888 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -65,6 +65,7 @@
 #include <asm/msr.h>
 #include <asm/desc.h>
 #include <asm/mce.h>
+#include <asm/pkru.h>
 #include <linux/kernel_stat.h>
 #include <asm/fpu/internal.h> /* Ugh! */
 #include <asm/pvclock.h>
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index a840ac7..a02cfcf 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,6 +10,7 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
+#include <asm/pkru.h>			/* read/write_pkru()		*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/xstate: Sanitize handling of independent features
  2021-06-23 12:02 ` [patch V4 37/65] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra, Thomas Gleixner, Borislav Petkov, Kan Liang, x86,
	linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     a75c52896b6d42d6600db4d4dd9f7e4bde9218db
Gitweb:        https://git.kernel.org/tip/a75c52896b6d42d6600db4d4dd9f7e4bde9218db
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:04 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:46:20 +02:00

x86/fpu/xstate: Sanitize handling of independent features

The copy functions for the independent features are horribly named and the
supervisor and independent part is just overengineered.

The point is that the supplied mask has either to be a subset of the
independent features or a subset of the task->fpu.xstate managed features.

Rewrite it so it checks for invalid overlaps of these areas in the caller
supplied feature mask. Rename it so it follows the new naming convention
for these operations. Mop up the function documentation.

This allows to use that function for other purposes as well.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lkml.kernel.org/r/20210623121455.004880675@linutronix.de
---
 arch/x86/events/intel/lbr.c       |  6 +-
 arch/x86/include/asm/fpu/xstate.h |  5 +-
 arch/x86/kernel/fpu/xstate.c      | 98 ++++++++++++++----------------
 3 files changed, 54 insertions(+), 55 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 2fa2df3..f338645 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -491,7 +491,7 @@ static void intel_pmu_arch_lbr_xrstors(void *ctx)
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_kernel_to_independent_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	xrstors(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static __always_inline bool lbr_is_reset_in_cstate(void *ctx)
@@ -576,7 +576,7 @@ static void intel_pmu_arch_lbr_xsaves(void *ctx)
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_independent_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	xsaves(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static void __intel_pmu_lbr_save(void *ctx)
@@ -992,7 +992,7 @@ static void intel_pmu_arch_lbr_read_xsave(struct cpu_hw_events *cpuc)
 		intel_pmu_store_lbr(cpuc, NULL);
 		return;
 	}
-	copy_independent_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
+	xsaves(&xsave->xsave, XFEATURE_MASK_LBR);
 
 	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
 }
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index a55bd5c..7de2384 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -104,8 +104,9 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
-void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask);
+
+void xsaves(struct xregs_state *xsave, u64 mask);
+void xrstors(struct xregs_state *xsave, u64 mask);
 
 enum xstate_copy_mode {
 	XSTATE_COPY_FP,
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 27246a8..735d44c 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1162,76 +1162,74 @@ int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
 	return copy_uabi_to_xstate(xsave, NULL, ubuf);
 }
 
+static bool validate_xsaves_xrstors(u64 mask)
+{
+	u64 xchk;
+
+	if (WARN_ON_FPU(!cpu_feature_enabled(X86_FEATURE_XSAVES)))
+		return false;
+	/*
+	 * Validate that this is either a task->fpstate related component
+	 * subset or an independent one.
+	 */
+	if (mask & xfeatures_mask_independent())
+		xchk = ~xfeatures_mask_independent();
+	else
+		xchk = ~xfeatures_mask_all;
+
+	if (WARN_ON_ONCE(!mask || mask & xchk))
+		return false;
+
+	return true;
+}
+
 /**
- * copy_independent_supervisor_to_kernel() - Save independent supervisor states to
- *                                           an xsave area
- * @xstate: A pointer to an xsave area
- * @mask: Represent the independent supervisor features saved into the xsave area
+ * xsaves - Save selected components to a kernel xstate buffer
+ * @xstate:	Pointer to the buffer
+ * @mask:	Feature mask to select the components to save
  *
- * Only the independent supervisor states sets in the mask are saved into the xsave
- * area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of independent
- * supervisor feature). Besides the independent supervisor states, the legacy
- * region and XSAVE header are also saved into the xsave area. The supervisor
- * features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
- * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not saved.
+ * The @xstate buffer must be 64 byte aligned and correctly initialized as
+ * XSAVES does not write the full xstate header. Before first use the
+ * buffer should be zeroed otherwise a consecutive XRSTORS from that buffer
+ * can #GP.
  *
- * The xsave area must be 64-bytes aligned.
+ * The feature mask must either be a subset of the independent features or
+ * a subset of the task->fpstate related features.
  */
-void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
+void xsaves(struct xregs_state *xstate, u64 mask)
 {
-	u64 independent_mask = xfeatures_mask_independent() & mask;
-	u32 lmask, hmask;
 	int err;
 
-	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
+	if (!validate_xsaves_xrstors(mask))
 		return;
 
-	if (WARN_ON_FPU(!independent_mask))
-		return;
-
-	lmask = independent_mask;
-	hmask = independent_mask >> 32;
-
-	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-
-	/* Should never fault when copying to a kernel buffer */
-	WARN_ON_FPU(err);
+	XSTATE_OP(XSAVES, xstate, (u32)mask, (u32)(mask >> 32), err);
+	WARN_ON_ONCE(err);
 }
 
 /**
- * copy_kernel_to_independent_supervisor() - Restore independent supervisor states from
- *                                           an xsave area
- * @xstate: A pointer to an xsave area
- * @mask: Represent the independent supervisor features restored from the xsave area
+ * xrstors - Restore selected components from a kernel xstate buffer
+ * @xstate:	Pointer to the buffer
+ * @mask:	Feature mask to select the components to restore
+ *
+ * The @xstate buffer must be 64 byte aligned and correctly initialized
+ * otherwise XRSTORS from that buffer can #GP.
  *
- * Only the independent supervisor states sets in the mask are restored from the
- * xsave area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of
- * independent supervisor feature). Besides the independent supervisor states, the
- * legacy region and XSAVE header are also restored from the xsave area. The
- * supervisor features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
- * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not restored.
+ * Proper usage is to restore the state which was saved with
+ * xsaves() into @xstate.
  *
- * The xsave area must be 64-bytes aligned.
+ * The feature mask must either be a subset of the independent features or
+ * a subset of the task->fpstate related features.
  */
-void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask)
+void xrstors(struct xregs_state *xstate, u64 mask)
 {
-	u64 independent_mask = xfeatures_mask_independent() & mask;
-	u32 lmask, hmask;
 	int err;
 
-	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
+	if (!validate_xsaves_xrstors(mask))
 		return;
 
-	if (WARN_ON_FPU(!independent_mask))
-		return;
-
-	lmask = independent_mask;
-	hmask = independent_mask >> 32;
-
-	XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
-
-	/* Should never fault when copying from a kernel buffer */
-	WARN_ON_FPU(err);
+	XSTATE_OP(XRSTORS, xstate, (u32)mask, (u32)(mask >> 32), err);
+	WARN_ON_ONCE(err);
 }
 
 #ifdef CONFIG_PROC_PID_ARCH_STATUS

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_fpstate()
  2021-06-23 12:02 ` [patch V4 34/65] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_fpstate() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     1c61fada304c125c3f8a2b8eb1896406e4098a05
Gitweb:        https://git.kernel.org/tip/1c61fada304c125c3f8a2b8eb1896406e4098a05
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:01 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:36:42 +02:00

x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_fpstate()

This is not a copy functionality. It restores the register state from the
supplied kernel buffer.

No functional changes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.716058365@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 8 ++++----
 arch/x86/kvm/x86.c                  | 4 ++--
 arch/x86/mm/extable.c               | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 3e0eeda..3558cd0 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -376,7 +376,7 @@ static inline int os_xrstor_safe(struct xregs_state *xstate, u64 mask)
 	return err;
 }
 
-static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
+static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
 		os_xrstor(&fpstate->xsave, mask);
@@ -388,7 +388,7 @@ static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask
 	}
 }
 
-static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate)
+static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
 	/*
 	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
@@ -403,7 +403,7 @@ static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate)
 			: : [addr] "m" (fpstate));
 	}
 
-	__copy_kernel_to_fpregs(fpstate, -1);
+	__restore_fpregs_from_fpstate(fpstate, -1);
 }
 
 extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
@@ -474,7 +474,7 @@ static inline void __fpregs_load_activate(void)
 		return;
 
 	if (!fpregs_state_valid(fpu, cpu)) {
-		copy_kernel_to_fpregs(&fpu->state);
+		restore_fpregs_from_fpstate(&fpu->state);
 		fpregs_activate(fpu);
 		fpu->last_cpu = cpu;
 	}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 71bacb7..6f651b9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9653,7 +9653,7 @@ static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 	 */
 	if (vcpu->arch.guest_fpu)
 		/* PKRU is separately restored in kvm_x86_ops.run. */
-		__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu->state,
+		__restore_fpregs_from_fpstate(&vcpu->arch.guest_fpu->state,
 					~XFEATURE_MASK_PKRU);
 
 	fpregs_mark_activate();
@@ -9674,7 +9674,7 @@ static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.guest_fpu)
 		kvm_save_current_fpu(vcpu->arch.guest_fpu);
 
-	copy_kernel_to_fpregs(&vcpu->arch.user_fpu->state);
+	restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state);
 
 	fpregs_mark_activate();
 	fpregs_unlock();
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 121921b..2c5cccd 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -65,7 +65,7 @@ __visible bool ex_handler_fprestore(const struct exception_table_entry *fixup,
 	WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.",
 		  (void *)instruction_pointer(regs));
 
-	__copy_kernel_to_fpregs(&init_fpstate, -1);
+	__restore_fpregs_from_fpstate(&init_fpstate, -1);
 	return true;
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename initstate copy functions
  2021-06-23 12:02 ` [patch V4 35/65] x86/fpu: Rename initstate copy functions Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     b76411b1b568311bfd89d03acc587ffc1548c26f
Gitweb:        https://git.kernel.org/tip/b76411b1b568311bfd89d03acc587ffc1548c26f
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:02 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:39:53 +02:00

x86/fpu: Rename initstate copy functions

Again this not a copy. It's restoring register state from kernel memory.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.816581630@linutronix.de
---
 arch/x86/kernel/fpu/core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index c290ba2..4a59e0f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -303,7 +303,7 @@ void fpu__drop(struct fpu *fpu)
  * Clear FPU registers by setting them up from the init fpstate.
  * Caller must do fpregs_[un]lock() around it.
  */
-static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
+static inline void restore_fpregs_from_init_fpstate(u64 features_mask)
 {
 	if (use_xsave())
 		os_xrstor(&init_fpstate.xsave, features_mask);
@@ -338,9 +338,9 @@ static void fpu__clear(struct fpu *fpu, bool user_only)
 		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
 		    xfeatures_mask_supervisor())
 			os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
-		copy_init_fpstate_to_fpregs(xfeatures_mask_user());
+		restore_fpregs_from_init_fpstate(xfeatures_mask_user());
 	} else {
-		copy_init_fpstate_to_fpregs(xfeatures_mask_all);
+		restore_fpregs_from_init_fpstate(xfeatures_mask_all);
 	}
 
 	fpregs_mark_activate();

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Get rid of the FNSAVE optimization
  2021-06-23 12:02 ` [patch V4 33/65] x86/fpu: Get rid of the FNSAVE optimization Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Hansen, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     08ded2cd18a09749e67a14426aa7fd1b04ab1dc0
Gitweb:        https://git.kernel.org/tip/08ded2cd18a09749e67a14426aa7fd1b04ab1dc0
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:02:00 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:29:41 +02:00

x86/fpu: Get rid of the FNSAVE optimization

The FNSAVE support requires conditionals in quite some call paths because
FNSAVE reinitializes the FPU hardware. If the save has to preserve the FPU
register state then the caller has to conditionally restore it from memory
when FNSAVE is in use.

This also requires a conditional in context switch because the restore
avoidance optimization cannot work with FNSAVE. As this only affects 20+
years old CPUs there is really no reason to keep this optimization
effective for FNSAVE. It's about time to not optimize for antiques anymore.

Just unconditionally FRSTOR the save content to the registers and clean up
the conditionals all over the place.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.617369268@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 18 +++++----
 arch/x86/kernel/fpu/core.c          | 54 +++++++++++-----------------
 2 files changed, 34 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 559dd30..3e0eeda 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -83,6 +83,7 @@ extern void fpstate_init_soft(struct swregs_state *soft);
 #else
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
+extern void save_fpregs_to_fpstate(struct fpu *fpu);
 
 #define user_insn(insn, output, input...)				\
 ({									\
@@ -375,8 +376,6 @@ static inline int os_xrstor_safe(struct xregs_state *xstate, u64 mask)
 	return err;
 }
 
-extern int save_fpregs_to_fpstate(struct fpu *fpu);
-
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
@@ -507,12 +506,17 @@ static inline void __fpregs_load_activate(void)
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!save_fpregs_to_fpstate(old_fpu))
-			old_fpu->last_cpu = -1;
-		else
-			old_fpu->last_cpu = cpu;
+		save_fpregs_to_fpstate(old_fpu);
+		/*
+		 * The save operation preserved register state, so the
+		 * fpu_fpregs_owner_ctx is still @old_fpu. Store the
+		 * current CPU number in @old_fpu, so the next return
+		 * to user space can avoid the FPU register restore
+		 * when is returns on the same CPU and still owns the
+		 * context.
+		 */
+		old_fpu->last_cpu = cpu;
 
-		/* But leave fpu_fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
 	}
 }
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index dcf4d6b..c290ba2 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -83,16 +83,20 @@ bool irq_fpu_usable(void)
 EXPORT_SYMBOL(irq_fpu_usable);
 
 /*
- * These must be called with preempt disabled. Returns
- * 'true' if the FPU state is still intact and we can
- * keep registers active.
+ * Save the FPU register state in fpu->state. The register state is
+ * preserved.
  *
- * The legacy FNSAVE instruction cleared all FPU state
- * unconditionally, so registers are essentially destroyed.
- * Modern FPU state can be kept in registers, if there are
- * no pending FP exceptions.
+ * Must be called with fpregs_lock() held.
+ *
+ * The legacy FNSAVE instruction clears all FPU state unconditionally, so
+ * register state has to be reloaded. That might be a pointless exercise
+ * when the FPU is going to be used by another task right after that. But
+ * this only affects 20+ years old 32bit systems and avoids conditionals all
+ * over the place.
+ *
+ * FXSAVE and all XSAVE variants preserve the FPU register state.
  */
-int save_fpregs_to_fpstate(struct fpu *fpu)
+void save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		os_xsave(&fpu->state.xsave);
@@ -103,21 +107,20 @@ int save_fpregs_to_fpstate(struct fpu *fpu)
 		 */
 		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
 			fpu->avx512_timestamp = jiffies;
-		return 1;
+		return;
 	}
 
 	if (likely(use_fxsr())) {
 		fxsave(&fpu->state.fxsave);
-		return 1;
+		return;
 	}
 
 	/*
 	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
-	 * so we have to mark them inactive:
+	 * so we have to reload them from the memory state.
 	 */
 	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
-
-	return 0;
+	frstor(&fpu->state.fsave);
 }
 EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
@@ -133,10 +136,6 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 	if (!(current->flags & PF_KTHREAD) &&
 	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		set_thread_flag(TIF_NEED_FPU_LOAD);
-		/*
-		 * Ignore return value -- we don't care if reg state
-		 * is clobbered.
-		 */
 		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
@@ -171,11 +170,8 @@ void fpu__save(struct fpu *fpu)
 	fpregs_lock();
 	trace_x86_fpu_before_save(fpu);
 
-	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!save_fpregs_to_fpstate(fpu)) {
-			copy_kernel_to_fpregs(&fpu->state);
-		}
-	}
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		save_fpregs_to_fpstate(fpu);
 
 	trace_x86_fpu_after_save(fpu);
 	fpregs_unlock();
@@ -244,20 +240,16 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 	memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_size);
 
 	/*
-	 * If the FPU registers are not current just memcpy() the state.
-	 * Otherwise save current FPU registers directly into the child's FPU
-	 * context, without any memory-to-memory copying.
-	 *
-	 * ( The function 'fails' in the FNSAVE case, which destroys
-	 *   register contents so we have to load them back. )
+	 * If the FPU registers are not owned by current just memcpy() the
+	 * state.  Otherwise save the FPU registers directly into the
+	 * child's FPU context, without any memory-to-memory copying.
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
-	else if (!save_fpregs_to_fpstate(dst_fpu))
-		copy_kernel_to_fpregs(&dst_fpu->state);
-
+	else
+		save_fpregs_to_fpstate(dst_fpu);
 	fpregs_unlock();
 
 	set_tsk_thread_flag(dst, TIF_NEED_FPU_LOAD);

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate()
  2021-06-23 12:01 ` [patch V4 32/65] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     ebe7234b08a42d69bae94c4062a84777ea26ef99
Gitweb:        https://git.kernel.org/tip/ebe7234b08a42d69bae94c4062a84777ea26ef99
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:59 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:26:43 +02:00

x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate()

A copy is guaranteed to leave the source intact, which is not the case when
FNSAVE is used as that reinitilizes the registers.

Save does not make such guarantees and it matches what this is about,
i.e. to save the state for a later restore.

Rename it to save_fpregs_to_fpstate().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.508853062@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h |  4 ++--
 arch/x86/kernel/fpu/core.c          | 10 +++++-----
 arch/x86/kvm/x86.c                  |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index cd88233..559dd30 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -375,7 +375,7 @@ static inline int os_xrstor_safe(struct xregs_state *xstate, u64 mask)
 	return err;
 }
 
-extern int copy_fpregs_to_fpstate(struct fpu *fpu);
+extern int save_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
@@ -507,7 +507,7 @@ static inline void __fpregs_load_activate(void)
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!copy_fpregs_to_fpstate(old_fpu))
+		if (!save_fpregs_to_fpstate(old_fpu))
 			old_fpu->last_cpu = -1;
 		else
 			old_fpu->last_cpu = cpu;
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1d25876..dcf4d6b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -92,7 +92,7 @@ EXPORT_SYMBOL(irq_fpu_usable);
  * Modern FPU state can be kept in registers, if there are
  * no pending FP exceptions.
  */
-int copy_fpregs_to_fpstate(struct fpu *fpu)
+int save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		os_xsave(&fpu->state.xsave);
@@ -119,7 +119,7 @@ int copy_fpregs_to_fpstate(struct fpu *fpu)
 
 	return 0;
 }
-EXPORT_SYMBOL(copy_fpregs_to_fpstate);
+EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
@@ -137,7 +137,7 @@ void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 		 * Ignore return value -- we don't care if reg state
 		 * is clobbered.
 		 */
-		copy_fpregs_to_fpstate(&current->thread.fpu);
+		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
 
@@ -172,7 +172,7 @@ void fpu__save(struct fpu *fpu)
 	trace_x86_fpu_before_save(fpu);
 
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!copy_fpregs_to_fpstate(fpu)) {
+		if (!save_fpregs_to_fpstate(fpu)) {
 			copy_kernel_to_fpregs(&fpu->state);
 		}
 	}
@@ -255,7 +255,7 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
-	else if (!copy_fpregs_to_fpstate(dst_fpu))
+	else if (!save_fpregs_to_fpstate(dst_fpu))
 		copy_kernel_to_fpregs(&dst_fpu->state);
 
 	fpregs_unlock();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c25bf24..71bacb7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9637,7 +9637,7 @@ static void kvm_save_current_fpu(struct fpu *fpu)
 		memcpy(&fpu->state, &current->thread.fpu.state,
 		       fpu_kernel_xstate_size);
 	else
-		copy_fpregs_to_fpstate(fpu);
+		save_fpregs_to_fpstate(fpu);
 }
 
 /* Swap (qemu) user FPU context for the guest FPU context. */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate()
  2021-06-23 12:01 ` [patch V4 31/65] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov, Andy Lutomirski, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     522e92743b35351bda1b6a9136560f833a9c2490
Gitweb:        https://git.kernel.org/tip/522e92743b35351bda1b6a9136560f833a9c2490
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:58 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:26:00 +02:00

x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate()

copy_uabi_from_user_to_xstate() and copy_uabi_from_kernel_to_xstate() are
almost identical except for the copy function.

Unify them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20210623121454.414215896@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 137 +++++++++++-----------------------
 1 file changed, 47 insertions(+), 90 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 57674f8..0eb42a1 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -953,23 +953,6 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 }
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
-/*
- * Weird legacy quirk: SSE and YMM states store information in the
- * MXCSR and MXCSR_FLAGS fields of the FP area. That means if the FP
- * area is marked as unused in the xfeatures header, we need to copy
- * MXCSR and MXCSR_FLAGS if either SSE or YMM are in use.
- */
-static inline bool xfeatures_mxcsr_quirk(u64 xfeatures)
-{
-	if (!(xfeatures & (XFEATURE_MASK_SSE|XFEATURE_MASK_YMM)))
-		return false;
-
-	if (xfeatures & XFEATURE_MASK_FP)
-		return false;
-
-	return true;
-}
-
 static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
 			 void *init_xstate, unsigned int size)
 {
@@ -1082,39 +1065,53 @@ out:
 		membuf_zero(&to, to.left);
 }
 
-static inline bool mxcsr_valid(struct xstate_header *hdr, const u32 *mxcsr)
+static int copy_from_buffer(void *dst, unsigned int offset, unsigned int size,
+			    const void *kbuf, const void __user *ubuf)
 {
-	u64 mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
-
-	/* Only check if it is in use */
-	if (hdr->xfeatures & mask) {
-		/* Reserved bits in MXCSR must be zero. */
-		if (*mxcsr & ~mxcsr_feature_mask)
-			return false;
+	if (kbuf) {
+		memcpy(dst, kbuf + offset, size);
+	} else {
+		if (copy_from_user(dst, ubuf + offset, size))
+			return -EFAULT;
 	}
-	return true;
+	return 0;
 }
 
-/*
- * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
- * and copy to the target thread. This is called from xstateregs_set().
- */
-int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+
+static int copy_uabi_to_xstate(struct xregs_state *xsave, const void *kbuf,
+			       const void __user *ubuf)
 {
 	unsigned int offset, size;
-	int i;
 	struct xstate_header hdr;
+	u64 mask;
+	int i;
 
 	offset = offsetof(struct xregs_state, header);
-	size = sizeof(hdr);
-
-	memcpy(&hdr, kbuf + offset, size);
+	if (copy_from_buffer(&hdr, offset, sizeof(hdr), kbuf, ubuf))
+		return -EFAULT;
 
 	if (validate_user_xstate_header(&hdr))
 		return -EINVAL;
 
-	if (!mxcsr_valid(&hdr, kbuf + offsetof(struct fxregs_state, mxcsr)))
-		return -EINVAL;
+	/* Validate MXCSR when any of the related features is in use */
+	mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
+	if (hdr.xfeatures & mask) {
+		u32 mxcsr[2];
+
+		offset = offsetof(struct fxregs_state, mxcsr);
+		if (copy_from_buffer(mxcsr, offset, sizeof(mxcsr), kbuf, ubuf))
+			return -EFAULT;
+
+		/* Reserved bits in MXCSR must be zero. */
+		if (mxcsr[0] & ~mxcsr_feature_mask)
+			return -EINVAL;
+
+		/* SSE and YMM require MXCSR even when FP is not in use. */
+		if (!(hdr.xfeatures & XFEATURE_MASK_FP)) {
+			xsave->i387.mxcsr = mxcsr[0];
+			xsave->i387.mxcsr_mask = mxcsr[1];
+		}
+	}
 
 	for (i = 0; i < XFEATURE_MAX; i++) {
 		u64 mask = ((u64)1 << i);
@@ -1125,16 +1122,11 @@ int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
 
-			memcpy(dst, kbuf + offset, size);
+			if (copy_from_buffer(dst, offset, size, kbuf, ubuf))
+				return -EFAULT;
 		}
 	}
 
-	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
-		offset = offsetof(struct fxregs_state, mxcsr);
-		size = MXCSR_AND_FLAGS_SIZE;
-		memcpy(&xsave->i387.mxcsr, kbuf + offset, size);
-	}
-
 	/*
 	 * The state that came in from userspace was user-state only.
 	 * Mask all the user states out of 'xfeatures':
@@ -1150,6 +1142,16 @@ int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 }
 
 /*
+ * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S]
+ * format and copy to the target thread. This is called from
+ * xstateregs_set().
+ */
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+{
+	return copy_uabi_to_xstate(xsave, kbuf, NULL);
+}
+
+/*
  * Convert from a sigreturn standard-format user-space buffer to kernel
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
@@ -1157,52 +1159,7 @@ int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
 				      const void __user *ubuf)
 {
-	unsigned int offset, size;
-	int i;
-	struct xstate_header hdr;
-
-	offset = offsetof(struct xregs_state, header);
-	size = sizeof(hdr);
-
-	if (copy_from_user(&hdr, ubuf + offset, size))
-		return -EFAULT;
-
-	if (validate_user_xstate_header(&hdr))
-		return -EINVAL;
-
-	for (i = 0; i < XFEATURE_MAX; i++) {
-		u64 mask = ((u64)1 << i);
-
-		if (hdr.xfeatures & mask) {
-			void *dst = __raw_xsave_addr(xsave, i);
-
-			offset = xstate_offsets[i];
-			size = xstate_sizes[i];
-
-			if (copy_from_user(dst, ubuf + offset, size))
-				return -EFAULT;
-		}
-	}
-
-	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
-		offset = offsetof(struct fxregs_state, mxcsr);
-		size = MXCSR_AND_FLAGS_SIZE;
-		if (copy_from_user(&xsave->i387.mxcsr, ubuf + offset, size))
-			return -EFAULT;
-	}
-
-	/*
-	 * The state that came in from userspace was user-state only.
-	 * Mask all the user states out of 'xfeatures':
-	 */
-	xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR_ALL;
-
-	/*
-	 * Add back in the features that came in from userspace:
-	 */
-	xsave->header.xfeatures |= hdr.xfeatures;
-
-	return 0;
+	return copy_uabi_to_xstate(xsave, NULL, ubuf);
 }
 
 /**

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename xstate copy functions which are related to UABI
  2021-06-23 12:01 ` [patch V4 30/65] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     1cc34413ff3f18c30e5df89fefd95cc0f3b3292e
Gitweb:        https://git.kernel.org/tip/1cc34413ff3f18c30e5df89fefd95cc0f3b3292e
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:57 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:23:14 +02:00

x86/fpu: Rename xstate copy functions which are related to UABI

Rename them to reflect that these functions deal with user space format
XSAVE buffers.

      copy_kernel_to_xstate() -> copy_uabi_from_kernel_to_xstate()
      copy_user_to_xstate()   -> copy_sigframe_from_user_to_xstate()

Again a clear statement that these functions deal with user space ABI.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.318485015@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h | 4 ++--
 arch/x86/kernel/fpu/regset.c      | 2 +-
 arch/x86/kernel/fpu/signal.c      | 2 +-
 arch/x86/kernel/fpu/xstate.c      | 5 +++--
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 6611e06..00e1a2a 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -102,8 +102,8 @@ extern void __init update_regset_xstate_info(unsigned int size,
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
+int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index ddc290d..892aec1 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -166,7 +166,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 	}
 
 	fpu_force_restore(fpu);
-	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
+	ret = copy_uabi_from_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
 out:
 	vfree(tmpbuf);
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 430c66d..fd4b58d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -422,7 +422,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
 
-		ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
+		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
 			goto out;
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 4fca8a8..57674f8 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1099,7 +1099,7 @@ static inline bool mxcsr_valid(struct xstate_header *hdr, const u32 *mxcsr)
  * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
  */
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 {
 	unsigned int offset, size;
 	int i;
@@ -1154,7 +1154,8 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
  */
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
+int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
+				      const void __user *ubuf)
 {
 	unsigned int offset, size;
 	int i;

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename fregs-related copy functions
  2021-06-23 12:01 ` [patch V4 29/65] x86/fpu: Rename fregs related copy functions Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     6fdc908cb56123591baa4259400cfb0787582b11
Gitweb:        https://git.kernel.org/tip/6fdc908cb56123591baa4259400cfb0787582b11
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:56 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:20:27 +02:00

x86/fpu: Rename fregs-related copy functions

The function names for fnsave/fnrstor operations are horribly named and
a permanent source of confusion.

Rename:
	copy_kernel_to_fregs() to frstor()
	copy_fregs_to_user()   to fnsave_to_user_sigframe()
	copy_user_to_fregs()   to frstor_from_user_sigframe()

so it's clear what these are doing. All these functions are really low
level wrappers around the equally named instructions, so mapping to the
documentation is just natural.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.223594101@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 10 +++++-----
 arch/x86/kernel/fpu/core.c          |  2 +-
 arch/x86/kernel/fpu/signal.c        |  6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 7806f39..cd88233 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -124,7 +124,7 @@ static inline void fpstate_init_soft(struct swregs_state *soft) {}
 		     _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_fprestore)	\
 		     : output : input)
 
-static inline int copy_fregs_to_user(struct fregs_state __user *fx)
+static inline int fnsave_to_user_sigframe(struct fregs_state __user *fx)
 {
 	return user_insn(fnsave %[fx]; fwait,  [fx] "=m" (*fx), "m" (*fx));
 }
@@ -162,17 +162,17 @@ static inline int fxrstor_from_user_sigframe(struct fxregs_state __user *fx)
 		return user_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline void copy_kernel_to_fregs(struct fregs_state *fx)
+static inline void frstor(struct fregs_state *fx)
 {
 	kernel_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_kernel_to_fregs_err(struct fregs_state *fx)
+static inline int frstor_safe(struct fregs_state *fx)
 {
 	return kernel_insn_err(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_user_to_fregs(struct fregs_state __user *fx)
+static inline int frstor_from_user_sigframe(struct fregs_state __user *fx)
 {
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
@@ -385,7 +385,7 @@ static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask
 		if (use_fxsr())
 			fxrstor(&fpstate->fxsave);
 		else
-			copy_kernel_to_fregs(&fpstate->fsave);
+			frstor(&fpstate->fsave);
 	}
 }
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 035487d..1d25876 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -318,7 +318,7 @@ static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
 	else if (use_fxsr())
 		fxrstor(&init_fpstate.fxsave);
 	else
-		copy_kernel_to_fregs(&init_fpstate.fsave);
+		frstor(&init_fpstate.fsave);
 
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 05f8445..430c66d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -133,7 +133,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
 	else if (use_fxsr())
 		err = fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
 	else
-		err = copy_fregs_to_user((struct fregs_state __user *) buf);
+		err = fnsave_to_user_sigframe((struct fregs_state __user *) buf);
 
 	if (unlikely(err) && __clear_user(buf, fpu_user_xstate_size))
 		err = -EFAULT;
@@ -274,7 +274,7 @@ static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 	} else if (use_fxsr()) {
 		return fxrstor_from_user_sigframe(buf);
 	} else
-		return copy_user_to_fregs(buf);
+		return frstor_from_user_sigframe(buf);
 }
 
 static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
@@ -465,7 +465,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			goto out;
 
 		fpregs_lock();
-		ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
+		ret = frstor_safe(&fpu->state.fsave);
 	}
 	if (!ret)
 		fpregs_mark_activate();

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/math-emu: Rename frstor()
  2021-06-23 12:01 ` [patch V4 28/65] x86/math-emu: Rename frstor() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kernel test robot, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     872c65dbf669b3b471b3d8656391a6b4f736d22b
Gitweb:        https://git.kernel.org/tip/872c65dbf669b3b471b3d8656391a6b4f736d22b
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:55 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:16:33 +02:00

x86/math-emu: Rename frstor()

This is in the way of renaming the low level hardware accessors to match
the instruction name. Prepend it with FPU_ which is consistent vs. the
rest of the emulation code.

No functional change.

  [ bp: Correct the Reported-by: ]

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.111665161@linutronix.de
---
 arch/x86/math-emu/fpu_proto.h  | 2 +-
 arch/x86/math-emu/load_store.c | 2 +-
 arch/x86/math-emu/reg_ld_str.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/math-emu/fpu_proto.h b/arch/x86/math-emu/fpu_proto.h
index 70d35c2..94c4023 100644
--- a/arch/x86/math-emu/fpu_proto.h
+++ b/arch/x86/math-emu/fpu_proto.h
@@ -144,7 +144,7 @@ extern int FPU_store_int16(FPU_REG *st0_ptr, u_char st0_tag, short __user *d);
 extern int FPU_store_bcd(FPU_REG *st0_ptr, u_char st0_tag, u_char __user *d);
 extern int FPU_round_to_int(FPU_REG *r, u_char tag);
 extern u_char __user *fldenv(fpu_addr_modes addr_modes, u_char __user *s);
-extern void frstor(fpu_addr_modes addr_modes, u_char __user *data_address);
+extern void FPU_frstor(fpu_addr_modes addr_modes, u_char __user *data_address);
 extern u_char __user *fstenv(fpu_addr_modes addr_modes, u_char __user *d);
 extern void fsave(fpu_addr_modes addr_modes, u_char __user *data_address);
 extern int FPU_tagof(FPU_REG *ptr);
diff --git a/arch/x86/math-emu/load_store.c b/arch/x86/math-emu/load_store.c
index f15263e..4092df7 100644
--- a/arch/x86/math-emu/load_store.c
+++ b/arch/x86/math-emu/load_store.c
@@ -240,7 +240,7 @@ int FPU_load_store(u_char type, fpu_addr_modes addr_modes,
 		   fix-up operations. */
 		return 1;
 	case 022:		/* frstor m94/108byte */
-		frstor(addr_modes, (u_char __user *) data_address);
+		FPU_frstor(addr_modes, (u_char __user *) data_address);
 		/* Ensure that the values just loaded are not changed by
 		   fix-up operations. */
 		return 1;
diff --git a/arch/x86/math-emu/reg_ld_str.c b/arch/x86/math-emu/reg_ld_str.c
index 7ca6417..7e4521f 100644
--- a/arch/x86/math-emu/reg_ld_str.c
+++ b/arch/x86/math-emu/reg_ld_str.c
@@ -1117,7 +1117,7 @@ u_char __user *fldenv(fpu_addr_modes addr_modes, u_char __user *s)
 	return s;
 }
 
-void frstor(fpu_addr_modes addr_modes, u_char __user *data_address)
+void FPU_frstor(fpu_addr_modes addr_modes, u_char __user *data_address)
 {
 	int i, regnr;
 	u_char __user *s = fldenv(addr_modes, data_address);

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename fxregs-related copy functions
  2021-06-23 12:01 ` [patch V4 27/65] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     16dcf4385933a02bb21d0af86a04439d151ad42a
Gitweb:        https://git.kernel.org/tip/16dcf4385933a02bb21d0af86a04439d151ad42a
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:54 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:12:30 +02:00

x86/fpu: Rename fxregs-related copy functions

The function names for fxsave/fxrstor operations are horribly named and
a permanent source of confusion.

Rename:
	copy_fxregs_to_kernel() to fxsave()
	copy_kernel_to_fxregs() to fxrstor()
	copy_fxregs_to_user() to fxsave_to_user_sigframe()
	copy_user_to_fxregs() to fxrstor_from_user_sigframe()

so it's clear what these are doing. All these functions are really low
level wrappers around the equally named instructions, so mapping to the
documentation is just natural.

While at it, replace the static_cpu_has(X86_FEATURE_FXSR) with
use_fxsr() to be consistent with the rest of the code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121454.017863494@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 18 +++++-------------
 arch/x86/kernel/fpu/core.c          |  6 +++---
 arch/x86/kernel/fpu/signal.c        | 10 +++++-----
 3 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a2ab744..7806f39 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -129,7 +129,7 @@ static inline int copy_fregs_to_user(struct fregs_state __user *fx)
 	return user_insn(fnsave %[fx]; fwait,  [fx] "=m" (*fx), "m" (*fx));
 }
 
-static inline int copy_fxregs_to_user(struct fxregs_state __user *fx)
+static inline int fxsave_to_user_sigframe(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return user_insn(fxsave %[fx], [fx] "=m" (*fx), "m" (*fx));
@@ -138,7 +138,7 @@ static inline int copy_fxregs_to_user(struct fxregs_state __user *fx)
 
 }
 
-static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
+static inline void fxrstor(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		kernel_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -146,7 +146,7 @@ static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
 		kernel_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_kernel_to_fxregs_err(struct fxregs_state *fx)
+static inline int fxrstor_safe(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return kernel_insn_err(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -154,7 +154,7 @@ static inline int copy_kernel_to_fxregs_err(struct fxregs_state *fx)
 		return kernel_insn_err(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)
+static inline int fxrstor_from_user_sigframe(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return user_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -177,14 +177,6 @@ static inline int copy_user_to_fregs(struct fregs_state __user *fx)
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline void copy_fxregs_to_kernel(struct fpu *fpu)
-{
-	if (IS_ENABLED(CONFIG_X86_32))
-		asm volatile( "fxsave %[fx]" : [fx] "=m" (fpu->state.fxsave));
-	else
-		asm volatile("fxsaveq %[fx]" : [fx] "=m" (fpu->state.fxsave));
-}
-
 static inline void fxsave(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
@@ -391,7 +383,7 @@ static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask
 		os_xrstor(&fpstate->xsave, mask);
 	} else {
 		if (use_fxsr())
-			copy_kernel_to_fxregs(&fpstate->fxsave);
+			fxrstor(&fpstate->fxsave);
 		else
 			copy_kernel_to_fregs(&fpstate->fsave);
 	}
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index bfdcf7f..035487d 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -107,7 +107,7 @@ int copy_fpregs_to_fpstate(struct fpu *fpu)
 	}
 
 	if (likely(use_fxsr())) {
-		copy_fxregs_to_kernel(fpu);
+		fxsave(&fpu->state.fxsave);
 		return 1;
 	}
 
@@ -315,8 +315,8 @@ static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
 {
 	if (use_xsave())
 		os_xrstor(&init_fpstate.xsave, features_mask);
-	else if (static_cpu_has(X86_FEATURE_FXSR))
-		copy_kernel_to_fxregs(&init_fpstate.fxsave);
+	else if (use_fxsr())
+		fxrstor(&init_fpstate.fxsave);
 	else
 		copy_kernel_to_fregs(&init_fpstate.fsave);
 
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 4fe632f..05f8445 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -64,7 +64,7 @@ static inline int save_fsave_header(struct task_struct *tsk, void __user *buf)
 
 		fpregs_lock();
 		if (!test_thread_flag(TIF_NEED_FPU_LOAD))
-			copy_fxregs_to_kernel(&tsk->thread.fpu);
+			fxsave(&tsk->thread.fpu.state.fxsave);
 		fpregs_unlock();
 
 		convert_from_fxsr(&env, tsk);
@@ -131,7 +131,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
 	if (use_xsave())
 		err = xsave_to_user_sigframe(buf);
 	else if (use_fxsr())
-		err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
+		err = fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
 	else
 		err = copy_fregs_to_user((struct fregs_state __user *) buf);
 
@@ -259,7 +259,7 @@ static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 		if (fx_only) {
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
 
-			r = copy_user_to_fxregs(buf);
+			r = fxrstor_from_user_sigframe(buf);
 			if (!r)
 				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;
@@ -272,7 +272,7 @@ static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 			return r;
 		}
 	} else if (use_fxsr()) {
-		return copy_user_to_fxregs(buf);
+		return fxrstor_from_user_sigframe(buf);
 	} else
 		return copy_user_to_fregs(buf);
 }
@@ -458,7 +458,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			os_xrstor(&init_fpstate.xsave, init_bv);
 		}
 
-		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
+		ret = fxrstor_safe(&fpu->state.fxsave);
 	} else {
 		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user()
  2021-06-23 12:01 ` [patch V4 26/65] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     6b862ba1821441e6083cf061404694d33a841526
Gitweb:        https://git.kernel.org/tip/6b862ba1821441e6083cf061404694d33a841526
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:53 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 18:01:56 +02:00

x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user()

The function names for xsave[s]/xrstor[s] operations are horribly named and
a permanent source of confusion.

Rename:
	copy_xregs_to_user() to xsave_to_user_sigframe()
	copy_user_to_xregs() to xrstor_from_user_sigframe()

so it's entirely clear what this is about. This is also a clear indicator
of the potentially different storage format because this is user ABI and
cannot use compacted format.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.924266705@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 4 ++--
 arch/x86/kernel/fpu/signal.c        | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index dae1545..a2ab744 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -326,7 +326,7 @@ static inline void os_xrstor(struct xregs_state *xstate, u64 mask)
  * backward compatibility for old applications which don't understand
  * compacted format of xsave area.
  */
-static inline int copy_xregs_to_user(struct xregs_state __user *buf)
+static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
 	u64 mask = xfeatures_mask_user();
 	u32 lmask = mask;
@@ -351,7 +351,7 @@ static inline int copy_xregs_to_user(struct xregs_state __user *buf)
 /*
  * Restore xstate from user space xsave area.
  */
-static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
+static inline int xrstor_from_user_sigframe(struct xregs_state __user *buf, u64 mask)
 {
 	struct xregs_state *xstate = ((__force struct xregs_state *)buf);
 	u32 lmask = mask;
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 33675b3..4fe632f 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -129,7 +129,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
 	int err;
 
 	if (use_xsave())
-		err = copy_xregs_to_user(buf);
+		err = xsave_to_user_sigframe(buf);
 	else if (use_fxsr())
 		err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
 	else
@@ -266,7 +266,7 @@ static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
-			r = copy_user_to_xregs(buf, xbv);
+			r = xrstor_from_user_sigframe(buf, xbv);
 			if (!r && unlikely(init_bv))
 				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs()
  2021-06-23 12:01 ` [patch V4 25/65] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     b16313f71c1050ad5c92548925e0e9cec26989ab
Gitweb:        https://git.kernel.org/tip/b16313f71c1050ad5c92548925e0e9cec26989ab
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:52 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:57:57 +02:00

x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs()

The function names for xsave[s]/xrstor[s] operations are horribly named and
a permanent source of confusion.

Rename:
	copy_xregs_to_kernel() to os_xsave()
	copy_kernel_to_xregs() to os_xrstor()

These are truly low level wrappers around the actual instructions
XSAVE[OPT]/XRSTOR and XSAVES/XRSTORS with the twist that the selection
based on the available CPU features happens with an alternative to avoid
conditionals all over the place and to provide the best performance for hot
paths.

The os_ prefix tells that this is the OS selected mechanism.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.830239347@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 17 +++++++++++------
 arch/x86/kernel/fpu/core.c          |  7 +++----
 arch/x86/kernel/fpu/signal.c        | 21 +++++++++++----------
 arch/x86/kernel/fpu/xstate.c        |  2 +-
 4 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 854bfb0..dae1545 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -261,7 +261,7 @@ static inline void fxsave(struct fxregs_state *fx)
  * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
  */
-static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
+static inline void os_xrstor_booting(struct xregs_state *xstate)
 {
 	u64 mask = -1;
 	u32 lmask = mask;
@@ -284,8 +284,11 @@ static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
 
 /*
  * Save processor xstate to xsave area.
+ *
+ * Uses either XSAVE or XSAVEOPT or XSAVES depending on the CPU features
+ * and command line options. The choice is permanent until the next reboot.
  */
-static inline void copy_xregs_to_kernel(struct xregs_state *xstate)
+static inline void os_xsave(struct xregs_state *xstate)
 {
 	u64 mask = xfeatures_mask_all;
 	u32 lmask = mask;
@@ -302,8 +305,10 @@ static inline void copy_xregs_to_kernel(struct xregs_state *xstate)
 
 /*
  * Restore processor xstate from xsave area.
+ *
+ * Uses XRSTORS when XSAVES is used, XRSTOR otherwise.
  */
-static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask)
+static inline void os_xrstor(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
@@ -364,13 +369,13 @@ static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
  * Restore xstate from kernel space xsave area, return an error code instead of
  * an exception.
  */
-static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
+static inline int os_xrstor_safe(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
 
-	if (static_cpu_has(X86_FEATURE_XSAVES))
+	if (cpu_feature_enabled(X86_FEATURE_XSAVES))
 		XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
 	else
 		XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
@@ -383,7 +388,7 @@ extern int copy_fpregs_to_fpstate(struct fpu *fpu);
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
-		copy_kernel_to_xregs(&fpstate->xsave, mask);
+		os_xrstor(&fpstate->xsave, mask);
 	} else {
 		if (use_fxsr())
 			copy_kernel_to_fxregs(&fpstate->fxsave);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2243701..bfdcf7f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -95,7 +95,7 @@ EXPORT_SYMBOL(irq_fpu_usable);
 int copy_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
-		copy_xregs_to_kernel(&fpu->state.xsave);
+		os_xsave(&fpu->state.xsave);
 
 		/*
 		 * AVX512 state is tracked here because its use is
@@ -314,7 +314,7 @@ void fpu__drop(struct fpu *fpu)
 static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
 {
 	if (use_xsave())
-		copy_kernel_to_xregs(&init_fpstate.xsave, features_mask);
+		os_xrstor(&init_fpstate.xsave, features_mask);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
 		copy_kernel_to_fxregs(&init_fpstate.fxsave);
 	else
@@ -345,8 +345,7 @@ static void fpu__clear(struct fpu *fpu, bool user_only)
 	if (user_only) {
 		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
 		    xfeatures_mask_supervisor())
-			copy_kernel_to_xregs(&fpu->state.xsave,
-					     xfeatures_mask_supervisor());
+			os_xrstor(&fpu->state.xsave, xfeatures_mask_supervisor());
 		copy_init_fpstate_to_fpregs(xfeatures_mask_user());
 	} else {
 		copy_init_fpstate_to_fpregs(xfeatures_mask_all);
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 5010595..33675b3 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -261,14 +261,14 @@ static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 
 			r = copy_user_to_fxregs(buf);
 			if (!r)
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
 			r = copy_user_to_xregs(buf, xbv);
 			if (!r && unlikely(init_bv))
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				os_xrstor(&init_fpstate.xsave, init_bv);
 			return r;
 		}
 	} else if (use_fxsr()) {
@@ -356,9 +356,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			 * has been copied to the kernel one.
 			 */
 			if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
-			    xfeatures_mask_supervisor())
-				copy_kernel_to_xregs(&fpu->state.xsave,
-						     xfeatures_mask_supervisor());
+			    xfeatures_mask_supervisor()) {
+				os_xrstor(&fpu->state.xsave,
+					  xfeatures_mask_supervisor());
+			}
 			fpregs_mark_activate();
 			fpregs_unlock();
 			return 0;
@@ -412,7 +413,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
-			copy_xregs_to_kernel(&fpu->state.xsave);
+			os_xsave(&fpu->state.xsave);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
@@ -430,14 +431,14 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 
 		fpregs_lock();
 		if (unlikely(init_bv))
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			os_xrstor(&init_fpstate.xsave, init_bv);
 
 		/*
 		 * Restore previously saved supervisor xstates along with
 		 * copied-in user xstates.
 		 */
-		ret = copy_kernel_to_xregs_err(&fpu->state.xsave,
-					       user_xfeatures | xfeatures_mask_supervisor());
+		ret = os_xrstor_safe(&fpu->state.xsave,
+				     user_xfeatures | xfeatures_mask_supervisor());
 
 	} else if (use_fxsr()) {
 		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
@@ -454,7 +455,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			u64 init_bv;
 
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			os_xrstor(&init_fpstate.xsave, init_bv);
 		}
 
 		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 427977b..4fca8a8 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -400,7 +400,7 @@ static void __init setup_init_fpu_buf(void)
 	/*
 	 * Init all the features state with header.xfeatures being 0x0
 	 */
-	copy_kernel_to_xregs_booting(&init_fpstate.xsave);
+	os_xrstor_booting(&init_fpstate.xsave);
 
 	/*
 	 * All components are now in init state. Read the state back so

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Get rid of copy_supervisor_to_kernel()
  2021-06-23 12:01 ` [patch V4 24/65] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov, Andy Lutomirski, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     1f3171252dc586745bb548d48f3bcedfea34b58d
Gitweb:        https://git.kernel.org/tip/1f3171252dc586745bb548d48f3bcedfea34b58d
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:51 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:53:31 +02:00

x86/fpu: Get rid of copy_supervisor_to_kernel()

If the fast path of restoring the FPU state on sigreturn fails or is not
taken and the current task's FPU is active then the FPU has to be
deactivated for the slow path to allow a safe update of the tasks FPU
memory state.

With supervisor states enabled, this requires to save the supervisor state
in the memory state first. Supervisor states require XSAVES so saving only
the supervisor state requires to reshuffle the memory buffer because XSAVES
uses the compacted format and therefore stores the supervisor states at the
beginning of the memory state. That's just an overengineered optimization.

Get rid of it and save the full state for this case.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.734561971@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h |  1 +-
 arch/x86/kernel/fpu/signal.c      | 13 ++++---
 arch/x86/kernel/fpu/xstate.c      | 55 +------------------------------
 3 files changed, 8 insertions(+), 61 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 6e5ba42..6611e06 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -104,7 +104,6 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_supervisor_to_kernel(struct xregs_state *xsave);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 888c8e0..5010595 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -401,15 +401,18 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	 * the optimisation).
 	 */
 	fpregs_lock();
-
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-
 		/*
-		 * Supervisor states are not modified by user space input.  Save
-		 * current supervisor states first and invalidate the FPU regs.
+		 * If supervisor states are available then save the
+		 * hardware state in current's fpstate so that the
+		 * supervisor state is preserved. Save the full state for
+		 * simplicity. There is no point in optimizing this by only
+		 * saving the supervisor states and then shuffle them to
+		 * the right place in memory. This is the slow path and the
+		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
-			copy_supervisor_to_kernel(&fpu->state.xsave);
+			copy_xregs_to_kernel(&fpu->state.xsave);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 579e343..427977b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1204,61 +1204,6 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 	return 0;
 }
 
-/*
- * Save only supervisor states to the kernel buffer.  This blows away all
- * old states, and is intended to be used only in __fpu__restore_sig(), where
- * user states are restored from the user buffer.
- */
-void copy_supervisor_to_kernel(struct xregs_state *xstate)
-{
-	struct xstate_header *header;
-	u64 max_bit, min_bit;
-	u32 lmask, hmask;
-	int err, i;
-
-	if (WARN_ON(!boot_cpu_has(X86_FEATURE_XSAVES)))
-		return;
-
-	if (!xfeatures_mask_supervisor())
-		return;
-
-	max_bit = __fls(xfeatures_mask_supervisor());
-	min_bit = __ffs(xfeatures_mask_supervisor());
-
-	lmask = xfeatures_mask_supervisor();
-	hmask = xfeatures_mask_supervisor() >> 32;
-	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-
-	/* We should never fault when copying to a kernel buffer: */
-	if (WARN_ON_FPU(err))
-		return;
-
-	/*
-	 * At this point, the buffer has only supervisor states and must be
-	 * converted back to normal kernel format.
-	 */
-	header = &xstate->header;
-	header->xcomp_bv |= xfeatures_mask_all;
-
-	/*
-	 * This only moves states up in the buffer.  Start with
-	 * the last state and move backwards so that states are
-	 * not overwritten until after they are moved.  Note:
-	 * memmove() allows overlapping src/dst buffers.
-	 */
-	for (i = max_bit; i >= min_bit; i--) {
-		u8 *xbuf = (u8 *)xstate;
-
-		if (!((header->xfeatures >> i) & 1))
-			continue;
-
-		/* Move xfeature 'i' into its normal location */
-		memmove(xbuf + xstate_comp_offsets[i],
-			xbuf + xstate_supervisor_only_offsets[i],
-			xstate_sizes[i]);
-	}
-}
-
 /**
  * copy_dynamic_supervisor_to_kernel() - Save dynamic supervisor states to
  *                                       an xsave area

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Cleanup arch_set_user_pkey_access()
  2021-06-23 12:01 ` [patch V4 23/65] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     9fe8a6f5eed8fff6b2d7dbc99b911334e311732d
Gitweb:        https://git.kernel.org/tip/9fe8a6f5eed8fff6b2d7dbc99b911334e311732d
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:50 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:52:41 +02:00

x86/fpu: Cleanup arch_set_user_pkey_access()

The function does a sanity check with a WARN_ON_ONCE() but happily proceeds
when the pkey argument is out of range.

Clean it up.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.635764326@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 185cc5d..579e343 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -912,11 +912,10 @@ EXPORT_SYMBOL_GPL(get_xsave_addr);
  * rights for @pkey to @init_val.
  */
 int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
-		unsigned long init_val)
+			      unsigned long init_val)
 {
-	u32 old_pkru;
-	int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
-	u32 new_pkru_bits = 0;
+	u32 old_pkru, new_pkru_bits = 0;
+	int pkey_shift;
 
 	/*
 	 * This check implies XSAVE support.  OSPKE only gets
@@ -930,7 +929,8 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	 * values originating from in-kernel users.  Complain
 	 * if a bad value is observed.
 	 */
-	WARN_ON_ONCE(pkey >= arch_max_pkey());
+	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
+		return -EINVAL;
 
 	/* Set the bits we need in PKRU:  */
 	if (init_val & PKEY_DISABLE_ACCESS)
@@ -939,6 +939,7 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		new_pkru_bits |= PKRU_WD_BIT;
 
 	/* Shift the bits in to the correct place in PKRU for pkey: */
+	pkey_shift = pkey * PKRU_BITS_PER_PKEY;
 	new_pkru_bits <<= pkey_shift;
 
 	/* Get old PKRU and mask off any old bits in place: */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/kvm: Avoid looking up PKRU in XSAVE buffer
  2021-06-23 12:01 ` [patch V4 22/65] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Dave Hansen
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Dave Hansen @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Hansen, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     71ef453355a9197fcfd8ff22391a4ad7861d79e6
Gitweb:        https://git.kernel.org/tip/71ef453355a9197fcfd8ff22391a4ad7861d79e6
Author:        Dave Hansen <dave.hansen@linux.intel.com>
AuthorDate:    Wed, 23 Jun 2021 14:01:49 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:47 +02:00

x86/kvm: Avoid looking up PKRU in XSAVE buffer

PKRU is being removed from the kernel XSAVE/FPU buffers.  This removal
will probably include warnings for code that look up PKRU in those
buffers.

KVM currently looks up the location of PKRU but doesn't even use the
pointer that it gets back.  Rework the code to avoid calling
get_xsave_addr() except in cases where its result is actually used.

This makes the code more clear and also avoids the inevitable PKRU
warnings.

This is probably a good cleanup and could go upstream idependently
of any PKRU rework.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.541037562@linutronix.de
---
 arch/x86/kvm/x86.c | 45 ++++++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e0f4a46..c25bf24 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4604,20 +4604,21 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *src = get_xsave_addr(xsave, xfeature_nr);
-
-		if (src) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(dest + offset, &vcpu->arch.pkru,
-				       sizeof(vcpu->arch.pkru));
-			else
-				memcpy(dest + offset, src, size);
+		void *src;
+
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
 
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(dest + offset, &vcpu->arch.pkru,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			src = get_xsave_addr(xsave, xfeature_nr);
+			if (src)
+				memcpy(dest + offset, src, size);
 		}
 
 		valid -= xfeature_mask;
@@ -4647,18 +4648,20 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *dest = get_xsave_addr(xsave, xfeature_nr);
-
-		if (dest) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(&vcpu->arch.pkru, src + offset,
-				       sizeof(vcpu->arch.pkru));
-			else
+
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
+
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(&vcpu->arch.pkru, src + offset,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			void *dest = get_xsave_addr(xsave, xfeature_nr);
+
+			if (dest)
 				memcpy(dest, src + offset, size);
 		}
 

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Get rid of using_compacted_format()
  2021-06-23 12:01 ` [patch V4 21/65] x86/fpu: Get rid of using_compacted_format() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     02b93c0b00df222b9ccf7a1fbd0eb59353d0a58c
Gitweb:        https://git.kernel.org/tip/02b93c0b00df222b9ccf7a1fbd0eb59353d0a58c
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:48 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:47 +02:00

x86/fpu: Get rid of using_compacted_format()

This function is pointlessly global and a complete misnomer because it's
usage is related to both supervisor state checks and compacted format
checks. Remove it and just make the conditions check the XSAVES feature.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.425493349@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h |  1 -
 arch/x86/kernel/fpu/xstate.c      | 22 ++++------------------
 2 files changed, 4 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 732ae79..6e5ba42 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -101,7 +101,6 @@ extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
-int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index e17fde9..185cc5d 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -449,20 +449,6 @@ int xfeature_size(int xfeature_nr)
 	return eax;
 }
 
-/*
- * 'XSAVES' implies two different things:
- * 1. saving of supervisor/system state
- * 2. using the compacted format
- *
- * Use this function when dealing with the compacted format so
- * that it is obvious which aspect of 'XSAVES' is being handled
- * by the calling code.
- */
-int using_compacted_format(void)
-{
-	return boot_cpu_has(X86_FEATURE_XSAVES);
-}
-
 /* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
 static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
@@ -581,9 +567,9 @@ static void do_extra_xstate_size_checks(void)
 		check_xstate_against_struct(i);
 		/*
 		 * Supervisor state components can be managed only by
-		 * XSAVES, which is compacted-format only.
+		 * XSAVES.
 		 */
-		if (!using_compacted_format())
+		if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
 			XSTATE_WARN_ON(xfeature_is_supervisor(i));
 
 		/* Align from the end of the previous feature */
@@ -593,9 +579,9 @@ static void do_extra_xstate_size_checks(void)
 		 * The offset of a given state in the non-compacted
 		 * format is given to us in a CPUID leaf.  We check
 		 * them for being ordered (increasing offsets) in
-		 * setup_xstate_features().
+		 * setup_xstate_features(). XSAVES uses compacted format.
 		 */
-		if (!using_compacted_format())
+		if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
 			paranoid_xstate_size = xfeature_uncompacted_offset(i);
 		/*
 		 * The compacted-format offset always depends on where

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/regset: Move fpu__read_begin() into regset
  2021-06-23 12:01 ` [patch V4 19/65] x86/fpu/regset: Move fpu__read_begin() into regset Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     5a32fac8dbe8adc08c10e2c8770c95aebfc627cd
Gitweb:        https://git.kernel.org/tip/5a32fac8dbe8adc08c10e2c8770c95aebfc627cd
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:46 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:47 +02:00

x86/fpu/regset: Move fpu__read_begin() into regset

The function can only be used from the regset get() callbacks safely. So
there is no reason to have it globally exposed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.234942936@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h |  1 -
 arch/x86/kernel/fpu/core.c          | 20 --------------------
 arch/x86/kernel/fpu/regset.c        | 22 +++++++++++++++++++---
 3 files changed, 19 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 0148791..4c669f3 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -26,7 +26,6 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__prepare_read(struct fpu *fpu);
 extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 9ff467b..d390644 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -282,26 +282,6 @@ static void fpu__initialize(struct fpu *fpu)
 }
 
 /*
- * This function must be called before we read a task's fpstate.
- *
- * There's two cases where this gets called:
- *
- * - for the current task (when coredumping), in which case we have
- *   to save the latest FPU registers into the fpstate,
- *
- * - or it's called for stopped tasks (ptrace), in which case the
- *   registers were already saved by the context-switch code when
- *   the task scheduled out.
- *
- * If the task has used the FPU before then save it.
- */
-void fpu__prepare_read(struct fpu *fpu)
-{
-	if (fpu == &current->thread.fpu)
-		fpu__save(fpu);
-}
-
-/*
  * This function must be called before we write a task's fpstate.
  *
  * Invalidate any cached FPU registers.
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 4b799e4..937adf7 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -28,6 +28,22 @@ int regset_xregset_fpregs_active(struct task_struct *target, const struct user_r
 		return 0;
 }
 
+/*
+ * The regset get() functions are invoked from:
+ *
+ *   - coredump to dump the current task's fpstate. If the current task
+ *     owns the FPU then the memory state has to be synchronized and the
+ *     FPU register state preserved. Otherwise fpstate is already in sync.
+ *
+ *   - ptrace to dump fpstate of a stopped task, in which case the registers
+ *     have already been saved to fpstate on context switch.
+ */
+static void sync_fpstate(struct fpu *fpu)
+{
+	if (fpu == &current->thread.fpu)
+		fpu__save(fpu);
+}
+
 int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
@@ -36,7 +52,7 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 	if (!cpu_feature_enabled(X86_FEATURE_FXSR))
 		return -ENODEV;
 
-	fpu__prepare_read(fpu);
+	sync_fpstate(fpu);
 
 	if (!use_xsave()) {
 		return membuf_write(&to, &fpu->state.fxsave,
@@ -96,7 +112,7 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 	if (!cpu_feature_enabled(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	fpu__prepare_read(fpu);
+	sync_fpstate(fpu);
 
 	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
 	return 0;
@@ -287,7 +303,7 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
 	struct user_i387_ia32_struct env;
 	struct fxregs_state fxsave, *fx;
 
-	fpu__prepare_read(fpu);
+	sync_fpstate(fpu);
 
 	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return fpregs_soft_get(target, regset, to);

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Move fpu__write_begin() to regset
  2021-06-23 12:01 ` [patch V4 20/65] x86/fpu: Move fpu__write_begin() to regset Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     dbb60ac764581e62f2116c5a6b8926ba3a872dd4
Gitweb:        https://git.kernel.org/tip/dbb60ac764581e62f2116c5a6b8926ba3a872dd4
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:47 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:47 +02:00

x86/fpu: Move fpu__write_begin() to regset

The only usecase for fpu__write_begin is the set() callback of regset, so
the function is pointlessly global.

Move it to the regset code and rename it to fpu_force_restore() which is
exactly decribing what the function does.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.328652975@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h |  1 -
 arch/x86/kernel/fpu/core.c          | 24 ------------------------
 arch/x86/kernel/fpu/regset.c        | 25 ++++++++++++++++++++++---
 3 files changed, 22 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 4c669f3..854bfb0 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -26,7 +26,6 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d390644..2243701 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -282,30 +282,6 @@ static void fpu__initialize(struct fpu *fpu)
 }
 
 /*
- * This function must be called before we write a task's fpstate.
- *
- * Invalidate any cached FPU registers.
- *
- * After this function call, after registers in the fpstate are
- * modified and the child task has woken up, the child task will
- * restore the modified FPU state from the modified context. If we
- * didn't clear its cached status here then the cached in-registers
- * state pending on its former CPU could be restored, corrupting
- * the modifications.
- */
-void fpu__prepare_write(struct fpu *fpu)
-{
-	/*
-	 * Only stopped child tasks can be used to modify the FPU
-	 * state in the fpstate buffer:
-	 */
-	WARN_ON_FPU(fpu == &current->thread.fpu);
-
-	/* Invalidate any cached state: */
-	__fpu_invalidate_fpregs_state(fpu);
-}
-
-/*
  * Drops current FPU state: deactivates the fpregs and
  * the fpstate. NOTE: it still leaves previous contents
  * in the fpregs in the eager-FPU case.
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 937adf7..ddc290d 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -44,6 +44,25 @@ static void sync_fpstate(struct fpu *fpu)
 		fpu__save(fpu);
 }
 
+/*
+ * Invalidate cached FPU registers before modifying the stopped target
+ * task's fpstate.
+ *
+ * This forces the target task on resume to restore the FPU registers from
+ * modified fpstate. Otherwise the task might skip the restore and operate
+ * with the cached FPU registers which discards the modifications.
+ */
+static void fpu_force_restore(struct fpu *fpu)
+{
+	/*
+	 * Only stopped child tasks can be used to modify the FPU
+	 * state in the fpstate buffer:
+	 */
+	WARN_ON_FPU(fpu == &current->thread.fpu);
+
+	__fpu_invalidate_fpregs_state(fpu);
+}
+
 int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
@@ -88,7 +107,7 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
 	if (newstate.mxcsr & ~mxcsr_feature_mask)
 		return -EINVAL;
 
-	fpu__prepare_write(fpu);
+	fpu_force_restore(fpu);
 
 	/* Copy the state  */
 	memcpy(&fpu->state.fxsave, &newstate, sizeof(newstate));
@@ -146,7 +165,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 		}
 	}
 
-	fpu__prepare_write(fpu);
+	fpu_force_restore(fpu);
 	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
 out:
@@ -346,7 +365,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 	if (ret)
 		return ret;
 
-	fpu__prepare_write(fpu);
+	fpu_force_restore(fpu);
 
 	if (cpu_feature_enabled(X86_FEATURE_FXSR))
 		convert_to_fxsr(&fpu->state.fxsave, &env);

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Remove fpstate_sanitize_xstate()
  2021-06-23 12:01 ` [patch V4 18/65] x86/fpu: Remove fpstate_sanitize_xstate() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     afac9e894364418731d1d7e66c1118b31fd130e8
Gitweb:        https://git.kernel.org/tip/afac9e894364418731d1d7e66c1118b31fd130e8
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:45 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:47 +02:00

x86/fpu: Remove fpstate_sanitize_xstate()

No more users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.124819167@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h |  2 +-
 arch/x86/kernel/fpu/xstate.c        | 79 +----------------------------
 2 files changed, 81 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 9ed9846..0148791 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -86,8 +86,6 @@ extern void fpstate_init_soft(struct swregs_state *soft);
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-extern void fpstate_sanitize_xstate(struct fpu *fpu);
-
 #define user_insn(insn, output, input...)				\
 ({									\
 	int err;							\
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 8d023a2..e17fde9 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -129,85 +129,6 @@ static bool xfeature_is_supervisor(int xfeature_nr)
 }
 
 /*
- * When executing XSAVEOPT (or other optimized XSAVE instructions), if
- * a processor implementation detects that an FPU state component is still
- * (or is again) in its initialized state, it may clear the corresponding
- * bit in the header.xfeatures field, and can skip the writeout of registers
- * to the corresponding memory layout.
- *
- * This means that when the bit is zero, the state component might still contain
- * some previous - non-initialized register state.
- *
- * Before writing xstate information to user-space we sanitize those components,
- * to always ensure that the memory layout of a feature will be in the init state
- * if the corresponding header bit is zero. This is to ensure that user-space doesn't
- * see some stale state in the memory layout during signal handling, debugging etc.
- */
-void fpstate_sanitize_xstate(struct fpu *fpu)
-{
-	struct fxregs_state *fx = &fpu->state.fxsave;
-	int feature_bit;
-	u64 xfeatures;
-
-	if (!use_xsaveopt())
-		return;
-
-	xfeatures = fpu->state.xsave.header.xfeatures;
-
-	/*
-	 * None of the feature bits are in init state. So nothing else
-	 * to do for us, as the memory layout is up to date.
-	 */
-	if ((xfeatures & xfeatures_mask_all) == xfeatures_mask_all)
-		return;
-
-	/*
-	 * FP is in init state
-	 */
-	if (!(xfeatures & XFEATURE_MASK_FP)) {
-		fx->cwd = 0x37f;
-		fx->swd = 0;
-		fx->twd = 0;
-		fx->fop = 0;
-		fx->rip = 0;
-		fx->rdp = 0;
-		memset(fx->st_space, 0, sizeof(fx->st_space));
-	}
-
-	/*
-	 * SSE is in init state
-	 */
-	if (!(xfeatures & XFEATURE_MASK_SSE))
-		memset(fx->xmm_space, 0, sizeof(fx->xmm_space));
-
-	/*
-	 * First two features are FPU and SSE, which above we handled
-	 * in a special way already:
-	 */
-	feature_bit = 0x2;
-	xfeatures = (xfeatures_mask_user() & ~xfeatures) >> 2;
-
-	/*
-	 * Update all the remaining memory layouts according to their
-	 * standard xstate layout, if their header bit is in the init
-	 * state:
-	 */
-	while (xfeatures) {
-		if (xfeatures & 0x1) {
-			int offset = xstate_comp_offsets[feature_bit];
-			int size = xstate_sizes[feature_bit];
-
-			memcpy((void *)fx + offset,
-			       (void *)&init_fpstate.xsave + offset,
-			       size);
-		}
-
-		xfeatures >>= 1;
-		feature_bit++;
-	}
-}
-
-/*
  * Enable the extended processor state save/restore feature.
  * Called once per CPU onlining.
  */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get()
  2021-06-23 12:01 ` [patch V4 17/65] x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     3f7f75634ccefefcc929696f346db7a748e78f79
Gitweb:        https://git.kernel.org/tip/3f7f75634ccefefcc929696f346db7a748e78f79
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:44 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:47 +02:00

x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get()

Use the new functionality of copy_xstate_to_uabi_buf() to retrieve the
FX state when XSAVE* is in use. This avoids to overwrite the FPU state
buffer with fpstate_sanitize_xstate() which is error prone and duplicated
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121453.014441775@linutronix.de
---
 arch/x86/kernel/fpu/regset.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index ccbe25f..4b799e4 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -210,10 +210,10 @@ static inline u32 twd_fxsr_to_i387(struct fxregs_state *fxsave)
  * FXSR floating point environment conversions.
  */
 
-void
-convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
+static void __convert_from_fxsr(struct user_i387_ia32_struct *env,
+				struct task_struct *tsk,
+				struct fxregs_state *fxsave)
 {
-	struct fxregs_state *fxsave = &tsk->thread.fpu.state.fxsave;
 	struct _fpreg *to = (struct _fpreg *) &env->st_space[0];
 	struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0];
 	int i;
@@ -247,6 +247,12 @@ convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
 		memcpy(&to[i], &from[i], sizeof(to[0]));
 }
 
+void
+convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
+{
+	__convert_from_fxsr(env, tsk, &tsk->thread.fpu.state.fxsave);
+}
+
 void convert_to_fxsr(struct fxregs_state *fxsave,
 		     const struct user_i387_ia32_struct *env)
 
@@ -279,25 +285,29 @@ int fpregs_get(struct task_struct *target, const struct user_regset *regset,
 {
 	struct fpu *fpu = &target->thread.fpu;
 	struct user_i387_ia32_struct env;
+	struct fxregs_state fxsave, *fx;
 
 	fpu__prepare_read(fpu);
 
-	if (!boot_cpu_has(X86_FEATURE_FPU))
+	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return fpregs_soft_get(target, regset, to);
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR)) {
+	if (!cpu_feature_enabled(X86_FEATURE_FXSR)) {
 		return membuf_write(&to, &fpu->state.fsave,
 				    sizeof(struct fregs_state));
 	}
 
-	fpstate_sanitize_xstate(fpu);
+	if (use_xsave()) {
+		struct membuf mb = { .p = &fxsave, .left = sizeof(fxsave) };
 
-	if (to.left == sizeof(env)) {
-		convert_from_fxsr(to.p, target);
-		return 0;
+		/* Handle init state optimized xstate correctly */
+		copy_xstate_to_uabi_buf(mb, &fpu->state.xsave, XSTATE_COPY_FP);
+		fx = &fxsave;
+	} else {
+		fx = &fpu->state.fxsave;
 	}
 
-	convert_from_fxsr(&env, target);
+	__convert_from_fxsr(&env, target, fx);
 	return membuf_write(&to, &env, sizeof(env));
 }
 

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get()
  2021-06-23 12:01 ` [patch V4 16/65] x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     adc997b3d66d1cfa8c15a7dbafdaef239a51b5db
Gitweb:        https://git.kernel.org/tip/adc997b3d66d1cfa8c15a7dbafdaef239a51b5db
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:43 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:47 +02:00

x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get()

Use the new functionality of copy_xstate_to_uabi_buf() to retrieve the
FX state when XSAVE* is in use. This avoids overwriting the FPU state
buffer with fpstate_sanitize_xstate() which is error prone and duplicated
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.901736860@linutronix.de
---
 arch/x86/kernel/fpu/regset.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 783f84d..ccbe25f 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -33,13 +33,18 @@ int xfpregs_get(struct task_struct *target, const struct user_regset *regset,
 {
 	struct fpu *fpu = &target->thread.fpu;
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
+	if (!cpu_feature_enabled(X86_FEATURE_FXSR))
 		return -ENODEV;
 
 	fpu__prepare_read(fpu);
-	fpstate_sanitize_xstate(fpu);
 
-	return membuf_write(&to, &fpu->state.fxsave, sizeof(struct fxregs_state));
+	if (!use_xsave()) {
+		return membuf_write(&to, &fpu->state.fxsave,
+				    sizeof(fpu->state.fxsave));
+	}
+
+	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_FX);
+	return 0;
 }
 
 int xfpregs_set(struct task_struct *target, const struct user_regset *regset,

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()
  2021-06-23 12:01 ` [patch V4 15/65] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  2021-06-24 15:09   ` [PATCH] x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again Thomas Gleixner
  1 sibling, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     eb6f51723f03c9a1c098ed196a31a03e626b9fb6
Gitweb:        https://git.kernel.org/tip/eb6f51723f03c9a1c098ed196a31a03e626b9fb6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:42 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:47 +02:00

x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()

When xsave with init state optimization is used then a component's state
in the task's xsave buffer can be stale when the corresponding feature bit
is not set.

fpregs_get() and xfpregs_get() invoke fpstate_sanitize_xstate() to update
the task's xsave buffer before retrieving the FX or FP state. That's just
duplicated code as copy_xstate_to_kernel() already handles this correctly.

Add a copy mode argument to the function which allows to restrict the state
copy to the FP and SSE features.

Also rename the function to copy_xstate_to_uabi_buf() so the name reflects
what it is doing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.805327286@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h | 12 +++++++--
 arch/x86/kernel/fpu/regset.c      |  2 +-
 arch/x86/kernel/fpu/xstate.c      | 42 ++++++++++++++++++++++--------
 3 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 1bb2d16..732ae79 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -103,12 +103,20 @@ extern void __init update_regset_xstate_info(unsigned int size,
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
-struct membuf;
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
 void copy_supervisor_to_kernel(struct xregs_state *xsave);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
+enum xstate_copy_mode {
+	XSTATE_COPY_FP,
+	XSTATE_COPY_FX,
+	XSTATE_COPY_XSAVE,
+};
+
+struct membuf;
+void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
+			     enum xstate_copy_mode mode);
+
 #endif
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 7041b14..783f84d 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -93,7 +93,7 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 
 	fpu__prepare_read(fpu);
 
-	copy_xstate_to_kernel(to, &fpu->state.xsave);
+	copy_xstate_to_uabi_buf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
 	return 0;
 }
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 4203247..8d023a2 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1068,14 +1068,20 @@ static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
 	membuf_write(to, from_xstate ? xstate : init_xstate, size);
 }
 
-/*
- * Convert from kernel XSAVE or XSAVES compacted format to UABI
- * non-compacted format and copy to a kernel-space ptrace buffer.
+/**
+ * copy_xstate_to_uabi_buf - Copy kernel saved xstate to a UABI buffer
+ * @to:		membuf descriptor
+ * @xsave:	The kernel xstate buffer to copy from
+ * @copy_mode:	The requested copy mode
+ *
+ * Converts from kernel XSAVE or XSAVES compacted format to UABI conforming
+ * format, i.e. from the kernel internal hardware dependent storage format
+ * to the requested @mode. UABI XSTATE is always uncompacted!
  *
- * It supports partial copy but pos always starts from zero. This is called
- * from xstateregs_get() and there we check the CPU has XSAVE.
+ * It supports partial copy but @to.pos always starts from zero.
  */
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
+void copy_xstate_to_uabi_buf(struct membuf to, struct xregs_state *xsave,
+			     enum xstate_copy_mode copy_mode)
 {
 	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
 	struct xregs_state *xinit = &init_fpstate.xsave;
@@ -1083,12 +1089,22 @@ void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 	unsigned int zerofrom;
 	int i;
 
-	/*
-	 * The destination is a ptrace buffer; we put in only user xstates:
-	 */
-	memset(&header, 0, sizeof(header));
 	header.xfeatures = xsave->header.xfeatures;
-	header.xfeatures &= xfeatures_mask_user();
+
+	/* Mask out the feature bits depending on copy mode */
+	switch (copy_mode) {
+	case XSTATE_COPY_FP:
+		header.xfeatures &= XFEATURE_MASK_FP;
+		break;
+
+	case XSTATE_COPY_FX:
+		header.xfeatures &= XFEATURE_MASK_FP | XFEATURE_MASK_SSE;
+		break;
+
+	case XSTATE_COPY_XSAVE:
+		header.xfeatures &= xfeatures_mask_user();
+		break;
+	}
 
 	/* Copy FP state up to MXCSR */
 	copy_feature(header.xfeatures & XFEATURE_MASK_FP, &to, &xsave->i387,
@@ -1109,6 +1125,9 @@ void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 		     &to, &xsave->i387.xmm_space, &xinit->i387.xmm_space,
 		     sizeof(xsave->i387.xmm_space));
 
+	if (copy_mode != XSTATE_COPY_XSAVE)
+		goto out;
+
 	/* Zero the padding area */
 	membuf_zero(&to, sizeof(xsave->i387.padding));
 
@@ -1150,6 +1169,7 @@ void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 		zerofrom = xstate_offsets[i] + xstate_sizes[i];
 	}
 
+out:
 	if (to.left)
 		membuf_zero(&to, to.left);
 }

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Clean up fpregs_set()
  2021-06-23 12:01 ` [patch V4 14/65] x86/fpu: Clean up fpregs_set() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Andy Lutomirski
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Andy Lutomirski @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     da53f60bb86e60830932926cf1093953a811912c
Gitweb:        https://git.kernel.org/tip/da53f60bb86e60830932926cf1093953a811912c
Author:        Andy Lutomirski <luto@kernel.org>
AuthorDate:    Wed, 23 Jun 2021 14:01:41 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Clean up fpregs_set()

fpregs_set() has unnecessary complexity to support short or nonzero-offset
writes and to handle the case in which a copy from userspace overwrites
some of the target buffer and then fails.  Support for partial writes is
useless -- just require that the write has offset 0 and the correct size,
and copy into a temporary kernel buffer to avoid clobbering the state if
the user access fails.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.710467587@linutronix.de
---
 arch/x86/kernel/fpu/regset.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 5610f77..7041b14 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -304,31 +304,32 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 	struct user_i387_ia32_struct env;
 	int ret;
 
-	fpu__prepare_write(fpu);
-	fpstate_sanitize_xstate(fpu);
+	/* No funny business with partial or oversized writes is permitted. */
+	if (pos != 0 || count != sizeof(struct user_i387_ia32_struct))
+		return -EINVAL;
 
-	if (!boot_cpu_has(X86_FEATURE_FPU))
+	if (!cpu_feature_enabled(X86_FEATURE_FPU))
 		return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
-		return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-					  &fpu->state.fsave, 0,
-					  -1);
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
+	if (ret)
+		return ret;
 
-	if (pos > 0 || count < sizeof(env))
-		convert_from_fxsr(&env, target);
+	fpu__prepare_write(fpu);
 
-	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
-	if (!ret)
-		convert_to_fxsr(&target->thread.fpu.state.fxsave, &env);
+	if (cpu_feature_enabled(X86_FEATURE_FXSR))
+		convert_to_fxsr(&fpu->state.fxsave, &env);
+	else
+		memcpy(&fpu->state.fsave, &env, sizeof(env));
 
 	/*
-	 * update the header bit in the xsave header, indicating the
+	 * Update the header bit in the xsave header, indicating the
 	 * presence of FP.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE))
+	if (cpu_feature_enabled(X86_FEATURE_XSAVE))
 		fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FP;
-	return ret;
+
+	return 0;
 }
 
 #endif	/* CONFIG_X86_32 || CONFIG_IA32_EMULATION */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values
  2021-06-23 12:01 ` [patch V4 13/65] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Andy Lutomirski
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Andy Lutomirski @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     145e9e0d8c6fada4a40f9fc65b34658077874d9c
Gitweb:        https://git.kernel.org/tip/145e9e0d8c6fada4a40f9fc65b34658077874d9c
Author:        Andy Lutomirski <luto@kernel.org>
AuthorDate:    Wed, 23 Jun 2021 14:01:40 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values

There is no benefit from accepting and silently changing an invalid MXCSR
value supplied via ptrace().  Instead, return -EINVAL on invalid input.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.613614842@linutronix.de
---
 arch/x86/kernel/fpu/regset.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index f24ce87..5610f77 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -63,8 +63,9 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
 	if (ret)
 		return ret;
 
-	/* Mask invalid MXCSR bits (for historical reasons). */
-	newstate.mxcsr &= mxcsr_feature_mask;
+	/* Do not allow an invalid MXCSR value. */
+	if (newstate.mxcsr & ~mxcsr_feature_mask)
+		return -EINVAL;
 
 	fpu__prepare_write(fpu);
 

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Rewrite xfpregs_set()
  2021-06-23 12:01 ` [patch V4 12/65] x86/fpu: Rewrite xfpregs_set() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Andy Lutomirski
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Andy Lutomirski @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     6164331d15f7d912fb9369245368e9564ea49813
Gitweb:        https://git.kernel.org/tip/6164331d15f7d912fb9369245368e9564ea49813
Author:        Andy Lutomirski <luto@kernel.org>
AuthorDate:    Wed, 23 Jun 2021 14:01:39 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Rewrite xfpregs_set()

xfpregs_set() was incomprehensible.  Almost all of the complexity was due
to trying to support nonsensically sized writes or -EFAULT errors that
would have partially or completely overwritten the destination before
failing.  Nonsensically sized input would only have been possible using
PTRACE_SETREGSET on REGSET_XFP.  Fortunately, it appears (based on Debian
code search results) that no one uses that API at all, let alone with the
wrong sized buffer.  Failed user access can be handled more cleanly by
first copying to kernel memory.

Just rewrite it to require sensible input.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.504234607@linutronix.de
---
 arch/x86/kernel/fpu/regset.c | 37 +++++++++++++++++++++--------------
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index d60e77d..f24ce87 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -47,30 +47,39 @@ int xfpregs_set(struct task_struct *target, const struct user_regset *regset,
 		const void *kbuf, const void __user *ubuf)
 {
 	struct fpu *fpu = &target->thread.fpu;
+	struct user32_fxsr_struct newstate;
 	int ret;
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
+	BUILD_BUG_ON(sizeof(newstate) != sizeof(struct fxregs_state));
+
+	if (!cpu_feature_enabled(X86_FEATURE_FXSR))
 		return -ENODEV;
 
+	/* No funny business with partial or oversized writes is permitted. */
+	if (pos != 0 || count != sizeof(newstate))
+		return -EINVAL;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &newstate, 0, -1);
+	if (ret)
+		return ret;
+
+	/* Mask invalid MXCSR bits (for historical reasons). */
+	newstate.mxcsr &= mxcsr_feature_mask;
+
 	fpu__prepare_write(fpu);
-	fpstate_sanitize_xstate(fpu);
 
-	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-				 &fpu->state.fxsave, 0, -1);
+	/* Copy the state  */
+	memcpy(&fpu->state.fxsave, &newstate, sizeof(newstate));
 
-	/*
-	 * mxcsr reserved bits must be masked to zero for security reasons.
-	 */
-	fpu->state.fxsave.mxcsr &= mxcsr_feature_mask;
+	/* Clear xmm8..15 */
+	BUILD_BUG_ON(sizeof(fpu->state.fxsave.xmm_space) != 16 * 16);
+	memset(&fpu->state.fxsave.xmm_space[8], 0, 8 * 16);
 
-	/*
-	 * update the header bits in the xsave header, indicating the
-	 * presence of FP and SSE state.
-	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE))
+	/* Mark FP and SSE as in use when XSAVE is enabled */
+	if (use_xsave())
 		fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FPSSE;
 
-	return ret;
+	return 0;
 }
 
 int xstateregs_get(struct task_struct *target, const struct user_regset *regset,

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Simplify PTRACE_GETREGS code
  2021-06-23 12:01 ` [patch V4 11/65] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Dave Hansen
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Dave Hansen @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Hansen, Thomas Gleixner, Borislav Petkov, Andy Lutomirski,
	x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     3a3351126ee8f1f1c86c4c79c60a650c1da89733
Gitweb:        https://git.kernel.org/tip/3a3351126ee8f1f1c86c4c79c60a650c1da89733
Author:        Dave Hansen <dave.hansen@linux.intel.com>
AuthorDate:    Wed, 23 Jun 2021 14:01:38 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Simplify PTRACE_GETREGS code

ptrace() has interfaces that let a ptracer inspect a ptracee's register state.
This includes XSAVE state.  The ptrace() ABI includes a hardware-format XSAVE
buffer for both the SETREGS and GETREGS interfaces.

In the old days, the kernel buffer and the ptrace() ABI buffer were the
same boring non-compacted format.  But, since the advent of supervisor
states and the compacted format, the kernel buffer has diverged from the
format presented in the ABI.

This leads to two paths in the kernel:
1. Effectively a verbatim copy_to_user() which just copies the kernel buffer
   out to userspace.  This is used when the kernel buffer is kept in the
   non-compacted form which means that it shares a format with the ptrace
   ABI.
2. A one-state-at-a-time path: copy_xstate_to_kernel().  This is theoretically
   slower since it does a bunch of piecemeal copies.

Remove the verbatim copy case.  Speed probably does not matter in this path,
and the vast majority of new hardware will use the one-state-at-a-time path
anyway.  This ensures greater testing for the "slow" path.

This also makes enabling PKRU in this interface easier since a single path
can be patched instead of two.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.408457100@linutronix.de
---
 arch/x86/kernel/fpu/regset.c | 24 +++---------------------
 arch/x86/kernel/fpu/xstate.c |  6 +++---
 2 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index a50c0a9..d60e77d 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -77,32 +77,14 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
 	struct fpu *fpu = &target->thread.fpu;
-	struct xregs_state *xsave;
 
-	if (!boot_cpu_has(X86_FEATURE_XSAVE))
+	if (!cpu_feature_enabled(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	xsave = &fpu->state.xsave;
-
 	fpu__prepare_read(fpu);
 
-	if (using_compacted_format()) {
-		copy_xstate_to_kernel(to, xsave);
-		return 0;
-	} else {
-		fpstate_sanitize_xstate(fpu);
-		/*
-		 * Copy the 48 bytes defined by the software into the xsave
-		 * area in the thread struct, so that we can copy the whole
-		 * area to user using one user_regset_copyout().
-		 */
-		memcpy(&xsave->i387.sw_reserved, xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes));
-
-		/*
-		 * Copy the xstate memory layout.
-		 */
-		return membuf_write(&to, xsave, fpu_user_xstate_size);
-	}
+	copy_xstate_to_kernel(to, &fpu->state.xsave);
+	return 0;
 }
 
 int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 9cf84c5..4203247 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1069,11 +1069,11 @@ static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
 }
 
 /*
- * Convert from kernel XSAVES compacted format to standard format and copy
- * to a kernel-space ptrace buffer.
+ * Convert from kernel XSAVE or XSAVES compacted format to UABI
+ * non-compacted format and copy to a kernel-space ptrace buffer.
  *
  * It supports partial copy but pos always starts from zero. This is called
- * from xstateregs_get() and there we check the CPU has XSAVES.
+ * from xstateregs_get() and there we check the CPU has XSAVE.
  */
 void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 {

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate()
  2021-06-23 12:01 ` [patch V4 10/65] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     947f4947cf00ea1e6d319eb182c64ea51ba4de8d
Gitweb:        https://git.kernel.org/tip/947f4947cf00ea1e6d319eb182c64ea51ba4de8d
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:37 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate()

Instead of masking out reserved bits, check them and reject the provided
state as invalid if not zero.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.308388343@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 2b7b579..9cf84c5 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1154,6 +1154,19 @@ void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 		membuf_zero(&to, to.left);
 }
 
+static inline bool mxcsr_valid(struct xstate_header *hdr, const u32 *mxcsr)
+{
+	u64 mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
+
+	/* Only check if it is in use */
+	if (hdr->xfeatures & mask) {
+		/* Reserved bits in MXCSR must be zero. */
+		if (*mxcsr & ~mxcsr_feature_mask)
+			return false;
+	}
+	return true;
+}
+
 /*
  * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
@@ -1172,6 +1185,9 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 	if (validate_user_xstate_header(&hdr))
 		return -EINVAL;
 
+	if (!mxcsr_valid(&hdr, kbuf + offsetof(struct fxregs_state, mxcsr)))
+		return -EINVAL;
+
 	for (i = 0; i < XFEATURE_MAX; i++) {
 		u64 mask = ((u64)1 << i);
 
@@ -1202,9 +1218,6 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 	 */
 	xsave->header.xfeatures |= hdr.xfeatures;
 
-	/* mxcsr reserved bits must be masked to zero for historical reasons. */
-	xsave->i387.mxcsr &= mxcsr_feature_mask;
-
 	return 0;
 }
 

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Sanitize xstateregs_set()
  2021-06-23 12:01 ` [patch V4 09/65] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  2022-07-14  4:04   ` [patch V4 09/65] " Andrei Vagin
  1 sibling, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     43be46e89698a41dbf4fff81a322f4c2ae21b5e2
Gitweb:        https://git.kernel.org/tip/43be46e89698a41dbf4fff81a322f4c2ae21b5e2
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:36 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Sanitize xstateregs_set()

xstateregs_set() operates on a stopped task and tries to copy the provided
buffer into the task's fpu.state.xsave buffer.

Any error while copying or invalid state detected after copying results in
wiping the target task's FPU state completely including supervisor states.

That's just wrong. The caller supplied invalid data or has a problem with
unmapped memory, so there is absolutely no justification to corrupt the
target state.

Fix this with the following modifications:

 1) If data has to be copied from userspace, allocate a buffer and copy from
    user first.

 2) Use copy_kernel_to_xstate() unconditionally so that header checking
    works correctly.

 3) Return on error without corrupting the target state.

This prevents corrupting states and lets the caller deal with the problem
it caused in the first place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.214903673@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h |  4 +---
 arch/x86/kernel/fpu/regset.c      | 42 ++++++++++++------------------
 arch/x86/kernel/fpu/xstate.c      | 14 +++++-----
 3 files changed, 25 insertions(+), 35 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index d22e973..1bb2d16 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -111,8 +111,4 @@ void copy_supervisor_to_kernel(struct xregs_state *xsave);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
-
-/* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_user_xstate_header(const struct xstate_header *hdr);
-
 #endif
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 6bb8744..a50c0a9 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -2,11 +2,13 @@
 /*
  * FPU register's regset abstraction, for ptrace, core dumps, etc.
  */
+#include <linux/sched/task_stack.h>
+#include <linux/vmalloc.h>
+
 #include <asm/fpu/internal.h>
 #include <asm/fpu/signal.h>
 #include <asm/fpu/regset.h>
 #include <asm/fpu/xstate.h>
-#include <linux/sched/task_stack.h>
 
 /*
  * The xstateregs_active() routine is the same as the regset_fpregs_active() routine,
@@ -108,10 +110,10 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 		  const void *kbuf, const void __user *ubuf)
 {
 	struct fpu *fpu = &target->thread.fpu;
-	struct xregs_state *xsave;
+	struct xregs_state *tmpbuf = NULL;
 	int ret;
 
-	if (!boot_cpu_has(X86_FEATURE_XSAVE))
+	if (!cpu_feature_enabled(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
 	/*
@@ -120,32 +122,22 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 	if (pos != 0 || count != fpu_user_xstate_size)
 		return -EFAULT;
 
-	xsave = &fpu->state.xsave;
-
-	fpu__prepare_write(fpu);
+	if (!kbuf) {
+		tmpbuf = vmalloc(count);
+		if (!tmpbuf)
+			return -ENOMEM;
 
-	if (using_compacted_format()) {
-		if (kbuf)
-			ret = copy_kernel_to_xstate(xsave, kbuf);
-		else
-			ret = copy_user_to_xstate(xsave, ubuf);
-	} else {
-		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
-		if (!ret)
-			ret = validate_user_xstate_header(&xsave->header);
+		if (copy_from_user(tmpbuf, ubuf, count)) {
+			ret = -EFAULT;
+			goto out;
+		}
 	}
 
-	/*
-	 * mxcsr reserved bits must be masked to zero for security reasons.
-	 */
-	xsave->i387.mxcsr &= mxcsr_feature_mask;
-
-	/*
-	 * In case of failure, mark all states as init:
-	 */
-	if (ret)
-		fpstate_init(&fpu->state);
+	fpu__prepare_write(fpu);
+	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
+out:
+	vfree(tmpbuf);
 	return ret;
 }
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index b82879c..2b7b579 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -543,7 +543,7 @@ int using_compacted_format(void)
 }
 
 /* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_user_xstate_header(const struct xstate_header *hdr)
+static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or supervisor features may be set */
 	if (hdr->xfeatures & ~xfeatures_mask_user())
@@ -1155,7 +1155,7 @@ void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 }
 
 /*
- * Convert from a ptrace standard-format kernel buffer to kernel XSAVES format
+ * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
  */
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
@@ -1202,14 +1202,16 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 	 */
 	xsave->header.xfeatures |= hdr.xfeatures;
 
+	/* mxcsr reserved bits must be masked to zero for historical reasons. */
+	xsave->i387.mxcsr &= mxcsr_feature_mask;
+
 	return 0;
 }
 
 /*
- * Convert from a ptrace or sigreturn standard-format user-space buffer to
- * kernel XSAVES format and copy to the target thread. This is called from
- * xstateregs_set(), as well as potentially from the sigreturn() and
- * rt_sigreturn() system calls.
+ * Convert from a sigreturn standard-format user-space buffer to kernel
+ * XSAVE[S] format and copy to the target thread. This is called from the
+ * sigreturn() and rt_sigreturn() system calls.
  */
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 {

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Limit xstate copy size in xstateregs_set()
  2021-06-23 12:01 ` [patch V4 08/65] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov, Andy Lutomirski, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     07d6688b22e09be465652cf2da0da6bf86154df6
Gitweb:        https://git.kernel.org/tip/07d6688b22e09be465652cf2da0da6bf86154df6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:35 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Limit xstate copy size in xstateregs_set()

If the count argument is larger than the xstate size, this will happily
copy beyond the end of xstate.

Fixes: 91c3dba7dbc1 ("x86/fpu/xstate: Fix PTRACE frames for XSAVES")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.120741557@linutronix.de
---
 arch/x86/kernel/fpu/regset.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index c413756..6bb8744 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -117,7 +117,7 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 	/*
 	 * A whole standard-format XSAVE buffer is needed:
 	 */
-	if ((pos != 0) || (count < fpu_user_xstate_size))
+	if (pos != 0 || count != fpu_user_xstate_size)
 		return -EFAULT;
 
 	xsave = &fpu->state.xsave;

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Move inlines where they belong
  2021-06-23 12:01 ` [patch V4 07/65] x86/fpu: Move inlines where they belong Thomas Gleixner
  2021-06-23 18:43   ` Bae, Chang Seok
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  1 sibling, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     e68524456c855e500f0a636adb1aa977e1e0b4d8
Gitweb:        https://git.kernel.org/tip/e68524456c855e500f0a636adb1aa977e1e0b4d8
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:34 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Move inlines where they belong

They are only used in fpstate_init() and there is no point to have them in
a header just to make reading the code harder.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121452.023118522@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 14 --------------
 arch/x86/kernel/fpu/core.c          | 15 +++++++++++++++
 2 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 58f2082..9ed9846 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -86,20 +86,6 @@ extern void fpstate_init_soft(struct swregs_state *soft);
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-static inline void fpstate_init_xstate(struct xregs_state *xsave)
-{
-	/*
-	 * XRSTORS requires these bits set in xcomp_bv, or it will
-	 * trigger #GP:
-	 */
-	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
-}
-
-static inline void fpstate_init_fxstate(struct fxregs_state *fx)
-{
-	fx->cwd = 0x37f;
-	fx->mxcsr = MXCSR_DEFAULT;
-}
 extern void fpstate_sanitize_xstate(struct fpu *fpu);
 
 #define user_insn(insn, output, input...)				\
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 571220a..9ff467b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -181,6 +181,21 @@ void fpu__save(struct fpu *fpu)
 	fpregs_unlock();
 }
 
+static inline void fpstate_init_xstate(struct xregs_state *xsave)
+{
+	/*
+	 * XRSTORS requires these bits set in xcomp_bv, or it will
+	 * trigger #GP:
+	 */
+	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
+}
+
+static inline void fpstate_init_fxstate(struct fxregs_state *fx)
+{
+	fx->cwd = 0x37f;
+	fx->mxcsr = MXCSR_DEFAULT;
+}
+
 /*
  * Legacy x87 fpstate state init:
  */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Remove unused get_xsave_field_ptr()
  2021-06-23 12:01 ` [patch V4 06/65] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov, Andy Lutomirski, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     4098b3eef37be19572d270f9b761c3e8ffcf37ac
Gitweb:        https://git.kernel.org/tip/4098b3eef37be19572d270f9b761c3e8ffcf37ac
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:33 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Remove unused get_xsave_field_ptr()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121451.915614415@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h |  1 +-
 arch/x86/kernel/fpu/xstate.c      | 30 +------------------------------
 2 files changed, 31 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 47a9223..d22e973 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -101,7 +101,6 @@ extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
-const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 struct membuf;
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 5e825ff..b82879c 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -998,36 +998,6 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
 
-/*
- * This wraps up the common operations that need to occur when retrieving
- * data from xsave state.  It first ensures that the current task was
- * using the FPU and retrieves the data in to a buffer.  It then calculates
- * the offset of the requested field in the buffer.
- *
- * This function is safe to call whether the FPU is in use or not.
- *
- * Note that this only works on the current task.
- *
- * Inputs:
- *	@xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
- *	XFEATURE_SSE, etc...)
- * Output:
- *	address of the state in the xsave area or NULL if the state
- *	is not present or is in its 'init state'.
- */
-const void *get_xsave_field_ptr(int xfeature_nr)
-{
-	struct fpu *fpu = &current->thread.fpu;
-
-	/*
-	 * fpu__save() takes the CPU's xstate registers
-	 * and saves them off to the 'fpu memory buffer.
-	 */
-	fpu__save(fpu);
-
-	return get_xsave_addr(&fpu->state.xsave, xfeature_nr);
-}
-
 #ifdef CONFIG_ARCH_HAS_PKEYS
 
 /*

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Get rid of fpu__get_supported_xfeatures_mask()
  2021-06-23 12:01 ` [patch V4 05/65] x86/fpu: Get rid of fpu__get_supported_xfeatures_mask() Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     ce38f038ede735fd425ebda10d1758420a669a87
Gitweb:        https://git.kernel.org/tip/ce38f038ede735fd425ebda10d1758420a669a87
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:32 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:46 +02:00

x86/fpu: Get rid of fpu__get_supported_xfeatures_mask()

This function is really not doing what the comment advertises:

 "Find supported xfeatures based on cpu features and command-line input.
  This must be called after fpu__init_parse_early_param() is called and
  xfeatures_mask is enumerated."

fpu__init_parse_early_param() does not exist anymore and the function just
returns a constant.

Remove it and fix the caller and get rid of further references to
fpu__init_parse_early_param().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121451.816404717@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h |  1 -
 arch/x86/kernel/cpu/common.c        |  5 ++---
 arch/x86/kernel/fpu/init.c          | 11 -----------
 arch/x86/kernel/fpu/xstate.c        |  4 +++-
 4 files changed, 5 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 16bf4d4..58f2082 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -45,7 +45,6 @@ extern void fpu__init_cpu_xstate(void);
 extern void fpu__init_system(struct cpuinfo_x86 *c);
 extern void fpu__init_check_bugs(void);
 extern void fpu__resume_cpu(void);
-extern u64 fpu__get_supported_xfeatures_mask(void);
 
 /*
  * Debugging facility:
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index dcbb11f..f875dea 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1715,9 +1715,8 @@ void print_cpu_info(struct cpuinfo_x86 *c)
 }
 
 /*
- * clearcpuid= was already parsed in fpu__init_parse_early_param.
- * But we need to keep a dummy __setup around otherwise it would
- * show up as an environment variable for init.
+ * clearcpuid= was already parsed in cpu_parse_early_param().  This dummy
+ * function prevents it from becoming an environment variable for init.
  */
 static __init int setup_clearcpuid(char *arg)
 {
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 95aa109..64e2992 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -216,17 +216,6 @@ static void __init fpu__init_system_xstate_size_legacy(void)
 	fpu_user_xstate_size = fpu_kernel_xstate_size;
 }
 
-/*
- * Find supported xfeatures based on cpu features and command-line input.
- * This must be called after fpu__init_parse_early_param() is called and
- * xfeatures_mask is enumerated.
- */
-u64 __init fpu__get_supported_xfeatures_mask(void)
-{
-	return XFEATURE_MASK_USER_SUPPORTED |
-	       XFEATURE_MASK_SUPERVISOR_SUPPORTED;
-}
-
 /* Legacy code to initialize eager fpu mode. */
 static void __init fpu__init_system_ctx_switch(void)
 {
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index a64f61a..5e825ff 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -868,7 +868,9 @@ void __init fpu__init_system_xstate(void)
 			xfeatures_mask_all &= ~BIT_ULL(i);
 	}
 
-	xfeatures_mask_all &= fpu__get_supported_xfeatures_mask();
+	xfeatures_mask_all &= XFEATURE_MASK_USER_SUPPORTED |
+			      XFEATURE_MASK_SUPERVISOR_SUPPORTED;
+
 	/* Store it for paranoia check at the end */
 	xfeatures = xfeatures_mask_all;
 

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Make xfeatures_mask_all __ro_after_init
  2021-06-23 12:01 ` [patch V4 04/65] x86/fpu: Make xfeatures_mask_all __ro_after_init Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kan Liang, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     4e8e4313cf81add679e1c57677d689c02e382a67
Gitweb:        https://git.kernel.org/tip/4e8e4313cf81add679e1c57677d689c02e382a67
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:31 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:45 +02:00

x86/fpu: Make xfeatures_mask_all __ro_after_init

Nothing has to modify this after init.

But of course there is code which unconditionally masks
xfeatures_mask_all on CPU hotplug. This goes unnoticed during boot
hotplug because at that point the variable is still RW mapped.

This is broken in several ways:

  1) Masking this in post init CPU hotplug means that any
     modification of this state goes unnoticed until actual hotplug
     happens.

  2) If that ever happens then these bogus feature bits are already
     populated all over the place and the system is in inconsistent state
     vs. the compacted XSTATE offsets. If at all then this has to panic the
     machine because the inconsistency cannot be undone anymore.

Make this a one-time paranoia check in xstate init code and disable
xsave when this happens.

Reported-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121451.712803952@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index bcb9f56..a64f61a 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -59,7 +59,7 @@ static short xsave_cpuid_features[] __initdata = {
  * This represents the full set of bits that should ever be set in a kernel
  * XSAVE buffer, both supervisor and user xstates.
  */
-u64 xfeatures_mask_all __read_mostly;
+u64 xfeatures_mask_all __ro_after_init;
 
 static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
@@ -213,19 +213,8 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
  */
 void fpu__init_cpu_xstate(void)
 {
-	u64 unsup_bits;
-
 	if (!boot_cpu_has(X86_FEATURE_XSAVE) || !xfeatures_mask_all)
 		return;
-	/*
-	 * Unsupported supervisor xstates should not be found in
-	 * the xfeatures mask.
-	 */
-	unsup_bits = xfeatures_mask_all & XFEATURE_MASK_SUPERVISOR_UNSUPPORTED;
-	WARN_ONCE(unsup_bits, "x86/fpu: Found unsupported supervisor xstates: 0x%llx\n",
-		  unsup_bits);
-
-	xfeatures_mask_all &= ~XFEATURE_MASK_SUPERVISOR_UNSUPPORTED;
 
 	cr4_set_bits(X86_CR4_OSXSAVE);
 
@@ -825,6 +814,7 @@ void __init fpu__init_system_xstate(void)
 {
 	unsigned int eax, ebx, ecx, edx;
 	static int on_boot_cpu __initdata = 1;
+	u64 xfeatures;
 	int err;
 	int i;
 
@@ -879,6 +869,8 @@ void __init fpu__init_system_xstate(void)
 	}
 
 	xfeatures_mask_all &= fpu__get_supported_xfeatures_mask();
+	/* Store it for paranoia check at the end */
+	xfeatures = xfeatures_mask_all;
 
 	/* Enable xstate instructions to be able to continue with initialization: */
 	fpu__init_cpu_xstate();
@@ -896,8 +888,18 @@ void __init fpu__init_system_xstate(void)
 	setup_init_fpu_buf();
 	setup_xstate_comp_offsets();
 	setup_supervisor_only_offsets();
-	print_xstate_offset_size();
 
+	/*
+	 * Paranoia check whether something in the setup modified the
+	 * xfeatures mask.
+	 */
+	if (xfeatures != xfeatures_mask_all) {
+		pr_err("x86/fpu: xfeatures modified from 0x%016llx to 0x%016llx during init, disabling XSAVE\n",
+		       xfeatures, xfeatures_mask_all);
+		goto out_disable;
+	}
+
+	print_xstate_offset_size();
 	pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n",
 		xfeatures_mask_all,
 		fpu_kernel_xstate_size,

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Mark various FPU state variables __ro_after_init
  2021-06-23 12:01 ` [patch V4 03/65] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov, Andy Lutomirski, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     ce578f16348b003675c928a1992498b33b515f18
Gitweb:        https://git.kernel.org/tip/ce578f16348b003675c928a1992498b33b515f18
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:30 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:45 +02:00

x86/fpu: Mark various FPU state variables __ro_after_init

Nothing modifies these after booting.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20210623121451.611751529@linutronix.de
---
 arch/x86/kernel/fpu/init.c   |  4 ++--
 arch/x86/kernel/fpu/xstate.c | 14 +++++++++-----
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 701f196..95aa109 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -89,7 +89,7 @@ static void fpu__init_system_early_generic(struct cpuinfo_x86 *c)
 /*
  * Boot time FPU feature detection code:
  */
-unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
+unsigned int mxcsr_feature_mask __ro_after_init = 0xffffffffu;
 EXPORT_SYMBOL_GPL(mxcsr_feature_mask);
 
 static void __init fpu__init_system_mxcsr(void)
@@ -135,7 +135,7 @@ static void __init fpu__init_system_generic(void)
  * This is inherent to the XSAVE architecture which puts all state
  * components into a single, continuous memory block:
  */
-unsigned int fpu_kernel_xstate_size;
+unsigned int fpu_kernel_xstate_size __ro_after_init;
 EXPORT_SYMBOL_GPL(fpu_kernel_xstate_size);
 
 /* Get alignment of the TYPE. */
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index d822ce7..bcb9f56 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -61,17 +61,21 @@ static short xsave_cpuid_features[] __initdata = {
  */
 u64 xfeatures_mask_all __read_mostly;
 
-static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 
 /*
  * The XSAVE area of kernel can be in standard or compacted format;
  * it is always in standard format for user mode. This is the user
  * mode standard format size used for signal and ptrace frames.
  */
-unsigned int fpu_user_xstate_size;
+unsigned int fpu_user_xstate_size __ro_after_init;
 
 /*
  * Return whether the system supports a given xfeature.

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")
  2021-06-23 12:01 ` [patch V4 02/65] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     b3607269ff57fd3c9690cb25962c5e4b91a0fd3b
Gitweb:        https://git.kernel.org/tip/b3607269ff57fd3c9690cb25962c5e4b91a0fd3b
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:29 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:45 +02:00

x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")

This cannot work and it's unclear how that ever made a difference.

init_fpstate.xsave.header.xfeatures is always 0 so get_xsave_addr() will
always return a NULL pointer, which will prevent storing the default PKRU
value in init_fpstate.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121451.451391598@linutronix.de
---
 arch/x86/kernel/cpu/common.c | 5 -----
 arch/x86/mm/pkeys.c          | 6 ------
 2 files changed, 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 9173dd4..dcbb11f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,8 +466,6 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
-	struct pkru_state *pk;
-
 	/* check the boot processor, plus compile options for PKU: */
 	if (!cpu_feature_enabled(X86_FEATURE_PKU))
 		return;
@@ -478,9 +476,6 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 		return;
 
 	cr4_set_bits(X86_CR4_PKE);
-	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
-	if (pk)
-		pk->pkru = init_pkru_value;
 	/*
 	 * Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
 	 * cpuid bit to be set.  We need to ensure that we
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index a2332ee..a840ac7 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,7 +10,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/fpu/internal.h>		/* init_fpstate			*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -154,7 +153,6 @@ static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf,
 static ssize_t init_pkru_write_file(struct file *file,
 		 const char __user *user_buf, size_t count, loff_t *ppos)
 {
-	struct pkru_state *pk;
 	char buf[32];
 	ssize_t len;
 	u32 new_init_pkru;
@@ -177,10 +175,6 @@ static ssize_t init_pkru_write_file(struct file *file,
 		return -EINVAL;
 
 	WRITE_ONCE(init_pkru_value, new_init_pkru);
-	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
-	if (!pk)
-		return -EINVAL;
-	pk->pkru = new_init_pkru;
 	return count;
 }
 

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu: Fix copy_xstate_to_kernel() gap handling
  2021-06-23 12:01 ` [patch V4 01/65] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
@ 2021-06-23 22:09   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-23 22:09 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     9625895011d130033d1bc7aac0d77a9bf68ff8a6
Gitweb:        https://git.kernel.org/tip/9625895011d130033d1bc7aac0d77a9bf68ff8a6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 23 Jun 2021 14:01:28 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 23 Jun 2021 17:49:45 +02:00

x86/fpu: Fix copy_xstate_to_kernel() gap handling

The gap handling in copy_xstate_to_kernel() is wrong when XSAVES is in
use.

Using init_fpstate for copying the init state of features which are
not set in the xstate header is only correct for the legacy area, but
not for the extended features area because when XSAVES is in use then
init_fpstate is in compacted form which means the xstate offsets which
are used to copy from init_fpstate are not valid.

Fortunately, this is not a real problem today because all extended
features in use have an all-zeros init state, but it is wrong
nevertheless and with a potentially dynamically sized init_fpstate this
would result in an access outside of the init_fpstate.

Fix this by keeping track of the last copied state in the target buffer and
explicitly zero it when there is a feature or alignment gap.

Use the compacted offset when accessing the extended feature space in
init_fpstate.

As this is not a functional issue on older kernels this is intentionally
not tagged for stable.

Fixes: b8be15d58806 ("x86/fpu/xstate: Re-enable XSAVES")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210623121451.294282032@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 105 +++++++++++++++++++---------------
 1 file changed, 61 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index f0f64d4..d822ce7 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1084,20 +1084,10 @@ static inline bool xfeatures_mxcsr_quirk(u64 xfeatures)
 	return true;
 }
 
-static void fill_gap(struct membuf *to, unsigned *last, unsigned offset)
+static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
+			 void *init_xstate, unsigned int size)
 {
-	if (*last >= offset)
-		return;
-	membuf_write(to, (void *)&init_fpstate.xsave + *last, offset - *last);
-	*last = offset;
-}
-
-static void copy_part(struct membuf *to, unsigned *last, unsigned offset,
-		      unsigned size, void *from)
-{
-	fill_gap(to, last, offset);
-	membuf_write(to, from, size);
-	*last = offset + size;
+	membuf_write(to, from_xstate ? xstate : init_xstate, size);
 }
 
 /*
@@ -1109,10 +1099,10 @@ static void copy_part(struct membuf *to, unsigned *last, unsigned offset,
  */
 void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 {
+	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
+	struct xregs_state *xinit = &init_fpstate.xsave;
 	struct xstate_header header;
-	const unsigned off_mxcsr = offsetof(struct fxregs_state, mxcsr);
-	unsigned size = to.left;
-	unsigned last = 0;
+	unsigned int zerofrom;
 	int i;
 
 	/*
@@ -1122,41 +1112,68 @@ void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 	header.xfeatures = xsave->header.xfeatures;
 	header.xfeatures &= xfeatures_mask_user();
 
-	if (header.xfeatures & XFEATURE_MASK_FP)
-		copy_part(&to, &last, 0, off_mxcsr, &xsave->i387);
-	if (header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM))
-		copy_part(&to, &last, off_mxcsr,
-			  MXCSR_AND_FLAGS_SIZE, &xsave->i387.mxcsr);
-	if (header.xfeatures & XFEATURE_MASK_FP)
-		copy_part(&to, &last, offsetof(struct fxregs_state, st_space),
-			  128, &xsave->i387.st_space);
-	if (header.xfeatures & XFEATURE_MASK_SSE)
-		copy_part(&to, &last, xstate_offsets[XFEATURE_SSE],
-			  256, &xsave->i387.xmm_space);
-	/*
-	 * Fill xsave->i387.sw_reserved value for ptrace frame:
-	 */
-	copy_part(&to, &last, offsetof(struct fxregs_state, sw_reserved),
-		  48, xstate_fx_sw_bytes);
-	/*
-	 * Copy xregs_state->header:
-	 */
-	copy_part(&to, &last, offsetof(struct xregs_state, header),
-		  sizeof(header), &header);
+	/* Copy FP state up to MXCSR */
+	copy_feature(header.xfeatures & XFEATURE_MASK_FP, &to, &xsave->i387,
+		     &xinit->i387, off_mxcsr);
+
+	/* Copy MXCSR when SSE or YMM are set in the feature mask */
+	copy_feature(header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM),
+		     &to, &xsave->i387.mxcsr, &xinit->i387.mxcsr,
+		     MXCSR_AND_FLAGS_SIZE);
+
+	/* Copy the remaining FP state */
+	copy_feature(header.xfeatures & XFEATURE_MASK_FP,
+		     &to, &xsave->i387.st_space, &xinit->i387.st_space,
+		     sizeof(xsave->i387.st_space));
+
+	/* Copy the SSE state - shared with YMM, but independently managed */
+	copy_feature(header.xfeatures & XFEATURE_MASK_SSE,
+		     &to, &xsave->i387.xmm_space, &xinit->i387.xmm_space,
+		     sizeof(xsave->i387.xmm_space));
+
+	/* Zero the padding area */
+	membuf_zero(&to, sizeof(xsave->i387.padding));
+
+	/* Copy xsave->i387.sw_reserved */
+	membuf_write(&to, xstate_fx_sw_bytes, sizeof(xsave->i387.sw_reserved));
+
+	/* Copy the user space relevant state of @xsave->header */
+	membuf_write(&to, &header, sizeof(header));
+
+	zerofrom = offsetof(struct xregs_state, extended_state_area);
 
 	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
 		/*
-		 * Copy only in-use xstates:
+		 * The ptrace buffer is in non-compacted XSAVE format.
+		 * In non-compacted format disabled features still occupy
+		 * state space, but there is no state to copy from in the
+		 * compacted init_fpstate. The gap tracking will zero this
+		 * later.
 		 */
-		if ((header.xfeatures >> i) & 1) {
-			void *src = __raw_xsave_addr(xsave, i);
+		if (!(xfeatures_mask_user() & BIT_ULL(i)))
+			continue;
 
-			copy_part(&to, &last, xstate_offsets[i],
-				  xstate_sizes[i], src);
-		}
+		/*
+		 * If there was a feature or alignment gap, zero the space
+		 * in the destination buffer.
+		 */
+		if (zerofrom < xstate_offsets[i])
+			membuf_zero(&to, xstate_offsets[i] - zerofrom);
+
+		copy_feature(header.xfeatures & BIT_ULL(i), &to,
+			     __raw_xsave_addr(xsave, i),
+			     __raw_xsave_addr(xinit, i),
+			     xstate_sizes[i]);
 
+		/*
+		 * Keep track of the last copied state in the non-compacted
+		 * target buffer for gap zeroing.
+		 */
+		zerofrom = xstate_offsets[i] + xstate_sizes[i];
 	}
-	fill_gap(&to, &last, size);
+
+	if (to.left)
+		membuf_zero(&to, to.left);
 }
 
 /*

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* [PATCH] x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again
  2021-06-23 12:01 ` [patch V4 15/65] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
@ 2021-06-24 15:09   ` Thomas Gleixner
  2021-06-24 15:41     ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
  1 sibling, 1 reply; 142+ messages in thread
From: Thomas Gleixner @ 2021-06-24 15:09 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang, Chang Seok Bae, Megha Dey, Oliver Sang

The change which made copy_xstate_to_uabi_buf() usable for
[x]fpregs_get() removed the zeroing of the header which means the
header, which is copied to user space later, contains except for the
xfeatures member random stack content.

Add the memset() back to zero it before usage.

Fixes: eb6f51723f03 ("x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
Changelog summary: I'm a moron.
---
 arch/x86/kernel/fpu/xstate.c |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -982,6 +982,7 @@ void copy_xstate_to_uabi_buf(struct memb
 	unsigned int zerofrom;
 	int i;
 
+	memset(&header, 0, sizeof(header));
 	header.xfeatures = xsave->header.xfeatures;
 
 	/* Mask out the feature bits depending on copy mode */

^ permalink raw reply	[flat|nested] 142+ messages in thread

* [tip: x86/fpu] x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again
  2021-06-24 15:09   ` [PATCH] x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again Thomas Gleixner
@ 2021-06-24 15:41     ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 142+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-24 15:41 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kernel test robot, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     93c2cdc975aab53c222472c5b96c2d41dbeb350c
Gitweb:        https://git.kernel.org/tip/93c2cdc975aab53c222472c5b96c2d41dbeb350c
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 24 Jun 2021 17:09:18 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 24 Jun 2021 17:19:51 +02:00

x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again

The change which made copy_xstate_to_uabi_buf() usable for
[x]fpregs_get() removed the zeroing of the header which means the
header, which is copied to user space later, contains except for the
xfeatures member, random stack content.

Add the memset() back to zero it before usage.

Fixes: eb6f51723f03 ("x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/875yy3wb8h.ffs@nanos.tec.linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 21a10a6..c8def1b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -982,6 +982,7 @@ void copy_xstate_to_uabi_buf(struct membuf to, struct task_struct *tsk,
 	unsigned int zerofrom;
 	int i;
 
+	memset(&header, 0, sizeof(header));
 	header.xfeatures = xsave->header.xfeatures;
 
 	/* Mask out the feature bits depending on copy mode */

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* Re: [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing
  2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (64 preceding siblings ...)
  2021-06-23 12:02 ` [patch V4 65/65] x86/fpu/signal: Let xrstor handle the features to init Thomas Gleixner
@ 2021-06-25 14:50 ` Oliver Sang
  65 siblings, 0 replies; 142+ messages in thread
From: Oliver Sang @ 2021-06-25 14:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov,
	Peter Zijlstra, Kan Liang, Chang Seok Bae, Megha Dey, aubrey.li,
	zhengjun.xing, feng.tang, yujie.liu, beibei.si, philip.li,
	julie.du

Hi Thomas,

On Wed, Jun 23, 2021 at 02:01:27PM +0200, Thomas Gleixner wrote:
> The main parts of this series are:
> 
>   - Simplification and removal/replacement of redundant and/or
>     overengineered code.
> 
>   - Name space cleanup as the existing names were just a permanent source
>     of confusion.
> 
>   - Clear seperation of user ABI and kernel internal state handling.
> 
>   - Removal of PKRU from being XSTATE managed in the kernel because PKRU
>     has to be eagerly restored on context switch and keeping it in sync
>     in the xstate buffer is just pointless overhead and fragile.
> 
>     The kernel still XSAVEs PKRU on context switch but the value in the
>     buffer is not longer used and never restored from the buffer.
> 
>     This still needs to be cleaned up, but the series is already 40+
>     patches large and the cleanup of this is not a functional problem.
> 
>     The functional issues of PKRU management are fully addressed with the
>     series as is.
> 
>   - Cleanup of fpu signal restore
> 
>     - Make the fast path self contained. Handle #PF directly and skip
>       the slow path on any other exception as that will just end up
>       with the same result that the frame is invalid. This allows
>       the compiler to optimize the slow path out for 64bit kernels
>       w/o ia32 emulation.
> 
>     - Reduce code duplication and unnecessary operations
> 
> It applies on top of
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
> 
> and is also available via git:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu

0-Day kernel CI tested this branch from performance view, choosing some sub-tests
from will-it-scale and stress-ng (detail as below)

Test Summary
============
no obvious performance changes found vs v5.13-rc7 so far

Test Environment
================
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git/log/?h=x86/fpu
* a263e30a4f111 (tglx-devel/x86/fpu) x86/fpu/signal: Let xrstor handle the features to init
* cf917e53a97e9 x86/fpu/signal: Handle #PF in the direct restore path
* 7b269dff0be66 x86/fpu: Return proper error codes from user access functions
* d2d6be9e16386 x86/fpu/signal: Split out the direct restore code
...
* d3ca29fb9911f x86/fpu: Fix copy_xstate_to_kernel() gap handling
*   58a664af246e6 Merge branch 'x86/fpu' of ../tip into x86/fpu
|\
| * b7c11876d24bd (tip/x86/fpu, peterz-queue/x86/fpu) selftests/x86: Test signal frame XSTATE header corruption handling
...
* | 13311e74253fe (tag: v5.13-rc7


64bit kernel testing, upon below platforms:
(1)
model: Cascade Lake
Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz
nr_node: 2
nr_cpu: 88
memory: 128G
(2)
model: Ice Lake
nr_node: 2
nr_cpu: 96
memory: 256G


32bit kernel testing, upon below platform:
Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz
model: Ivy Bridge
nr_node: 1
nr_cpu: 8
memory: 16G


tested below test suites:
will-it-scale-performance-context_switch1
will-it-scale-performance-page_fault1
will-it-scale-performance-poll1
will-it-scale-performance-pthread_mutex1
will-it-scale-performance-writeseek1

regarding stress-ng, either no gap or data is unstable on
64bit (1) platform

commit:                                                                                                          
  v5.13-rc7                                                                                                      
  a263e30a4f111e13a6c859102d1b319a79b072c0                                                                       
                                                                                                                 
       v5.13-rc7 a263e30a4f111e13a6c859102d1                                                                     
---------------- ---------------------------                                                                     
         %stddev     %change         %stddev
             \          |                \
 2.786e+08 ±  2%      -1.3%  2.751e+08        stress-ng.switch.ops
   4643776 ±  2%      -1.3%    4585061        stress-ng.switch.ops_per_sec
--
  25884344 ± 15%      +5.3%   27252860 ± 12%  stress-ng.nanosleep.ops
    431397 ± 15%      +5.3%     454204 ± 12%  stress-ng.nanosleep.ops_per_sec
--
  5.48e+09 ±  2%      +4.3%  5.713e+09 ±  3%  stress-ng.yield.ops
  91326562 ±  2%      +4.3%   95215138 ±  3%  stress-ng.yield.ops_per_sec
--
   6723205 ±  3%      -2.0%    6585690 ±  9%  stress-ng.tee.ops
    112049 ±  3%      -2.0%     109755 ±  9%  stress-ng.tee.ops_per_sec
--
   3078530            -0.2%    3072610        stress-ng.zombie.ops
     49390            -0.2%      49295        stress-ng.zombie.ops_per_sec
--
   2261601            -0.1%    2259339        stress-ng.cyclic.ops
     37693            -0.1%      37655        stress-ng.cyclic.ops_per_sec
--
 1.543e+09 ± 16%      +0.4%   1.55e+09 ± 19%  stress-ng.fifo.ops
  25721633 ± 16%      +0.4%   25831339 ± 19%  stress-ng.fifo.ops_per_sec
--
    192171            +0.6%     193383        stress-ng.clone.ops
      2998            +0.6%       3016        stress-ng.clone.ops_per_sec
--
  88703410            -0.5%   88257268        stress-ng.sleep.ops
   1470841            -0.5%    1463509        stress-ng.sleep.ops_per_sec
--
    401598            +0.1%     401990        stress-ng.netlink-proc.ops
      6692            +0.1%       6699        stress-ng.netlink-proc.ops_per_sec
--
 1.182e+08            +2.5%  1.212e+08        stress-ng.softlockup.ops
   1969636            +2.5%    2019730        stress-ng.softlockup.ops_per_sec
--
 1.866e+08            -0.2%  1.863e+08        stress-ng.fanotify.ops
   4754116 ± 12%      -2.8%    4620280 ±  5%  stress-ng.fanotify.ops_per_sec
--
  17926280            -0.3%   17866738        stress-ng.vforkmany.ops
    286256            +0.4%     287262        stress-ng.vforkmany.ops_per_sec
--
 1.381e+09 ± 13%      +8.7%  1.501e+09 ± 19%  stress-ng.fifo.ops
  23013090 ± 13%      +8.7%   25020209 ± 19%  stress-ng.fifo.ops_per_sec
--
 1.165e+08 ±  2%      -2.0%  1.141e+08 ±  3%  stress-ng.splice.ops
   1941398 ±  2%      -2.0%    1901755 ±  3%  stress-ng.splice.ops_per_sec
--
   5476091            -0.2%    5466962        stress-ng.vfork.ops
     91262            -0.2%      91116        stress-ng.vfork.ops_per_sec
--
  21878786            -1.4%   21574984        stress-ng.netlink-task.ops
    364646            -1.4%     359582        stress-ng.netlink-task.ops_per_sec
--
   5023286            +1.6%    5101458        stress-ng.pthread.ops
     83554            +1.6%      84869        stress-ng.pthread.ops_per_sec
--
  92688395 ±  2%      -8.3%   84967931 ±  4%  stress-ng.futex.ops
   1544792 ±  2%      -8.3%    1416117 ±  4%  stress-ng.futex.ops_per_sec
--
   5416582 ±  2%      +1.6%    5504734 ±  5%  stress-ng.wait.ops
     90275 ±  2%      +1.6%      91744 ±  5%  stress-ng.wait.ops_per_sec
--
 7.939e+08 ±  4%      +0.7%  7.998e+08        stress-ng.mq.ops
  13231931 ±  4%      +0.7%   13329018        stress-ng.mq.ops_per_sec
--
 3.936e+08 ± 53%     +51.5%  5.965e+08 ± 37%  stress-ng.hrtimers.ops
   6416615 ± 52%     +49.8%    9609730 ± 38%  stress-ng.hrtimers.ops_per_sec
--
  35268682            -1.1%   34872339        stress-ng.sendfile.ops
    587805            -1.1%     581200        stress-ng.sendfile.ops_per_sec
--
  18126587            -0.7%   18000304 ±  3%  stress-ng.sem-sysv.ops
    302108            -0.7%     300004 ±  3%  stress-ng.sem-sysv.ops_per_sec
--
 9.953e+08 ±  2%      -0.4%  9.912e+08        stress-ng.pipe.ops
  16589030 ±  2%      -0.4%   16519202        stress-ng.pipe.ops_per_sec
--
   1056000            +0.0%    1056000        stress-ng.schedpolicy.ops
     17599            +0.0%      17599        stress-ng.schedpolicy.ops_per_sec
--
 2.481e+09            +1.2%   2.51e+09        stress-ng.vm-splice.ops
  41349053            +1.2%   41828864        stress-ng.vm-splice.ops_per_sec
--
  56805932 ±  2%      +1.1%   57451245 ±  2%  stress-ng.eventfd.ops
    946751 ±  2%      +1.1%     957505 ±  2%  stress-ng.eventfd.ops_per_sec
--
      1605            -1.2%       1585 ±  2%  stress-ng.mmapfork.ops
     25.18 ±  4%      -0.8%      24.98 ±  4%  stress-ng.mmapfork.ops_per_sec
--
   2549139            +1.2%    2578537        stress-ng.fork.ops
     42485            +1.2%      42975        stress-ng.fork.ops_per_sec
--
  16582541 ±  3%     -10.8%   14792067 ± 21%  stress-ng.close.ops
    276357 ±  3%     -10.8%     246515 ± 21%  stress-ng.close.ops_per_sec
--
 1.847e+09            -1.3%  1.822e+09        stress-ng.pipeherd.ops
  30743202            -1.3%   30330538        stress-ng.pipeherd.ops_per_sec
--
   6780757 ±  2%      +4.3%    7070424        stress-ng.tee.ops
    113008 ±  2%      +4.3%     117835        stress-ng.tee.ops_per_sec
--
   3421466            +0.0%    3422853        stress-ng.dnotify.ops
     57024            +0.0%      57047        stress-ng.dnotify.ops_per_sec
--
    749895            -0.9%     742801        stress-ng.kill.ops
     12498            -0.9%      12379        stress-ng.kill.ops_per_sec
--
 3.761e+08            -2.5%  3.668e+08        stress-ng.msg.ops
   6266833            -2.5%    6112949        stress-ng.msg.ops_per_sec
--
   2258479            +0.1%    2260255        stress-ng.daemon.ops
     37641            +0.1%      37670        stress-ng.daemon.ops_per_sec
--
  40560445            -0.8%   40239283        stress-ng.affinity.ops
    668138            -1.6%     657142        stress-ng.affinity.ops_per_sec
--
     21230 ±  4%      -0.9%      21047 ±  2%  stress-ng.inotify.ops
    353.14 ±  4%      -0.9%     349.88 ±  2%  stress-ng.inotify.ops_per_sec
--
  23250444 ±  2%      +1.6%   23627954        stress-ng.fault.ops
    387495 ±  2%      +1.6%     393795        stress-ng.fault.ops_per_sec
--
  3.94e+08            -0.2%  3.933e+08        stress-ng.sem.ops
   6567438            -0.2%    6555371        stress-ng.sem.ops_per_sec
--
    698544            -0.0%     698480        stress-ng.session.ops
     11494            +0.0%      11495        stress-ng.session.ops_per_sec

> 
> This is a follow up to V3 which can be found here:
> 
>      https://lore.kernel.org/r/20210618141823.161158090@linutronix.de
> 
> Changes vs. V3:
> 
>   - Dropped the two bugfixes which are applied already and rebased on top
> 
>   - Addressed review comments (Andy, Boris)
> 
>     Patches: 13, 35, 36, 37, 46, 58, 62, 63
> 
>   - Fixed the math-emu fallout which I had stashed safely on the 32bit
>     testbox (Boris)
> 
>     Patch: 28
> 
>   - Picked up tags
> 
> Thanks to everyone for review, feedback and testing (various teams
> @Intel).
> 
> Note: I've not picked up any tested-by tags. It would be nice to have
> them on this hopefully final version.
> 
> Thanks,
> 
> 	tglx
> ---
>  arch/x86/events/intel/lbr.c          |    6 
>  arch/x86/include/asm/fpu/internal.h  |  202 ++++------
>  arch/x86/include/asm/fpu/xstate.h    |   78 +++-
>  arch/x86/include/asm/pgtable.h       |   57 ---
>  arch/x86/include/asm/pkeys.h         |    9 
>  arch/x86/include/asm/pkru.h          |   62 +++
>  arch/x86/include/asm/processor.h     |    9 
>  arch/x86/include/asm/special_insns.h |   14 
>  arch/x86/kernel/cpu/common.c         |   34 -
>  arch/x86/kernel/fpu/core.c           |  282 +++++++--------
>  arch/x86/kernel/fpu/init.c           |   15 
>  arch/x86/kernel/fpu/regset.c         |  223 ++++++------
>  arch/x86/kernel/fpu/signal.c         |  419 ++++++++++------------
>  arch/x86/kernel/fpu/xstate.c         |  645 +++++++++++++----------------------
>  arch/x86/kernel/process.c            |   22 +
>  arch/x86/kernel/process_64.c         |   28 +
>  arch/x86/kernel/traps.c              |    5 
>  arch/x86/kvm/svm/sev.c               |    1 
>  arch/x86/kvm/x86.c                   |   56 +--
>  arch/x86/math-emu/fpu_proto.h        |    2 
>  arch/x86/math-emu/load_store.c       |    2 
>  arch/x86/math-emu/reg_ld_str.c       |    2 
>  arch/x86/mm/extable.c                |    2 
>  arch/x86/mm/fault.c                  |    2 
>  arch/x86/mm/pkeys.c                  |   22 -
>  include/linux/pkeys.h                |    4 
>  26 files changed, 1037 insertions(+), 1166 deletions(-)
> 
> 

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2021-06-23 12:01 ` [patch V4 09/65] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
  2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
@ 2022-07-14  4:04   ` Andrei Vagin
  2022-07-25 17:47     ` Dave Hansen
  1 sibling, 1 reply; 142+ messages in thread
From: Andrei Vagin @ 2022-07-14  4:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov,
	Peter Zijlstra, Kan Liang, Chang Seok Bae, Megha Dey,
	Oliver Sang

On Wed, Jun 23, 2021 at 5:24 AM Thomas Gleixner <tglx@linutronix.de> wrote:
....
>
> -       /*
> -        * mxcsr reserved bits must be masked to zero for security reasons.
> -        */
> -       xsave->i387.mxcsr &= mxcsr_feature_mask;
> -
> -       /*
> -        * In case of failure, mark all states as init:
> -        */
> -       if (ret)
> -               fpstate_init(&fpu->state);
> +       fpu__prepare_write(fpu);
> +       ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);

This change breaks gVisor. Now, when we set a new fpu state without
fp,sse and ymm
via ptrace, mxcsr isn't reset to the default value. The issue is
reproduced only on hosts
without xsaves. On hosts with xsaves, it works as expected.

Thanks,
Andrei

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2022-07-14  4:04   ` [patch V4 09/65] " Andrei Vagin
@ 2022-07-25 17:47     ` Dave Hansen
  2022-07-25 17:57       ` Andrei Vagin
  0 siblings, 1 reply; 142+ messages in thread
From: Dave Hansen @ 2022-07-25 17:47 UTC (permalink / raw)
  To: Andrei Vagin, Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov,
	Peter Zijlstra, Kan Liang, Chang Seok Bae, Megha Dey,
	Oliver Sang

On 7/13/22 21:04, Andrei Vagin wrote:
>> -       /*
>> -        * mxcsr reserved bits must be masked to zero for security reasons.
>> -        */
>> -       xsave->i387.mxcsr &= mxcsr_feature_mask;
>> -
>> -       /*
>> -        * In case of failure, mark all states as init:
>> -        */
>> -       if (ret)
>> -               fpstate_init(&fpu->state);
>> +       fpu__prepare_write(fpu);
>> +       ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
> This change breaks gVisor. Now, when we set a new fpu state without 
> fp,sse and ymm via ptrace, mxcsr isn't reset to the default value.
> The issue is reproduced only on hosts without xsaves. On hosts with
> xsaves, it works as expected.
Is gVisor some out-of-tree kernel code or is it just an proprietary KVM
user?  In other words, is this an issue with the upstream kernel itself
or does it require kernel modification?

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2022-07-25 17:47     ` Dave Hansen
@ 2022-07-25 17:57       ` Andrei Vagin
  2022-07-25 21:26         ` Dave Hansen
  0 siblings, 1 reply; 142+ messages in thread
From: Andrei Vagin @ 2022-07-25 17:57 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Thomas Gleixner, LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu,
	Tony Luck, Yu-cheng Yu, Sebastian Andrzej Siewior,
	Borislav Petkov, Peter Zijlstra, Kan Liang, Chang Seok Bae,
	Megha Dey, Oliver Sang

On Mon, Jul 25, 2022 at 10:47 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 7/13/22 21:04, Andrei Vagin wrote:
> >> -       /*
> >> -        * mxcsr reserved bits must be masked to zero for security reasons.
> >> -        */
> >> -       xsave->i387.mxcsr &= mxcsr_feature_mask;
> >> -
> >> -       /*
> >> -        * In case of failure, mark all states as init:
> >> -        */
> >> -       if (ret)
> >> -               fpstate_init(&fpu->state);
> >> +       fpu__prepare_write(fpu);
> >> +       ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
> > This change breaks gVisor. Now, when we set a new fpu state without
> > fp,sse and ymm via ptrace, mxcsr isn't reset to the default value.
> > The issue is reproduced only on hosts without xsaves. On hosts with
> > xsaves, it works as expected.
> Is gVisor some out-of-tree kernel code or is it just an proprietary KVM
> user?  In other words, is this an issue with the upstream kernel itself
> or does it require kernel modification?

This is on the upstream kernel without any modifications. And this is the case
when gVisor uses ptrace to trap syscalls. KVM isn't used in this case.

Thanks,
Andrei

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2022-07-25 17:57       ` Andrei Vagin
@ 2022-07-25 21:26         ` Dave Hansen
  2022-07-28 23:32           ` Chang S. Bae
  0 siblings, 1 reply; 142+ messages in thread
From: Dave Hansen @ 2022-07-25 21:26 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: Thomas Gleixner, LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu,
	Tony Luck, Yu-cheng Yu, Sebastian Andrzej Siewior,
	Borislav Petkov, Peter Zijlstra, Kan Liang, Chang Seok Bae,
	Megha Dey, Oliver Sang

On 7/25/22 10:57, Andrei Vagin wrote:
> On Mon, Jul 25, 2022 at 10:47 AM Dave Hansen <dave.hansen@intel.com> wrote:
>> On 7/13/22 21:04, Andrei Vagin wrote:
>>>> -       /*
>>>> -        * mxcsr reserved bits must be masked to zero for security reasons.
>>>> -        */
>>>> -       xsave->i387.mxcsr &= mxcsr_feature_mask;
>>>> -
>>>> -       /*
>>>> -        * In case of failure, mark all states as init:
>>>> -        */
>>>> -       if (ret)
>>>> -               fpstate_init(&fpu->state);
>>>> +       fpu__prepare_write(fpu);
>>>> +       ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
>>> This change breaks gVisor. Now, when we set a new fpu state without
>>> fp,sse and ymm via ptrace, mxcsr isn't reset to the default value.
>>> The issue is reproduced only on hosts without xsaves. On hosts with
>>> xsaves, it works as expected.
>> Is gVisor some out-of-tree kernel code or is it just an proprietary KVM
>> user?  In other words, is this an issue with the upstream kernel itself
>> or does it require kernel modification?
> This is on the upstream kernel without any modifications. And this is the case
> when gVisor uses ptrace to trap syscalls. KVM isn't used in this case.

Do you happen to have a quick reproducer for this, or at least the
contents of the buffer that you are trying to restore?

I'm struggling to if we broke a meaningful part of the ABI, or if this
is fallout from the kernel being more hesitant to zap the whole buffer
when it has problems.

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2022-07-25 21:26         ` Dave Hansen
@ 2022-07-28 23:32           ` Chang S. Bae
  2022-08-05 12:12             ` Andrei Vagin
  0 siblings, 1 reply; 142+ messages in thread
From: Chang S. Bae @ 2022-07-28 23:32 UTC (permalink / raw)
  To: Dave Hansen, Andrei Vagin
  Cc: Thomas Gleixner, LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu,
	Tony Luck, Yu-cheng Yu, Sebastian Andrzej Siewior,
	Borislav Petkov, Peter Zijlstra, Kan Liang, Megha Dey,
	Oliver Sang

On 7/25/2022 2:26 PM, Dave Hansen wrote:
> 
> Do you happen to have a quick reproducer for this, or at least the
> contents of the buffer that you are trying to restore?

While not following this report, I think there is a regression along 
with the changes:

As looking into the spec, this state load does not depend on XSTATE_BV:

      RFBM := XCR0 AND EDX:EAX;
      COMPMASK := XCOMP_BV field from XSAVE header;

      IF COMPMASK[63] = 0
          THEN
          ...
          IF RFBM[1] = 1 OR RFBM[2] = 1
              THEN load MXCSR from legacy region of XSAVE area;
          FI;
          ...
      ELSE
      ...

But our upstream code does reference XSTATE_BV instead of RFBM [1,2].

My test case [3] fails with the upstream but works with 5.13, which is 
before the series. Then, this change looks to make it work at least for it:

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8340156bfd2..db4ab5c7ba8b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1094,7 +1094,7 @@ void __copy_xstate_to_uabi_buf(struct membuf to,
struct fpstate *fpstate,
                       &xinit->i387, off_mxcsr);

          /* Copy MXCSR when SSE or YMM are set in the feature mask */
-       copy_feature(header.xfeatures & (XFEATURE_MASK_SSE |
XFEATURE_MASK_YMM),
+       copy_feature(fpstate->user_xfeatures & (XFEATURE_MASK_SSE |
XFEATURE_MASK_YMM),
                       &to, &xsave->i387.mxcsr, &xinit->i387.mxcsr,
                       MXCSR_AND_FLAGS_SIZE);

@@ -1214,7 +1214,7 @@ static int copy_uabi_to_xstate(struct fpstate
*fpstate, const void *kbuf,

          /* Validate MXCSR when any of the related features is in use */
          mask = XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM;
-       if (hdr.xfeatures & mask) {
+       if (fpstate->user_xfeatures & mask) {
                  u32 mxcsr[2];

                  offset = offsetof(struct fxregs_state, mxcsr);
@@ -1226,7 +1226,7 @@ static int copy_uabi_to_xstate(struct fpstate
*fpstate, const void *kbuf,
                          return -EINVAL;

                  /* SSE and YMM require MXCSR even when FP is not in 
use. */
-               if (!(hdr.xfeatures & XFEATURE_MASK_FP)) {
+               if (fpstate->user_xfeatures & (XFEATURE_MASK_SSE |
XFEATURE_MASK_YMM)) {
                          xsave->i387.mxcsr = mxcsr[0];
                          xsave->i387.mxcsr_mask = mxcsr[1];
                  }

Thanks,
Chang

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/fpu/xstate.c#n1097
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/fpu/xstate.c#n1217
[3] test case:

#include <err.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <x86intrin.h>

#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <sys/uio.h>

#define PAGE_SIZE       4096

typedef uint8_t         u8;
typedef uint16_t        u16;
typedef uint32_t        u32;
typedef uint64_t        u64;

/* The below struct and define are copied from 
arch/x86/include/asm/fpu/types.h */

struct fxregs_state {
         u16                     cwd; /* Control Word                    */
         u16                     swd; /* Status Word                     */
         u16                     twd; /* Tag Word                        */
         u16                     fop; /* Last Instruction Opcode         */
         union {
                 struct {
                         u64     rip; /* Instruction Pointer             */
                         u64     rdp; /* Data Pointer                    */
                 };
                 struct {
                         u32     fip; /* FPU IP Offset                   */
                         u32     fcs; /* FPU IP Selector                 */
                         u32     foo; /* FPU Operand Offset              */
                         u32     fos; /* FPU Operand Selector            */
                 };
         };
         u32                     mxcsr;          /* MXCSR Register State */
         u32                     mxcsr_mask;     /* MXCSR Mask           */

         /* 8*16 bytes for each FP-reg = 128 bytes:                      */
         u32                     st_space[32];

         /* 16*16 bytes for each XMM-reg = 256 bytes:                    */
         u32                     xmm_space[64];

         u32                     padding[12];

         union {
                 u32             padding1[12];
                 u32             sw_reserved[12];
         };

} __attribute__((aligned(16)));

struct xstate_header {
         u64                             xfeatures;
         u64                             xcomp_bv;
         u64                             reserved[6];
} __attribute__((packed));

struct xregs_state {
         struct fxregs_state             i387;
         struct xstate_header            header;
         u8                              extended_state_area[0];
} __attribute__ ((packed, aligned (64)));

union fpregs_state {
         struct xregs_state              xsave;
         u8 __padding[PAGE_SIZE];
};

/*
  * List of XSAVE features Linux knows about:
  */
enum xfeature {
         XFEATURE_FP,
         XFEATURE_SSE,
};

#define XFEATURE_MASK_SSE               (1 << XFEATURE_SSE)

/* Default value for fxregs_state.mxcsr: */
#define MXCSR_DEFAULT           0x1f80

void main(void)
{
         union fpregs_state *xbuf;
         struct iovec iov;
         pid_t child;
         int status;
         u32 mxcsr;

         xbuf = aligned_alloc(64, sizeof(union fpregs_state));
         if (!xbuf)
                 err(1, "aligned_alloc()");
         memset(xbuf, 0, sizeof(union fpregs_state));

         iov.iov_base = xbuf;
         iov.iov_len = sizeof(union fpregs_state);

         child = fork();
         if (!child) {
                 if (ptrace(PTRACE_TRACEME, 0, NULL, NULL))
                         err(1, "PTRACE_TRACEME");

                 raise(SIGTRAP);
                 _exit(0);
         }

         do {
                 wait(&status);
         } while (WSTOPSIG(status) != SIGTRAP);

         printf("[RUN]\tCheck the default MXCSR state.\n");

         if (ptrace(PTRACE_GETREGSET, child, (uint32_t)NT_X86_XSTATE, &iov))
                 err(1, "PTRACE_GETREGSET");

         printf("[%svalid init value.\n",
                (xbuf->xsave.i387.mxcsr == MXCSR_DEFAULT) ?
                "OK]\twith the " : "FAIL]\twith an in");

         printf("[RUN]\tTest MXCSR state write.\n");
         xbuf->xsave.i387.mxcsr = 0;
         /* the MXCSR state should be loaded regardless of XSTATE_BV */
         xbuf->xsave.header.xfeatures = 0;

         if (ptrace(PTRACE_SETREGSET, child, (uint32_t)NT_X86_XSTATE, &iov))
                 err(1, "PTRACE_SETREGSET");

	/* ditch the MXCSR state */
         xbuf->xsave.i387.mxcsr = 0xff;
         xbuf->xsave.i387.mxcsr_mask = 0xff;

         if (ptrace(PTRACE_GETREGSET, child, (uint32_t)NT_X86_XSTATE, &iov))
                 err(1, "PTRACE_GETREGSET");

         printf("[%s]\tthe state was %swritten correctly\n",
                !xbuf->xsave.i387.mxcsr ? "OK" : "FAIL",
                !xbuf->xsave.i387.mxcsr ? "" : "not ");

         ptrace(PTRACE_DETACH, child, NULL, NULL);
         wait(&status);
         if (!WIFEXITED(status) || WEXITSTATUS(status))
                 err(1, "PTRACE_DETACH");

         free(xbuf);
}

^ permalink raw reply related	[flat|nested] 142+ messages in thread

* Re: [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2022-07-28 23:32           ` Chang S. Bae
@ 2022-08-05 12:12             ` Andrei Vagin
  2022-08-05 18:24               ` Chang S. Bae
  0 siblings, 1 reply; 142+ messages in thread
From: Andrei Vagin @ 2022-08-05 12:12 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: Dave Hansen, Thomas Gleixner, LKML, Andy Lutomirski, Dave Hansen,
	Fenghua Yu, Tony Luck, Yu-cheng Yu, Sebastian Andrzej Siewior,
	Borislav Petkov, Peter Zijlstra, Kan Liang, Megha Dey,
	Oliver Sang

On Thu, Jul 28, 2022 at 4:32 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> On 7/25/2022 2:26 PM, Dave Hansen wrote:
> >
> > Do you happen to have a quick reproducer for this, or at least the
> > contents of the buffer that you are trying to restore?
>
> While not following this report, I think there is a regression along
> with the changes:
>
> As looking into the spec, this state load does not depend on XSTATE_BV:
>
>       RFBM := XCR0 AND EDX:EAX;
>       COMPMASK := XCOMP_BV field from XSAVE header;
>
>       IF COMPMASK[63] = 0
>           THEN
>           ...
>           IF RFBM[1] = 1 OR RFBM[2] = 1
>               THEN load MXCSR from legacy region of XSAVE area;
>           FI;
>           ...
>       ELSE
>       ...
>
> But our upstream code does reference XSTATE_BV instead of RFBM [1,2].
>
> My test case [3] fails with the upstream but works with 5.13, which is
> before the series. Then, this change looks to make it work at least for it:

gVisor test passes with this change too. Chang, are you going to send a patch?

Thanks,
Andrei

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2022-08-05 12:12             ` Andrei Vagin
@ 2022-08-05 18:24               ` Chang S. Bae
  2022-08-05 18:35                 ` Dave Hansen
  0 siblings, 1 reply; 142+ messages in thread
From: Chang S. Bae @ 2022-08-05 18:24 UTC (permalink / raw)
  To: Andrei Vagin
  Cc: Dave Hansen, Thomas Gleixner, LKML, Andy Lutomirski, Dave Hansen,
	Fenghua Yu, Tony Luck, Yu-cheng Yu, Sebastian Andrzej Siewior,
	Borislav Petkov, Peter Zijlstra, Kan Liang, Megha Dey,
	Oliver Sang

On 8/5/2022 5:12 AM, Andrei Vagin wrote:
> gVisor test passes with this change too. Chang, are you going to send a patch?

Oh, sounds good then.

Will post patches. But I guess I need to revise them a bit.

Thanks,
Chang

^ permalink raw reply	[flat|nested] 142+ messages in thread

* Re: [patch V4 09/65] x86/fpu: Sanitize xstateregs_set()
  2022-08-05 18:24               ` Chang S. Bae
@ 2022-08-05 18:35                 ` Dave Hansen
  0 siblings, 0 replies; 142+ messages in thread
From: Dave Hansen @ 2022-08-05 18:35 UTC (permalink / raw)
  To: Chang S. Bae, Andrei Vagin
  Cc: Thomas Gleixner, LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu,
	Tony Luck, Yu-cheng Yu, Sebastian Andrzej Siewior,
	Borislav Petkov, Peter Zijlstra, Kan Liang, Megha Dey,
	Oliver Sang

On 8/5/22 11:24, Chang S. Bae wrote:
> On 8/5/2022 5:12 AM, Andrei Vagin wrote:
>> gVisor test passes with this change too. Chang, are you going to send
>> a patch?
> 
> Oh, sounds good then.
> 
> Will post patches. But I guess I need to revise them a bit.

I've also got a series that Andy, Thomas and I were poking at last week.
 I'll try and get it out the door today.

^ permalink raw reply	[flat|nested] 142+ messages in thread

end of thread, other threads:[~2022-08-05 18:35 UTC | newest]

Thread overview: 142+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-23 12:01 [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
2021-06-23 12:01 ` [patch V4 01/65] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 02/65] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 03/65] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] x86/fpu: Mark various FPU state variables __ro_after_init tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 04/65] x86/fpu: Make xfeatures_mask_all __ro_after_init Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 05/65] x86/fpu: Get rid of fpu__get_supported_xfeatures_mask() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 06/65] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 07/65] x86/fpu: Move inlines where they belong Thomas Gleixner
2021-06-23 18:43   ` Bae, Chang Seok
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 08/65] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 09/65] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2022-07-14  4:04   ` [patch V4 09/65] " Andrei Vagin
2022-07-25 17:47     ` Dave Hansen
2022-07-25 17:57       ` Andrei Vagin
2022-07-25 21:26         ` Dave Hansen
2022-07-28 23:32           ` Chang S. Bae
2022-08-05 12:12             ` Andrei Vagin
2022-08-05 18:24               ` Chang S. Bae
2022-08-05 18:35                 ` Dave Hansen
2021-06-23 12:01 ` [patch V4 10/65] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 11/65] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
2021-06-23 12:01 ` [patch V4 12/65] x86/fpu: Rewrite xfpregs_set() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
2021-06-23 12:01 ` [patch V4 13/65] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
2021-06-23 12:01 ` [patch V4 14/65] x86/fpu: Clean up fpregs_set() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
2021-06-23 12:01 ` [patch V4 15/65] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-24 15:09   ` [PATCH] x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again Thomas Gleixner
2021-06-24 15:41     ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 16/65] x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 17/65] x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 18/65] x86/fpu: Remove fpstate_sanitize_xstate() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 19/65] x86/fpu/regset: Move fpu__read_begin() into regset Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 20/65] x86/fpu: Move fpu__write_begin() to regset Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 21/65] x86/fpu: Get rid of using_compacted_format() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 22/65] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
2021-06-23 12:01 ` [patch V4 23/65] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 24/65] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 25/65] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 26/65] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 27/65] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] x86/fpu: Rename fxregs-related " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 28/65] x86/math-emu: Rename frstor() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 29/65] x86/fpu: Rename fregs related copy functions Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] x86/fpu: Rename fregs-related " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 30/65] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 31/65] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:01 ` [patch V4 32/65] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 33/65] x86/fpu: Get rid of the FNSAVE optimization Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 34/65] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_fpstate() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 35/65] x86/fpu: Rename initstate copy functions Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 36/65] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
2021-06-23 12:02 ` [patch V4 37/65] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 38/65] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
2021-06-23 12:02 ` [patch V4 39/65] x86/fpu: Rename and sanitize fpu__save/copy() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 40/65] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 41/65] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 42/65] x86/pkru: Provide pkru_write_default() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 43/65] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 44/65] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 45/65] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 46/65] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
2021-06-23 12:02 ` [patch V4 47/65] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 48/65] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 49/65] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 50/65] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 51/65] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
2021-06-23 22:09   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
2021-06-23 12:02 ` [patch V4 52/65] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Dave Hansen
2021-06-23 12:02 ` [patch V4 53/65] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 54/65] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 55/65] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] x86/fpu: Don't " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 56/65] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 57/65] x86/fpu: Mark init_fpstate __ro_after_init Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 58/65] x86/fpu/signal: Move initial checks into fpu__sig_restore() Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] x86/fpu/signal: Move initial checks into fpu__restore_sig() tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 59/65] x86/fpu/signal: Remove the legacy alignment check Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 60/65] x86/fpu/signal: Sanitize the xstate check on sigframe Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 61/65] x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing() Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 62/65] x86/fpu/signal: Split out the direct restore code Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 63/65] x86/fpu: Return proper error codes from user access functions Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 64/65] x86/fpu/signal: Handle #PF in the direct restore path Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-23 12:02 ` [patch V4 65/65] x86/fpu/signal: Let xrstor handle the features to init Thomas Gleixner
2021-06-23 22:08   ` [tip: x86/fpu] " tip-bot2 for Thomas Gleixner
2021-06-25 14:50 ` [patch V4 00/65] x86/fpu: Spring cleaning and PKRU sanitizing Oliver Sang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.