* [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing
@ 2021-06-14 15:44 Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 01/52] x86/fpu: Make init_fpstate correct with optimized XSAVE Thomas Gleixner
                   ` (53 more replies)
  0 siblings, 54 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The main parts of this series are:

  - Yet more bug fixes

  - Simplification and removal/replacement of redundant and/or
    overengineered code.

  - Name space cleanup as the existing names were just a permanent source
    of confusion.

  - Clear separation of user ABI and kernel internal state handling.

  - Removal of PKRU from being XSTATE managed in the kernel because PKRU
    has to be eagerly restored on context switch and keeping it in sync
    in the xstate buffer is just pointless overhead and fragile.

    The kernel still XSAVEs PKRU on context switch but the value in the
    buffer is no longer used and never restored from the buffer.

    This still needs to be cleaned up, but the series is already 40+
    patches large and the cleanup of this is not a functional problem.

    The functional issues of PKRU management are fully addressed with the
    series as is.

It applies on top of

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master

and is also available via git:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu

This is a follow up to V1 which can be found here:

     https://lore.kernel.org/r/20210611161523.508908024@linutronix.de

Changes vs. V1:

  - Fix the broken init_fpstate initialization

  - Make xstate copy to ptrace work correctly

  - Sanitize the regset functions more and get rid of
    fpstate_sanitize_xstate().

  - Addressed review comments

  - Picked up tags

Thanks,

	tglx
---
 arch/x86/events/intel/lbr.c          |    6 
 arch/x86/include/asm/fpu/internal.h  |  179 +++-------
 arch/x86/include/asm/fpu/xstate.h    |   70 ++-
 arch/x86/include/asm/pgtable.h       |   57 ---
 arch/x86/include/asm/pkeys.h         |    9 
 arch/x86/include/asm/pkru.h          |   62 +++
 arch/x86/include/asm/processor.h     |    9 
 arch/x86/include/asm/special_insns.h |   14 
 arch/x86/kernel/cpu/common.c         |   29 -
 arch/x86/kernel/fpu/core.c           |  242 +++++++++----
 arch/x86/kernel/fpu/init.c           |    4 
 arch/x86/kernel/fpu/regset.c         |  177 ++++-----
 arch/x86/kernel/fpu/signal.c         |   59 +--
 arch/x86/kernel/fpu/xstate.c         |  620 ++++++++++++++---------------------
 arch/x86/kernel/process.c            |   19 +
 arch/x86/kernel/process_64.c         |   28 +
 arch/x86/kvm/svm/sev.c               |    1 
 arch/x86/kvm/x86.c                   |   56 +--
 arch/x86/mm/extable.c                |    2 
 arch/x86/mm/fault.c                  |    2 
 arch/x86/mm/pkeys.c                  |   22 -
 include/linux/pkeys.h                |    4 
 22 files changed, 818 insertions(+), 853 deletions(-)




* [patch V2 01/52] x86/fpu: Make init_fpstate correct with optimized XSAVE
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 19:15   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
                   ` (52 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The XSAVE init code initializes all enabled and supported components with
XRSTOR(S) to init state. Then it XSAVEs the state of the components back
into init_fpstate which is used in several places to fill in the init state
of components.

This works correctly with XSAVE, but not with XSAVEOPT and XSAVES because
those use the init optimization and skip writing state of components which
are in init state. So init_fpstate.xsave still contains all zeroes after
this operation.

There are two ways to solve that:

   1) Use XSAVE unconditionally, but that requires reshuffling the buffer when
      XSAVES is enabled because XSAVES uses compacted format.

   2) Save the components which are known to have a non-zero init state by other
      means.

Looking deeper #2 is the right thing to do because all components the
kernel supports have all-zeroes init state except the legacy features (FP,
SSE). Those cannot be hardcoded because the states are not identical on all
CPUs, but they can be saved with FXSAVE which avoids all conditionals.

Use FXSAVE to save the legacy FP/SSE components in init_fpstate along with
a BUILD_BUG_ON() which reminds developers to validate that a newly added
component has all zeroes init state. As a bonus remove the now unused
copy_xregs_to_kernel_booting() crutch.

The XSAVE and reshuffle method can still be implemented in the unlikely
case that components are added which have a non-zero init state and no
other means to save them. For now FXSAVE is just simple and good enough.

Add fxsave_to_kernel() for that purpose. The duplication with
copy_fxregs_to_kernel() is cleaned up later.
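
As a rough illustration of the problem described above, the following
user space toy model (not kernel code) shows why a save with the init
optimization leaves a zeroed buffer untouched even when a component has
a non-zero init value. All toy_* names, the component count and the
component size are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model: two components of 8 bytes each */
#define TOY_NR_COMPONENTS	2
#define TOY_COMPONENT_SIZE	8

struct toy_cpu {
	uint8_t regs[TOY_NR_COMPONENTS][TOY_COMPONENT_SIZE];
	/* Architectural init values of each component */
	uint8_t init[TOY_NR_COMPONENTS][TOY_COMPONENT_SIZE];
};

struct toy_buf {
	uint8_t data[TOY_NR_COMPONENTS][TOY_COMPONENT_SIZE];
};

static bool toy_in_init_state(const struct toy_cpu *cpu, int i)
{
	assert(i >= 0 && i < TOY_NR_COMPONENTS);
	return !memcmp(cpu->regs[i], cpu->init[i], TOY_COMPONENT_SIZE);
}

/* Models a plain save: every component is written out */
static void toy_save_full(const struct toy_cpu *cpu, struct toy_buf *buf)
{
	for (int i = 0; i < TOY_NR_COMPONENTS; i++)
		memcpy(buf->data[i], cpu->regs[i], TOY_COMPONENT_SIZE);
}

/* Models the init optimization: components in init state are skipped */
static void toy_save_opt(const struct toy_cpu *cpu, struct toy_buf *buf)
{
	for (int i = 0; i < TOY_NR_COMPONENTS; i++) {
		if (toy_in_init_state(cpu, i))
			continue;
		memcpy(buf->data[i], cpu->regs[i], TOY_COMPONENT_SIZE);
	}
}

/* Returns true when the saved buffer holds the non-zero init value */
static bool toy_captures_init_state(bool use_init_optimization)
{
	struct toy_cpu cpu;
	struct toy_buf buf;

	memset(&cpu, 0, sizeof(cpu));
	memset(&buf, 0, sizeof(buf));

	/* Component 0 has a non-zero init value, like the legacy FP state */
	cpu.init[0][0] = 0x7f;
	cpu.init[0][1] = 0x03;

	/* Put all components into init state, like the restore at boot */
	memcpy(cpu.regs, cpu.init, sizeof(cpu.regs));

	if (use_init_optimization)
		toy_save_opt(&cpu, &buf);
	else
		toy_save_full(&cpu, &buf);

	return !memcmp(buf.data[0], cpu.init[0], TOY_COMPONENT_SIZE);
}
```

In this model toy_captures_init_state(false) succeeds while
toy_captures_init_state(true) does not, which mirrors why the legacy
state has to be saved by other means.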

Fixes: 6bad06b76892 ("x86, xsave: Use xsaveopt in context-switch path when supported")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
---
V2: New patch.
---
 arch/x86/include/asm/fpu/internal.h |   30 +++++++--------------------
 arch/x86/kernel/fpu/xstate.c        |   39 +++++++++++++++++++++++++++++++++---
 2 files changed, 44 insertions(+), 25 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -204,6 +204,14 @@ static inline void copy_fxregs_to_kernel
 		asm volatile("fxsaveq %[fx]" : [fx] "=m" (fpu->state.fxsave));
 }
 
+static inline void fxsave_to_kernel(struct fxregs_state *fx)
+{
+	if (IS_ENABLED(CONFIG_X86_32))
+		asm volatile( "fxsave %[fx]" : [fx] "=m" (*fx));
+	else
+		asm volatile("fxsaveq %[fx]" : [fx] "=m" (*fx));
+}
+
 /* These macros all use (%edi)/(%rdi) as the single memory argument. */
 #define XSAVE		".byte " REX_PREFIX "0x0f,0xae,0x27"
 #define XSAVEOPT	".byte " REX_PREFIX "0x0f,0xae,0x37"
@@ -270,28 +278,6 @@ static inline void copy_fxregs_to_kernel
 
 /*
  * This function is called only during boot time when x86 caps are not set
- * up and alternative can not be used yet.
- */
-static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate)
-{
-	u64 mask = xfeatures_mask_all;
-	u32 lmask = mask;
-	u32 hmask = mask >> 32;
-	int err;
-
-	WARN_ON(system_state != SYSTEM_BOOTING);
-
-	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-	else
-		XSTATE_OP(XSAVE, xstate, lmask, hmask, err);
-
-	/* We should never fault when copying to a kernel buffer: */
-	WARN_ON_FPU(err);
-}
-
-/*
- * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
  */
 static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -441,12 +441,35 @@ static void __init print_xstate_offset_s
 }
 
 /*
+ * All supported features have either init state all zeros or are
+ * handled in setup_init_fpu() individually. This is an explicit
+ * feature list and does not use XFEATURE_MASK*SUPPORTED to catch
+ * newly added supported features at build time and make people
+ * actually look at the init state for the new feature.
+ */
+#define XFEATURES_INIT_FPSTATE_HANDLED		\
+	(XFEATURE_MASK_FP |			\
+	 XFEATURE_MASK_SSE |			\
+	 XFEATURE_MASK_YMM |			\
+	 XFEATURE_MASK_OPMASK |			\
+	 XFEATURE_MASK_ZMM_Hi256 |		\
+	 XFEATURE_MASK_Hi16_ZMM	 |		\
+	 XFEATURE_MASK_PKRU |			\
+	 XFEATURE_MASK_BNDREGS |		\
+	 XFEATURE_MASK_BNDCSR |			\
+	 XFEATURE_MASK_PASID)
+
+/*
  * setup the xstate image representing the init state
  */
 static void __init setup_init_fpu_buf(void)
 {
 	static int on_boot_cpu __initdata = 1;
 
+	BUILD_BUG_ON((XFEATURE_MASK_USER_SUPPORTED |
+		      XFEATURE_MASK_SUPERVISOR_SUPPORTED) !=
+		     XFEATURES_INIT_FPSTATE_HANDLED);
+
 	WARN_ON_FPU(!on_boot_cpu);
 	on_boot_cpu = 0;
 
@@ -466,10 +489,20 @@ static void __init setup_init_fpu_buf(vo
 	copy_kernel_to_xregs_booting(&init_fpstate.xsave);
 
 	/*
-	 * Dump the init state again. This is to identify the init state
-	 * of any feature which is not represented by all zero's.
+	 * All components are now in init state. Read the state back so
+	 * that init_fpstate contains all non-zero init state. This is only
+	 * working with XSAVE, but not with XSAVEOPT and XSAVES because
+	 * those use the init optimization which skips writing data for
+	 * components in init state. So XSAVE could be used, but that would
+	 * require to reshuffle the data when XSAVES is available because
+	 * XSAVES uses xstate compaction. But doing so is a pointless
+	 * exercise because most components have an all zeros init state
+	 * except for the legacy ones (FP and SSE). Those can be saved with
+	 * FXSAVE into the legacy area. Adding new features requires to
+	 * ensure that init state is all zeroes or if not to add the
+	 * necessary handling here.
 	 */
-	copy_xregs_to_kernel_booting(&init_fpstate.xsave);
+	fxsave_to_kernel(&init_fpstate.fxsave);
 }
 
 static int xfeature_uncompacted_offset(int xfeature_nr)



* [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 01/52] x86/fpu: Make init_fpstate correct with optimized XSAVE Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-15 11:07   ` Borislav Petkov
  2021-06-16 22:02   ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 03/52] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
                   ` (51 subsequent siblings)
  53 siblings, 2 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The gap handling in copy_xstate_to_kernel() is wrong in two aspects when
XSAVES is in use.

  1) xstate.i387.xmm_space is only copied when the SSE feature bit is
     set. This is not correct because YMM (AVX) shares the XMM space and
     that state must also be copied if only the YMM feature bit is set,
     as is already done for MXCSR.

  2) Using init_fpstate for copying the init state of features which are
     not set in the xstate header is only correct for the legacy area, but
     not for the extended features area because when XSAVES is in use then
     init_fpstate is in compacted form which means the xstate offsets which
     are used to copy from init_fpstate are not valid.

     Fortunately this is not a real problem today because all extended
     features in use have an all zeros init state, but it is wrong
     nevertheless and with a potentially dynamically sized init_fpstate
     this would result in access outside of the init_fpstate.

Fix this by:

  1) Copying XMM space when the SSE or the YMM feature bits are set

  2) Keeping track of the last copied state in the target buffer and
     explicitly zero it when there is a feature or alignment gap.

     Use the compacted offset when accessing the extended feature space
     in init_fpstate.
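
The gap tracking can be sketched in user space roughly as follows. This
is a simplified model, not the kernel implementation: it writes at
absolute offsets into a flat buffer instead of using struct membuf, and
the layout tables, sizes and toy_* names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical non-compacted layout: three features with gaps between them */
#define TOY_NR_FEATURES	3
#define TOY_BUF_SIZE	64

static const unsigned int toy_offsets[TOY_NR_FEATURES] = {  0, 16, 40 };
static const unsigned int toy_sizes[TOY_NR_FEATURES]   = {  8, 16,  8 };

/*
 * Copy the features set in @enabled from @src (one pointer per feature)
 * into the non-compacted @dst buffer. Every byte which is not covered by
 * a copied feature is zeroed explicitly.
 */
static void toy_copy_with_gap_zeroing(uint8_t *dst, const uint8_t *src[],
				      uint64_t enabled)
{
	unsigned int zerofrom = 0;

	for (int i = 0; i < TOY_NR_FEATURES; i++) {
		/* Disabled features are covered by the gap zeroing below */
		if (!(enabled & (1ULL << i)))
			continue;

		/* Feature or alignment gap: zero up to the start of feature i */
		if (zerofrom < toy_offsets[i])
			memset(dst + zerofrom, 0, toy_offsets[i] - zerofrom);

		assert(toy_offsets[i] + toy_sizes[i] <= TOY_BUF_SIZE);
		memcpy(dst + toy_offsets[i], src[i], toy_sizes[i]);

		/* Remember where the last copied state ended */
		zerofrom = toy_offsets[i] + toy_sizes[i];
	}

	/* Zero whatever is left at the end of the destination buffer */
	if (zerofrom < TOY_BUF_SIZE)
		memset(dst + zerofrom, 0, TOY_BUF_SIZE - zerofrom);
}
```

With features 0 and 2 enabled, the space of the disabled feature 1 and
both alignment gaps end up zeroed without ever reading from a source
buffer at a non-compacted offset.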

Fixes: b8be15d58806 ("x86/fpu/xstate: Re-enable XSAVES")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
---
V2: New patch
---
 arch/x86/kernel/fpu/xstate.c |  105 ++++++++++++++++++++++++-------------------
 1 file changed, 61 insertions(+), 44 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1082,20 +1082,10 @@ static inline bool xfeatures_mxcsr_quirk
 	return true;
 }
 
-static void fill_gap(struct membuf *to, unsigned *last, unsigned offset)
+static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
+			 void *init_xstate, unsigned int size)
 {
-	if (*last >= offset)
-		return;
-	membuf_write(to, (void *)&init_fpstate.xsave + *last, offset - *last);
-	*last = offset;
-}
-
-static void copy_part(struct membuf *to, unsigned *last, unsigned offset,
-		      unsigned size, void *from)
-{
-	fill_gap(to, last, offset);
-	membuf_write(to, from, size);
-	*last = offset + size;
+	membuf_write(to, from_xstate ? xstate : init_xstate, size);
 }
 
 /*
@@ -1107,10 +1097,10 @@ static void copy_part(struct membuf *to,
  */
 void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 {
+	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
+	struct xregs_state *xinit = &init_fpstate.xsave;
 	struct xstate_header header;
-	const unsigned off_mxcsr = offsetof(struct fxregs_state, mxcsr);
-	unsigned size = to.left;
-	unsigned last = 0;
+	unsigned int zerofrom;
 	int i;
 
 	/*
@@ -1120,41 +1110,68 @@ void copy_xstate_to_kernel(struct membuf
 	header.xfeatures = xsave->header.xfeatures;
 	header.xfeatures &= xfeatures_mask_user();
 
-	if (header.xfeatures & XFEATURE_MASK_FP)
-		copy_part(&to, &last, 0, off_mxcsr, &xsave->i387);
-	if (header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM))
-		copy_part(&to, &last, off_mxcsr,
-			  MXCSR_AND_FLAGS_SIZE, &xsave->i387.mxcsr);
-	if (header.xfeatures & XFEATURE_MASK_FP)
-		copy_part(&to, &last, offsetof(struct fxregs_state, st_space),
-			  128, &xsave->i387.st_space);
-	if (header.xfeatures & XFEATURE_MASK_SSE)
-		copy_part(&to, &last, xstate_offsets[XFEATURE_SSE],
-			  256, &xsave->i387.xmm_space);
-	/*
-	 * Fill xsave->i387.sw_reserved value for ptrace frame:
-	 */
-	copy_part(&to, &last, offsetof(struct fxregs_state, sw_reserved),
-		  48, xstate_fx_sw_bytes);
-	/*
-	 * Copy xregs_state->header:
-	 */
-	copy_part(&to, &last, offsetof(struct xregs_state, header),
-		  sizeof(header), &header);
+	/* Copy FP state up to MXCSR */
+	copy_feature(header.xfeatures & XFEATURE_MASK_FP, &to, &xsave->i387,
+		     &xinit->i387, off_mxcsr);
+
+	/* Copy MXCSR when SSE or YMM are set in the feature mask */
+	copy_feature(header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM),
+		     &to, &xsave->i387.mxcsr, &xinit->i387.mxcsr,
+		     MXCSR_AND_FLAGS_SIZE);
+
+	/* Copy the remaining FP state */
+	copy_feature(header.xfeatures & XFEATURE_MASK_FP,
+		     &to, &xsave->i387.st_space, &xinit->i387.st_space,
+		     sizeof(xsave->i387.st_space));
+
+	/* Copy the SSE state - shared with YMM */
+	copy_feature(header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM),
+		     &to, &xsave->i387.xmm_space, &xinit->i387.xmm_space,
+		     16 * 16);
+
+	/* Zero the padding area */
+	membuf_zero(&to, sizeof(xsave->i387.padding));
+
+	/* Copy xsave->i387.sw_reserved */
+	membuf_write(&to, xstate_fx_sw_bytes, sizeof(xsave->i387.sw_reserved));
+
+	/* Copy the user space relevant state of @xsave->header */
+	membuf_write(&to, &header, sizeof(header));
+
+	zerofrom = offsetof(struct xregs_state, extended_state_area);
 
 	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
 		/*
-		 * Copy only in-use xstates:
+		 * The ptrace buffer is XSAVE format which is non-compacted.
+		 * In non-compacted format disabled features still occupy
+		 * state space, but there is no state to copy from in the
+		 * compacted init_fpstate. The gap tracking will zero this
+		 * later.
+		 */
+		if (!(xfeatures_mask_user() & BIT_ULL(i)))
+			continue;
+
+		/*
+		 * If there was a feature or alignment gap, zero the space
+		 * in the destination buffer.
 		 */
-		if ((header.xfeatures >> i) & 1) {
-			void *src = __raw_xsave_addr(xsave, i);
+		if (zerofrom < xstate_offsets[i])
+			membuf_zero(&to, xstate_offsets[i] - zerofrom);
 
-			copy_part(&to, &last, xstate_offsets[i],
-				  xstate_sizes[i], src);
-		}
+		copy_feature(header.xfeatures & BIT_ULL(i), &to,
+			     __raw_xsave_addr(xsave, i),
+			     __raw_xsave_addr(xinit, i),
+			     xstate_sizes[i]);
 
+		/*
+		 * Keep track of the last copied state in the non-compacted
+		 * target buffer for gap zeroing.
+		 */
+		zerofrom = xstate_offsets[i] + xstate_sizes[i];
 	}
-	fill_gap(&to, &last, size);
+
+	if (to.left)
+		membuf_zero(&to, to.left);
 }
 
 /*



* [patch V2 03/52] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 01/52] x86/fpu: Make init_fpstate correct with optimized XSAVE Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-15 13:15   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 04/52] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
                   ` (50 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

This cannot work and it's unclear how that ever made a difference.

init_fpstate.xsave.header.xfeatures is always 0 so get_xsave_addr() will
always return a NULL pointer, which will prevent storing the default PKRU
value in init_fpstate.
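
A minimal user space sketch of that failure mode, assuming a lookup
which only hands out a pointer when the feature bit is set in the
header. The toy_* names and the reduced structure are hypothetical and
only model the behaviour described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit number used by this model only */
#define TOY_FEATURE_PKRU	9

struct toy_xsave {
	uint64_t xfeatures;	/* models header.xfeatures */
	uint32_t pkru;
};

/* Models the lookup: no pointer unless the feature bit is set */
static uint32_t *toy_get_pkru_addr(struct toy_xsave *xs)
{
	assert(xs);
	if (!(xs->xfeatures & (1ULL << TOY_FEATURE_PKRU)))
		return NULL;
	return &xs->pkru;
}

/* Models the reverted code: store only when the lookup succeeds */
static int toy_store_init_pkru(struct toy_xsave *xs, uint32_t value)
{
	uint32_t *pk = toy_get_pkru_addr(xs);

	if (!pk)
		return -1;
	*pk = value;
	return 0;
}
```

With xfeatures == 0 the store is never reached, so the buffer keeps its
old PKRU value no matter what is passed in.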

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Fix subject
---
 arch/x86/kernel/cpu/common.c |    5 -----
 arch/x86/mm/pkeys.c          |    6 ------
 2 files changed, 11 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,8 +466,6 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
-	struct pkru_state *pk;
-
 	/* check the boot processor, plus compile options for PKU: */
 	if (!cpu_feature_enabled(X86_FEATURE_PKU))
 		return;
@@ -478,9 +476,6 @@ static __always_inline void setup_pku(st
 		return;
 
 	cr4_set_bits(X86_CR4_PKE);
-	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
-	if (pk)
-		pk->pkru = init_pkru_value;
 	/*
 	 * Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
 	 * cpuid bit to be set.  We need to ensure that we
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,7 +10,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/fpu/internal.h>		/* init_fpstate			*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -154,7 +153,6 @@ static ssize_t init_pkru_read_file(struc
 static ssize_t init_pkru_write_file(struct file *file,
 		 const char __user *user_buf, size_t count, loff_t *ppos)
 {
-	struct pkru_state *pk;
 	char buf[32];
 	ssize_t len;
 	u32 new_init_pkru;
@@ -177,10 +175,6 @@ static ssize_t init_pkru_write_file(stru
 		return -EINVAL;
 
 	WRITE_ONCE(init_pkru_value, new_init_pkru);
-	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
-	if (!pk)
-		return -EINVAL;
-	pk->pkru = new_init_pkru;
 	return count;
 }
 



* [patch V2 04/52] x86/fpu: Mark various FPU states __ro_after_init
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (2 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 03/52] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 05/52] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
                   ` (49 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Nothing modifies these after booting.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/fpu/init.c   |    4 ++--
 arch/x86/kernel/fpu/xstate.c |   16 ++++++++++------
 2 files changed, 12 insertions(+), 8 deletions(-)

--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -89,7 +89,7 @@ static void fpu__init_system_early_gener
 /*
  * Boot time FPU feature detection code:
  */
-unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
+unsigned int mxcsr_feature_mask __ro_after_init = 0xffffffffu;
 EXPORT_SYMBOL_GPL(mxcsr_feature_mask);
 
 static void __init fpu__init_system_mxcsr(void)
@@ -135,7 +135,7 @@ static void __init fpu__init_system_gene
  * This is inherent to the XSAVE architecture which puts all state
  * components into a single, continuous memory block:
  */
-unsigned int fpu_kernel_xstate_size;
+unsigned int fpu_kernel_xstate_size __ro_after_init;
 EXPORT_SYMBOL_GPL(fpu_kernel_xstate_size);
 
 /* Get alignment of the TYPE. */
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -59,19 +59,23 @@ static short xsave_cpuid_features[] __in
  * This represents the full set of bits that should ever be set in a kernel
  * XSAVE buffer, both supervisor and user xstates.
  */
-u64 xfeatures_mask_all __read_mostly;
+u64 xfeatures_mask_all __ro_after_init;
 
-static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 
 /*
  * The XSAVE area of kernel can be in standard or compacted format;
  * it is always in standard format for user mode. This is the user
  * mode standard format size used for signal and ptrace frames.
  */
-unsigned int fpu_user_xstate_size;
+unsigned int fpu_user_xstate_size __ro_after_init;
 
 /*
  * Return whether the system supports a given xfeature.



* [patch V2 05/52] x86/fpu: Remove unused get_xsave_field_ptr()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (3 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 04/52] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 06/52] x86/fpu: Move inlines where they belong Thomas Gleixner
                   ` (48 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/xstate.h |    1 -
 arch/x86/kernel/fpu/xstate.c      |   30 ------------------------------
 2 files changed, 31 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -101,7 +101,6 @@ extern void __init update_regset_xstate_
 					     u64 xstate_mask);
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
-const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 struct membuf;
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -992,36 +992,6 @@ void *get_xsave_addr(struct xregs_state
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
 
-/*
- * This wraps up the common operations that need to occur when retrieving
- * data from xsave state.  It first ensures that the current task was
- * using the FPU and retrieves the data in to a buffer.  It then calculates
- * the offset of the requested field in the buffer.
- *
- * This function is safe to call whether the FPU is in use or not.
- *
- * Note that this only works on the current task.
- *
- * Inputs:
- *	@xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
- *	XFEATURE_SSE, etc...)
- * Output:
- *	address of the state in the xsave area or NULL if the state
- *	is not present or is in its 'init state'.
- */
-const void *get_xsave_field_ptr(int xfeature_nr)
-{
-	struct fpu *fpu = &current->thread.fpu;
-
-	/*
-	 * fpu__save() takes the CPU's xstate registers
-	 * and saves them off to the 'fpu memory buffer.
-	 */
-	fpu__save(fpu);
-
-	return get_xsave_addr(&fpu->state.xsave, xfeature_nr);
-}
-
 #ifdef CONFIG_ARCH_HAS_PKEYS
 
 /*



* [patch V2 06/52] x86/fpu: Move inlines where they belong
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (4 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 05/52] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 07/52] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
                   ` (47 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

They are only used in fpstate_init() and there is no point in having them
in a header just to make reading the code harder.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h |   14 --------------
 arch/x86/kernel/fpu/core.c          |   15 +++++++++++++++
 2 files changed, 15 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -87,20 +87,6 @@ extern void fpstate_init_soft(struct swr
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-static inline void fpstate_init_xstate(struct xregs_state *xsave)
-{
-	/*
-	 * XRSTORS requires these bits set in xcomp_bv, or it will
-	 * trigger #GP:
-	 */
-	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
-}
-
-static inline void fpstate_init_fxstate(struct fxregs_state *fx)
-{
-	fx->cwd = 0x37f;
-	fx->mxcsr = MXCSR_DEFAULT;
-}
 extern void fpstate_sanitize_xstate(struct fpu *fpu);
 
 #define user_insn(insn, output, input...)				\
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -181,6 +181,21 @@ void fpu__save(struct fpu *fpu)
 	fpregs_unlock();
 }
 
+static inline void fpstate_init_xstate(struct xregs_state *xsave)
+{
+	/*
+	 * XRSTORS requires these bits set in xcomp_bv, or it will
+	 * trigger #GP:
+	 */
+	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
+}
+
+static inline void fpstate_init_fxstate(struct fxregs_state *fx)
+{
+	fx->cwd = 0x37f;
+	fx->mxcsr = MXCSR_DEFAULT;
+}
+
 /*
  * Legacy x87 fpstate state init:
  */



* [patch V2 07/52] x86/fpu: Limit xstate copy size in xstateregs_set()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (5 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 06/52] x86/fpu: Move inlines where they belong Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 08/52] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
                   ` (46 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

If the count argument is larger than the xstate size, this will happily
copy beyond the end of xstate.
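
The difference between the old and the new size check can be modelled
in isolation like this. The helper names are made up for the sketch and
the size values used with them are arbitrary examples:

```c
#include <assert.h>
#include <errno.h>

/* Old check: rejects short buffers, but accepts oversized ones */
static int toy_check_old(unsigned int pos, unsigned int count,
			 unsigned int xstate_size)
{
	assert(xstate_size > 0);
	if (pos != 0 || count < xstate_size)
		return -EFAULT;
	return 0;
}

/* New check: only an exactly sized buffer is accepted */
static int toy_check_new(unsigned int pos, unsigned int count,
			 unsigned int xstate_size)
{
	assert(xstate_size > 0);
	if (pos != 0 || count != xstate_size)
		return -EFAULT;
	return 0;
}
```

For an assumed xstate size of 832 bytes and a count of 4096,
toy_check_old() returns 0 and a subsequent copy of count bytes would
overrun the buffer, while toy_check_new() returns -EFAULT.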

Fixes: 91c3dba7dbc1 ("x86/fpu/xstate: Fix PTRACE frames for XSAVES")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/regset.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -117,7 +117,7 @@ int xstateregs_set(struct task_struct *t
 	/*
 	 * A whole standard-format XSAVE buffer is needed:
 	 */
-	if ((pos != 0) || (count < fpu_user_xstate_size))
+	if (pos != 0 || count != fpu_user_xstate_size)
 		return -EFAULT;
 
 	xsave = &fpu->state.xsave;



* [patch V2 08/52] x86/fpu: Sanitize xstateregs_set()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (6 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 07/52] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-15 17:40   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 09/52] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate() Thomas Gleixner
                   ` (45 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

xstateregs_set() operates on a stopped task and tries to copy the provided
buffer into the task's fpu.state.xsave buffer.

Any error while copying or invalid state detected after copying results in
wiping the target task's FPU state completely including supervisor states.

That's just wrong. The caller supplied invalid data or has a problem with
unmapped memory, so there is absolutely no justification to corrupt the
target state.

Fix this with the following modifications:

 1) If data has to be copied from userspace, allocate a buffer and copy from
    user first.

 2) Use copy_kernel_to_xstate() unconditionally so that header checking
    works correctly.

 3) Return on error without corrupting the target state.

This prevents corrupting states and lets the caller deal with the problem
it caused in the first place.
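
The resulting pattern (stage the data, validate it, and only then touch
the target) can be sketched in user space as below. This is a model
under stated assumptions: malloc()/memcpy() stand in for
vmalloc()/copy_from_user(), the validation is a made-up feature mask
check, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_STATE_SIZE		32
/* Hypothetical set of feature bits which the model accepts */
#define TOY_VALID_FEATURES	0x7ULL

struct toy_state {
	uint64_t xfeatures;
	uint8_t  data[TOY_STATE_SIZE - sizeof(uint64_t)];
};

/* Stage the caller's bytes, validate them, and only then modify @target */
static int toy_set_state(struct toy_state *target, const void *src,
			 size_t count)
{
	struct toy_state *tmp;
	int ret = 0;

	assert(target && src);

	if (count != sizeof(*target))
		return -EFAULT;

	tmp = malloc(count);
	if (!tmp)
		return -ENOMEM;

	/* Stands in for copy_from_user(), which cannot fail in this model */
	memcpy(tmp, src, count);

	/* Validation happens on the staged copy, not on the target */
	if (tmp->xfeatures & ~TOY_VALID_FEATURES) {
		ret = -EINVAL;
		goto out;
	}

	*target = *tmp;
out:
	free(tmp);
	return ret;
}
```

A rejected buffer leaves the target exactly as it was, so the caller
gets an error code instead of a wiped state.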

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/xstate.h |    4 ---
 arch/x86/kernel/fpu/regset.c      |   44 +++++++++++++++-----------------------
 arch/x86/kernel/fpu/xstate.c      |   14 ++++++------
 3 files changed, 26 insertions(+), 36 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -111,8 +111,4 @@ void copy_supervisor_to_kernel(struct xr
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
-
-/* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_user_xstate_header(const struct xstate_header *hdr);
-
 #endif
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -2,11 +2,13 @@
 /*
  * FPU register's regset abstraction, for ptrace, core dumps, etc.
  */
+#include <linux/sched/task_stack.h>
+#include <linux/vmalloc.h>
+
 #include <asm/fpu/internal.h>
 #include <asm/fpu/signal.h>
 #include <asm/fpu/regset.h>
 #include <asm/fpu/xstate.h>
-#include <linux/sched/task_stack.h>
 
 /*
  * The xstateregs_active() routine is the same as the regset_fpregs_active() routine,
@@ -108,10 +110,10 @@ int xstateregs_set(struct task_struct *t
 		  const void *kbuf, const void __user *ubuf)
 {
 	struct fpu *fpu = &target->thread.fpu;
-	struct xregs_state *xsave;
+	struct xregs_state *tmpbuf = NULL;
 	int ret;
 
-	if (!boot_cpu_has(X86_FEATURE_XSAVE))
+	if (!static_cpu_has(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
 	/*
@@ -120,32 +122,22 @@ int xstateregs_set(struct task_struct *t
 	if (pos != 0 || count != fpu_user_xstate_size)
 		return -EFAULT;
 
-	xsave = &fpu->state.xsave;
-
-	fpu__prepare_write(fpu);
-
-	if (using_compacted_format()) {
-		if (kbuf)
-			ret = copy_kernel_to_xstate(xsave, kbuf);
-		else
-			ret = copy_user_to_xstate(xsave, ubuf);
-	} else {
-		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
-		if (!ret)
-			ret = validate_user_xstate_header(&xsave->header);
+	if (!kbuf) {
+		tmpbuf = vmalloc(count);
+		if (!tmpbuf)
+			return -ENOMEM;
+
+		if (copy_from_user(tmpbuf, ubuf, count)) {
+			ret = -EFAULT;
+			goto out;
+		}
 	}
 
-	/*
-	 * mxcsr reserved bits must be masked to zero for security reasons.
-	 */
-	xsave->i387.mxcsr &= mxcsr_feature_mask;
-
-	/*
-	 * In case of failure, mark all states as init:
-	 */
-	if (ret)
-		fpstate_init(&fpu->state);
+	fpu__prepare_write(fpu);
+	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
+out:
+	vfree(tmpbuf);
 	return ret;
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -552,7 +552,7 @@ int using_compacted_format(void)
 }
 
 /* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_user_xstate_header(const struct xstate_header *hdr)
+static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or supervisor features may be set */
 	if (hdr->xfeatures & ~xfeatures_mask_user())
@@ -1149,7 +1149,7 @@ void copy_xstate_to_kernel(struct membuf
 }
 
 /*
- * Convert from a ptrace standard-format kernel buffer to kernel XSAVES format
+ * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
  */
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
@@ -1196,14 +1196,16 @@ int copy_kernel_to_xstate(struct xregs_s
 	 */
 	xsave->header.xfeatures |= hdr.xfeatures;
 
+	/* mxcsr reserved bits must be masked to zero for historical reasons. */
+	xsave->i387.mxcsr &= mxcsr_feature_mask;
+
 	return 0;
 }
 
 /*
- * Convert from a ptrace or sigreturn standard-format user-space buffer to
- * kernel XSAVES format and copy to the target thread. This is called from
- * xstateregs_set(), as well as potentially from the sigreturn() and
- * rt_sigreturn() system calls.
+ * Convert from a sigreturn standard-format user-space buffer to kernel
+ * XSAVE[S] format and copy to the target thread. This is called from the
+ * sigreturn() and rt_sigreturn() system calls.
  */
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 {



* [patch V2 09/52] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (7 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 08/52] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-16 15:02   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 10/52] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
                   ` (44 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Instead of masking out reserved bits, check them and reject the provided
state as invalid if not zero.
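
The difference between the two policies can be sketched as follows. This is
an illustration only: DEMO_MXCSR_MASK is an assumed constant (0x0000ffbf is
the architectural default when FXSAVE reports an MXCSR_MASK of zero), whereas
the kernel computes mxcsr_feature_mask at boot. The helper names are
hypothetical.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed mask for illustration; the kernel derives its mask at boot. */
#define DEMO_MXCSR_MASK 0x0000ffbfu

/* Old behaviour: silently drop reserved bits. */
static uint32_t demo_mxcsr_mask_bits(uint32_t mxcsr)
{
	return mxcsr & DEMO_MXCSR_MASK;
}

/* New behaviour: refuse a value that has any reserved bit set. */
static int demo_mxcsr_check(uint32_t mxcsr)
{
	if (mxcsr & ~DEMO_MXCSR_MASK)
		return -EINVAL;
	return 0;
}
```

With masking, the caller never learns that the value it supplied was
changed. With the check, the invalid value is reported back as an error.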

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/kernel/fpu/xstate.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1166,6 +1166,14 @@ int copy_kernel_to_xstate(struct xregs_s
 	if (validate_user_xstate_header(&hdr))
 		return -EINVAL;
 
+	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
+		const u32 *mxcsr = kbuf + offsetof(struct fxregs_state, mxcsr);
+
+		/* Reserved bits in MXCSR must be zero. */
+		if (*mxcsr & ~mxcsr_feature_mask)
+			return -EINVAL;
+	}
+
 	for (i = 0; i < XFEATURE_MAX; i++) {
 		u64 mask = ((u64)1 << i);
 
@@ -1196,9 +1204,6 @@ int copy_kernel_to_xstate(struct xregs_s
 	 */
 	xsave->header.xfeatures |= hdr.xfeatures;
 
-	/* mxcsr reserved bits must be masked to zero for historical reasons. */
-	xsave->i387.mxcsr &= mxcsr_feature_mask;
-
 	return 0;
 }
 



* [patch V2 10/52] x86/fpu: Simplify PTRACE_GETREGS code
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (8 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 09/52] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 11/52] x86/fpu: Rewrite xfpregs_set() Thomas Gleixner
                   ` (43 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

ptrace() has interfaces that let a ptracer inspect a ptracee's register state.
This includes XSAVE state.  The ptrace() ABI includes a hardware-format XSAVE
buffer for both the SETREGS and GETREGS interfaces.

In the old days, the kernel buffer and the ptrace() ABI buffer were the
same boring non-compacted format.  But, since the advent of supervisor
states and the compacted format, the kernel buffer has diverged from the
format presented in the ABI.

This leads to two paths in the kernel:
1. Effectively a verbatim copy_to_user() which just copies the kernel buffer
   out to userspace.  This is used when the kernel buffer is kept in the
   non-compacted form which means that it shares a format with the ptrace
   ABI.
2. A one-state-at-a-time path: copy_xstate_to_kernel().  This is theoretically
   slower since it does a bunch of piecemeal copies.

Remove the verbatim copy case.  Speed probably does not matter in this path,
and the vast majority of new hardware will use the one-state-at-a-time path
anyway.  This ensures greater testing for the "slow" path.

This also makes enabling PKRU in this interface easier since a single path
can be patched instead of two.
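
As a rough illustration of what the one-state-at-a-time path does, the
following toy model expands a compacted buffer into a fixed-offset one. All
names, offsets and sizes are hypothetical. The real format also has a legacy
area, a header, alignment rules and init-state handling, all of which are
ignored here. Absent components are simply zero-filled in this sketch.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_NFEATURES	3
#define DEMO_UABI_SIZE	32

/* Hypothetical fixed layout of the non-compacted (UABI) buffer. */
static const unsigned int demo_offset[DEMO_NFEATURES] = { 0, 8, 24 };
static const unsigned int demo_size[DEMO_NFEATURES]   = { 8, 16, 8 };

/*
 * Expand a compacted buffer, in which only the components named in
 * @present are stored back to back, into the fixed-offset layout.
 */
static void demo_expand(uint8_t *uabi, const uint8_t *compacted,
			uint64_t present)
{
	unsigned int i, src = 0;

	memset(uabi, 0, DEMO_UABI_SIZE);

	for (i = 0; i < DEMO_NFEATURES; i++) {
		if (!(present & (1ULL << i)))
			continue;
		/* One piecemeal copy per present component. */
		memcpy(uabi + demo_offset[i], compacted + src, demo_size[i]);
		src += demo_size[i];
	}
}
```

A verbatim copy is only correct when both buffers share one layout, which is
why keeping just the piecemeal path removes a special case.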

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/regset.c |   22 ++--------------------
 arch/x86/kernel/fpu/xstate.c |    6 +++---
 2 files changed, 5 insertions(+), 23 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -77,32 +77,14 @@ int xstateregs_get(struct task_struct *t
 		struct membuf to)
 {
 	struct fpu *fpu = &target->thread.fpu;
-	struct xregs_state *xsave;
 
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	xsave = &fpu->state.xsave;
-
 	fpu__prepare_read(fpu);
 
-	if (using_compacted_format()) {
-		copy_xstate_to_kernel(to, xsave);
-		return 0;
-	} else {
-		fpstate_sanitize_xstate(fpu);
-		/*
-		 * Copy the 48 bytes defined by the software into the xsave
-		 * area in the thread struct, so that we can copy the whole
-		 * area to user using one user_regset_copyout().
-		 */
-		memcpy(&xsave->i387.sw_reserved, xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes));
-
-		/*
-		 * Copy the xstate memory layout.
-		 */
-		return membuf_write(&to, xsave, fpu_user_xstate_size);
-	}
+	copy_xstate_to_kernel(to, &fpu->state.xsave);
+	return 0;
 }
 
 int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1063,11 +1063,11 @@ static void copy_feature(bool from_xstat
 }
 
 /*
- * Convert from kernel XSAVES compacted format to standard format and copy
- * to a kernel-space ptrace buffer.
+ * Convert from kernel XSAVE or XSAVES compacted format to UABI
+ * non-compacted format and copy to a kernel-space ptrace buffer.
  *
  * It supports partial copy but pos always starts from zero. This is called
- * from xstateregs_get() and there we check the CPU has XSAVES.
+ * from xstateregs_get() and there we check the CPU has XSAVE.
  */
 void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 {



* [patch V2 11/52] x86/fpu: Rewrite xfpregs_set()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (9 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 10/52] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-16 15:22   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 12/52] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values Thomas Gleixner
                   ` (42 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Andy Lutomirski <luto@kernel.org>

xfpregs_set() was incomprehensible.  Almost all of the complexity was due
to trying to support nonsensically sized writes or -EFAULT errors that
would have partially or completely overwritten the destination before
failing.  Nonsensically sized input would only have been possible using
PTRACE_SETREGSET on REGSET_XFP.  Fortunately, it appears (based on Debian
code search results) that no one uses that API at all, let alone with the
wrong sized buffer.  Failed user access can be handled more cleanly by
first copying to kernel memory.

Just rewrite it to require sensible input.
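
A minimal sketch of what "sensible input" means here, assuming a
hypothetical struct demo_fx and demo_fx_set() (the 512 byte size matches the
FXSAVE area, everything else is illustrative): the write must start at
offset 0, must cover the whole structure, and is staged in a local copy
before the target is modified.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* The FXSAVE area is 512 bytes; the contents are opaque in this sketch. */
struct demo_fx {
	unsigned char bytes[512];
};

static int demo_fx_set(struct demo_fx *target, size_t pos, size_t count,
		       const void *buf)
{
	struct demo_fx newstate;

	/* No partial, offset or oversized writes. */
	if (pos != 0 || count != sizeof(newstate))
		return -EINVAL;

	/* Stage the input; any validation would operate on this copy. */
	memcpy(&newstate, buf, sizeof(newstate));

	memcpy(target, &newstate, sizeof(*target));
	return 0;
}
```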

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch picked up from Andy
---
 arch/x86/kernel/fpu/regset.c |   40 +++++++++++++++++++++++++---------------
 1 file changed, 25 insertions(+), 15 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -47,30 +47,40 @@ int xfpregs_set(struct task_struct *targ
 		const void *kbuf, const void __user *ubuf)
 {
 	struct fpu *fpu = &target->thread.fpu;
+	struct user32_fxsr_struct newstate;
 	int ret;
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
+	BUILD_BUG_ON(sizeof(newstate) != sizeof(struct fxregs_state));
+
+	if (!static_cpu_has(X86_FEATURE_FXSR))
 		return -ENODEV;
 
-	fpu__prepare_write(fpu);
-	fpstate_sanitize_xstate(fpu);
+	/* No funny business with partial or oversized writes is permitted. */
+	if (pos != 0 || count != sizeof(newstate))
+		return -EINVAL;
 
 	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-				 &fpu->state.fxsave, 0, -1);
+				 &newstate, 0, -1);
+	if (ret)
+		return ret;
+
+	/* Mask invalid MXCSR bits (for historical reasons). */
+	newstate.mxcsr &= mxcsr_feature_mask;
+
+	fpu__prepare_write(fpu);
+
+	/* Copy the state  */
+	memcpy(&fpu->state.fxsave, &newstate, sizeof(newstate));
+
+	/* Clear xmm8..15 (xmm_space is u32[], four entries per register) */
+	BUILD_BUG_ON(sizeof(fpu->state.fxsave.xmm_space) != 16 * 16);
+	memset(&fpu->state.fxsave.xmm_space[8 * 4], 0, 8 * 16);
 
-	/*
-	 * mxcsr reserved bits must be masked to zero for security reasons.
-	 */
-	fpu->state.fxsave.mxcsr &= mxcsr_feature_mask;
-
-	/*
-	 * update the header bits in the xsave header, indicating the
-	 * presence of FP and SSE state.
-	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE))
+	/* Mark FP and SSE as in use when XSAVE is enabled */
+	if (use_xsave())
 		fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FPSSE;
 
-	return ret;
+	return 0;
 }
 
 int xstateregs_get(struct task_struct *target, const struct user_regset *regset,



* [patch V2 12/52] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (10 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 11/52] x86/fpu: Rewrite xfpregs_set() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-16 15:31   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 13/52] x86/fpu: Clean up fpregs_set() Thomas Gleixner
                   ` (41 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Andy Lutomirski <luto@kernel.org>

We're not doing anyone any favors by accepting and silently changing an
invalid MXCSR value supplied via ptrace().  Instead, return -EINVAL on
invalid input.

If this breaks something, we can revert it.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch. Picked up from Andy.
---
 arch/x86/kernel/fpu/regset.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
---
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -65,8 +65,9 @@ int xfpregs_set(struct task_struct *targ
 	if (ret)
 		return ret;
 
-	/* Mask invalid MXCSR bits (for historical reasons). */
-	newstate.mxcsr &= mxcsr_feature_mask;
+	/* Do not allow an invalid MXCSR value. */
+	if (newstate.mxcsr & ~mxcsr_feature_mask)
+		return -EINVAL;
 
 	fpu__prepare_write(fpu);
 



* [patch V2 13/52] x86/fpu: Clean up fpregs_set()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (11 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 12/52] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-16 15:42   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 14/52] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
                   ` (40 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Andy Lutomirski <luto@kernel.org>

fpregs_set() had unnecessary complexity to support short or nonzero-offset
writes and to handle the case in which a copy from userspace overwrites
some of the target buffer and then fails.  Support for partial writes is
useless -- just require that the write have offset 0 and the correct size,
and copy into a temporary kernel buffer to avoid clobbering the state if
the user access fails.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch. Picked up from Andy
---
 arch/x86/kernel/fpu/regset.c |   27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)
---
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -305,31 +305,32 @@ int fpregs_set(struct task_struct *targe
 	struct user_i387_ia32_struct env;
 	int ret;
 
-	fpu__prepare_write(fpu);
-	fpstate_sanitize_xstate(fpu);
+	/* No funny business with partial or oversized writes is permitted. */
+	if (pos != 0 || count != sizeof(struct user_i387_ia32_struct))
+		return -EINVAL;
 
 	if (!boot_cpu_has(X86_FEATURE_FPU))
 		return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
-		return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-					  &fpu->state.fsave, 0,
-					  -1);
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
+	if (ret)
+		return ret;
 
-	if (pos > 0 || count < sizeof(env))
-		convert_from_fxsr(&env, target);
+	fpu__prepare_write(fpu);
 
-	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
-	if (!ret)
+	if (static_cpu_has(X86_FEATURE_FXSR))
 		convert_to_fxsr(&target->thread.fpu.state.fxsave, &env);
+	else
+		memcpy(&target->thread.fpu.state.fsave, &env, sizeof(env));
 
 	/*
-	 * update the header bit in the xsave header, indicating the
+	 * Update the header bit in the xsave header, indicating the
 	 * presence of FP.
 	 */
-	if (boot_cpu_has(X86_FEATURE_XSAVE))
+	if (static_cpu_has(X86_FEATURE_XSAVE))
 		fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FP;
-	return ret;
+
+	return 0;
 }
 
 #endif	/* CONFIG_X86_32 || CONFIG_IA32_EMULATION */



* [patch V2 14/52] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (12 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 13/52] x86/fpu: Clean up fpregs_set() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-16 16:13   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 15/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in xfpregs_get() Thomas Gleixner
                   ` (39 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

When XSAVE with init state optimization is used, a component's state in the
task's xsave buffer can be stale when the corresponding feature bit is not
set.
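
A reader of the buffer therefore has to consult the feature bit before
trusting the stored bytes. A minimal sketch, assuming a hypothetical two-field
struct demo_fp in place of the real FP state (0x37f is the x87 control word
init value, bit 0 is the FP feature bit):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced FP component. */
struct demo_fp {
	uint16_t cwd;
	uint16_t swd;
};

static const struct demo_fp demo_init_fp = { .cwd = 0x37f, .swd = 0 };

/*
 * Read the FP component without modifying the saved buffer: if the
 * feature bit is clear, the saved bytes may be stale, so hand out the
 * init state instead.
 */
static void demo_read_fp(struct demo_fp *out, const struct demo_fp *saved,
			 uint64_t xfeatures)
{
	const struct demo_fp *src = (xfeatures & 1) ? saved : &demo_init_fp;

	memcpy(out, src, sizeof(*out));
}
```

The saved buffer is never rewritten. The choice is made at copy time.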

fpregs_get() and xfpregs_get() invoke fpstate_sanitize_xstate() to update
the task's xsave buffer before retrieving the FX or FP state. That's just
duplicated code as copy_xstate_to_kernel() already handles this correctly.

Add a copy mode argument to the function which allows restricting the state
copy to the FP and SSE features.

Also rename the function to copy_uabi_xstate_to_membuf() so the name
reflects what it is doing.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/include/asm/fpu/xstate.h |   12 +++++++++-
 arch/x86/kernel/fpu/regset.c      |    2 -
 arch/x86/kernel/fpu/xstate.c      |   42 ++++++++++++++++++++++++++++----------
 3 files changed, 42 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -103,12 +103,20 @@ extern void __init update_regset_xstate_
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
-struct membuf;
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
 void copy_supervisor_to_kernel(struct xregs_state *xsave);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
+enum xstate_copy_mode {
+	XSTATE_COPY_FP,
+	XSTATE_COPY_FX,
+	XSTATE_COPY_XSAVE,
+};
+
+struct membuf;
+void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave,
+				enum xstate_copy_mode mode);
+
 #endif
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -94,7 +94,7 @@ int xstateregs_get(struct task_struct *t
 
 	fpu__prepare_read(fpu);
 
-	copy_xstate_to_kernel(to, &fpu->state.xsave);
+	copy_uabi_xstate_to_membuf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
 	return 0;
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1062,14 +1062,20 @@ static void copy_feature(bool from_xstat
 	membuf_write(to, from_xstate ? xstate : init_xstate, size);
 }
 
-/*
- * Convert from kernel XSAVE or XSAVES compacted format to UABI
- * non-compacted format and copy to a kernel-space ptrace buffer.
+/**
+ * copy_uabi_xstate_to_membuf - Copy kernel saved xstate to a UABI buffer
+ * @to:		membuf descriptor
+ * @xsave:	The kernel xstate buffer to copy from
+ * @copy_mode:	The requested copy mode
  *
- * It supports partial copy but pos always starts from zero. This is called
- * from xstateregs_get() and there we check the CPU has XSAVE.
+ * Converts from kernel XSAVE or XSAVES compacted format to UABI conforming
+ * format, i.e. from the kernel internal hardware dependent storage format
+ * to the requested @copy_mode. UABI XSTATE is always uncompacted!
+ *
+ * It supports partial copy but @to.pos always starts from zero.
  */
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
+void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave,
+				enum xstate_copy_mode copy_mode)
 {
 	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
 	struct xregs_state *xinit = &init_fpstate.xsave;
@@ -1077,12 +1083,22 @@ void copy_xstate_to_kernel(struct membuf
 	unsigned int zerofrom;
 	int i;
 
-	/*
-	 * The destination is a ptrace buffer; we put in only user xstates:
-	 */
-	memset(&header, 0, sizeof(header));
 	header.xfeatures = xsave->header.xfeatures;
-	header.xfeatures &= xfeatures_mask_user();
+
+	/* Mask out the feature bits depending on copy mode */
+	switch (copy_mode) {
+	case XSTATE_COPY_FP:
+		header.xfeatures &= XFEATURE_MASK_FP;
+		break;
+
+	case XSTATE_COPY_FX:
+		header.xfeatures &= XFEATURE_MASK_FP | XFEATURE_MASK_SSE;
+		break;
+
+	case XSTATE_COPY_XSAVE:
+		header.xfeatures &= xfeatures_mask_user();
+		break;
+	}
 
 	/* Copy FP state up to MXCSR */
 	copy_feature(header.xfeatures & XFEATURE_MASK_FP, &to, &xsave->i387,
@@ -1103,6 +1119,9 @@ void copy_xstate_to_kernel(struct membuf
 		     &to, &xsave->i387.xmm_space, &xinit->i387.xmm_space,
 		     16 * 16);
 
+	if (copy_mode != XSTATE_COPY_XSAVE)
+		goto out;
+
 	/* Zero the padding area */
 	membuf_zero(&to, sizeof(xsave->i387.padding));
 
@@ -1144,6 +1163,7 @@ void copy_xstate_to_kernel(struct membuf
 		zerofrom = xstate_offsets[i] + xstate_sizes[i];
 	}
 
+out:
 	if (to.left)
 		membuf_zero(&to, to.left);
 }



* [patch V2 15/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in xfpregs_get()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (13 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 14/52] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-17  8:59   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 16/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in fpregs_get() Thomas Gleixner
                   ` (38 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Use the new functionality of copy_uabi_xstate_to_membuf() to retrieve the
FX state when XSAVE* is in use. This avoids overwriting the FPU state
buffer with fpstate_sanitize_xstate(), which is error prone and duplicates
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/kernel/fpu/regset.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -33,13 +33,18 @@ int xfpregs_get(struct task_struct *targ
 {
 	struct fpu *fpu = &target->thread.fpu;
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR))
+	if (!static_cpu_has(X86_FEATURE_FXSR))
 		return -ENODEV;
 
 	fpu__prepare_read(fpu);
-	fpstate_sanitize_xstate(fpu);
 
-	return membuf_write(&to, &fpu->state.fxsave, sizeof(struct fxregs_state));
+	if (!use_xsave()) {
+		return membuf_write(&to, &fpu->state.fxsave,
+				    sizeof(fpu->state.fxsave));
+	}
+
+	copy_uabi_xstate_to_membuf(to, &fpu->state.xsave, XSTATE_COPY_FX);
+	return 0;
 }
 
 int xfpregs_set(struct task_struct *target, const struct user_regset *regset,



* [patch V2 16/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in fpregs_get()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (14 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 15/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in xfpregs_get() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-17 11:50   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 17/52] x86/fpu: Remove fpstate_sanitize_xstate() Thomas Gleixner
                   ` (37 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Use the new functionality of copy_uabi_xstate_to_membuf() to retrieve the
FX state when XSAVE* is in use. This avoids overwriting the FPU state
buffer with fpstate_sanitize_xstate(), which is error prone and duplicates
code.
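
The state is collected through a bounded output cursor that points at a
temporary buffer rather than at the task's FPU state. A userspace model of
such a cursor, loosely following the kernel's struct membuf but with
hypothetical demo_ names and simplified return values:

```c
#include <stddef.h>
#include <string.h>

/* Bounded output cursor: where to write next and how much room is left. */
struct demo_membuf {
	unsigned char *p;
	size_t left;
};

/* Copy at most @size bytes, clamped to the remaining room. */
static size_t demo_membuf_write(struct demo_membuf *s, const void *v,
				size_t size)
{
	if (size > s->left)
		size = s->left;
	memcpy(s->p, v, size);
	s->p += size;
	s->left -= size;
	return s->left;
}

/* Zero-fill at most @size bytes, clamped to the remaining room. */
static size_t demo_membuf_zero(struct demo_membuf *s, size_t size)
{
	if (size > s->left)
		size = s->left;
	memset(s->p, 0, size);
	s->p += size;
	s->left -= size;
	return s->left;
}
```

Pointing the cursor at an on-stack buffer gives the caller a private copy to
convert from, so the source state stays untouched.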

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/kernel/fpu/regset.c |   30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -211,10 +211,10 @@ static inline u32 twd_fxsr_to_i387(struc
  * FXSR floating point environment conversions.
  */
 
-void
-convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
+static void __convert_from_fxsr(struct user_i387_ia32_struct *env,
+				struct task_struct *tsk,
+				struct fxregs_state *fxsave)
 {
-	struct fxregs_state *fxsave = &tsk->thread.fpu.state.fxsave;
 	struct _fpreg *to = (struct _fpreg *) &env->st_space[0];
 	struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0];
 	int i;
@@ -248,6 +248,12 @@ convert_from_fxsr(struct user_i387_ia32_
 		memcpy(&to[i], &from[i], sizeof(to[0]));
 }
 
+void
+convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
+{
+	__convert_from_fxsr(env, tsk, &tsk->thread.fpu.state.fxsave);
+}
+
 void convert_to_fxsr(struct fxregs_state *fxsave,
 		     const struct user_i387_ia32_struct *env)
 
@@ -280,25 +286,29 @@ int fpregs_get(struct task_struct *targe
 {
 	struct fpu *fpu = &target->thread.fpu;
 	struct user_i387_ia32_struct env;
+	struct fxregs_state fxsave, *fx;
 
 	fpu__prepare_read(fpu);
 
-	if (!boot_cpu_has(X86_FEATURE_FPU))
+	if (!static_cpu_has(X86_FEATURE_FPU))
 		return fpregs_soft_get(target, regset, to);
 
-	if (!boot_cpu_has(X86_FEATURE_FXSR)) {
+	if (!static_cpu_has(X86_FEATURE_FXSR)) {
 		return membuf_write(&to, &fpu->state.fsave,
 				    sizeof(struct fregs_state));
 	}
 
-	fpstate_sanitize_xstate(fpu);
+	if (use_xsave()) {
+		struct membuf mb = { .p = &fxsave, .left = sizeof(fxsave) };
 
-	if (to.left == sizeof(env)) {
-		convert_from_fxsr(to.p, target);
-		return 0;
+		/* Handle init state optimized xstate correctly */
+		copy_uabi_xstate_to_membuf(mb, &fpu->state.xsave, XSTATE_COPY_FP);
+		fx = &fxsave;
+	} else {
+		fx = &fpu->state.fxsave;
 	}
 
-	convert_from_fxsr(&env, target);
+	__convert_from_fxsr(&env, target, fx);
 	return membuf_write(&to, &env, sizeof(env));
 }
 



* [patch V2 17/52] x86/fpu: Remove fpstate_sanitize_xstate()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (15 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 16/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in fpregs_get() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 18/52] x86/fpu: Get rid of using_compacted_format() Thomas Gleixner
                   ` (36 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

No more users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/include/asm/fpu/internal.h |    2 
 arch/x86/kernel/fpu/xstate.c        |   79 ------------------------------------
 2 files changed, 81 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -87,8 +87,6 @@ extern void fpstate_init_soft(struct swr
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-extern void fpstate_sanitize_xstate(struct fpu *fpu);
-
 #define user_insn(insn, output, input...)				\
 ({									\
 	int err;							\
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -129,85 +129,6 @@ static bool xfeature_is_supervisor(int x
 }
 
 /*
- * When executing XSAVEOPT (or other optimized XSAVE instructions), if
- * a processor implementation detects that an FPU state component is still
- * (or is again) in its initialized state, it may clear the corresponding
- * bit in the header.xfeatures field, and can skip the writeout of registers
- * to the corresponding memory layout.
- *
- * This means that when the bit is zero, the state component might still contain
- * some previous - non-initialized register state.
- *
- * Before writing xstate information to user-space we sanitize those components,
- * to always ensure that the memory layout of a feature will be in the init state
- * if the corresponding header bit is zero. This is to ensure that user-space doesn't
- * see some stale state in the memory layout during signal handling, debugging etc.
- */
-void fpstate_sanitize_xstate(struct fpu *fpu)
-{
-	struct fxregs_state *fx = &fpu->state.fxsave;
-	int feature_bit;
-	u64 xfeatures;
-
-	if (!use_xsaveopt())
-		return;
-
-	xfeatures = fpu->state.xsave.header.xfeatures;
-
-	/*
-	 * None of the feature bits are in init state. So nothing else
-	 * to do for us, as the memory layout is up to date.
-	 */
-	if ((xfeatures & xfeatures_mask_all) == xfeatures_mask_all)
-		return;
-
-	/*
-	 * FP is in init state
-	 */
-	if (!(xfeatures & XFEATURE_MASK_FP)) {
-		fx->cwd = 0x37f;
-		fx->swd = 0;
-		fx->twd = 0;
-		fx->fop = 0;
-		fx->rip = 0;
-		fx->rdp = 0;
-		memset(fx->st_space, 0, sizeof(fx->st_space));
-	}
-
-	/*
-	 * SSE is in init state
-	 */
-	if (!(xfeatures & XFEATURE_MASK_SSE))
-		memset(fx->xmm_space, 0, sizeof(fx->xmm_space));
-
-	/*
-	 * First two features are FPU and SSE, which above we handled
-	 * in a special way already:
-	 */
-	feature_bit = 0x2;
-	xfeatures = (xfeatures_mask_user() & ~xfeatures) >> 2;
-
-	/*
-	 * Update all the remaining memory layouts according to their
-	 * standard xstate layout, if their header bit is in the init
-	 * state:
-	 */
-	while (xfeatures) {
-		if (xfeatures & 0x1) {
-			int offset = xstate_comp_offsets[feature_bit];
-			int size = xstate_sizes[feature_bit];
-
-			memcpy((void *)fx + offset,
-			       (void *)&init_fpstate.xsave + offset,
-			       size);
-		}
-
-		xfeatures >>= 1;
-		feature_bit++;
-	}
-}
-
-/*
  * Enable the extended processor state save/restore feature.
  * Called once per CPU onlining.
  */



* [patch V2 18/52] x86/fpu: Get rid of using_compacted_format()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (16 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 17/52] x86/fpu: Remove fpstate_sanitize_xstate() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-17 11:59   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 19/52] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
                   ` (35 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

This function is pointlessly global and a complete misnomer because its
usage is related to both supervisor state checks and compacted format
checks. Remove it and just make the conditions check the XSAVES feature.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch.
---
 arch/x86/include/asm/fpu/xstate.h |    1 -
 arch/x86/kernel/fpu/xstate.c      |   22 ++++------------------
 2 files changed, 4 insertions(+), 19 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -101,7 +101,6 @@ extern void __init update_regset_xstate_
 					     u64 xstate_mask);
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
-int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -458,20 +458,6 @@ int xfeature_size(int xfeature_nr)
 	return eax;
 }
 
-/*
- * 'XSAVES' implies two different things:
- * 1. saving of supervisor/system state
- * 2. using the compacted format
- *
- * Use this function when dealing with the compacted format so
- * that it is obvious which aspect of 'XSAVES' is being handled
- * by the calling code.
- */
-int using_compacted_format(void)
-{
-	return boot_cpu_has(X86_FEATURE_XSAVES);
-}
-
 /* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
 static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
@@ -590,9 +576,9 @@ static void do_extra_xstate_size_checks(
 		check_xstate_against_struct(i);
 		/*
 		 * Supervisor state components can be managed only by
-		 * XSAVES, which is compacted-format only.
+		 * XSAVES.
 		 */
-		if (!using_compacted_format())
+		if (!static_cpu_has(X86_FEATURE_XSAVES))
 			XSTATE_WARN_ON(xfeature_is_supervisor(i));
 
 		/* Align from the end of the previous feature */
@@ -602,9 +588,9 @@ static void do_extra_xstate_size_checks(
 		 * The offset of a given state in the non-compacted
 		 * format is given to us in a CPUID leaf.  We check
 		 * them for being ordered (increasing offsets) in
-		 * setup_xstate_features().
+		 * setup_xstate_features(). XSAVES uses compacted format.
 		 */
-		if (!using_compacted_format())
+		if (!static_cpu_has(X86_FEATURE_XSAVES))
 			paranoid_xstate_size = xfeature_uncompacted_offset(i);
 		/*
 		 * The compacted-format offset always depends on where
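
For readers unfamiliar with the two layouts that the comments above refer
to, here is a minimal user-space sketch of how a compacted-format offset
can be derived. It is not kernel code: struct demo_xfeature and both
helpers are hypothetical stand-ins for the per-feature data the kernel
reads from CPUID, and the sizes in the usage note are only illustrative.

```c
#include <stdint.h>

#define DEMO_LEGACY_AREA_SIZE	512u	/* FXSAVE-compatible legacy region */
#define DEMO_XSAVE_HDR_SIZE	64u	/* XSAVE header following it */
#define DEMO_FIRST_EXT_FEATURE	2	/* features 0/1 (x87, SSE) live in the legacy region */

/* Hypothetical per-feature description (illustration only). */
struct demo_xfeature {
	uint32_t size;		/* bytes occupied by the component */
	uint32_t std_offset;	/* fixed offset in the standard format */
	int	 align64;	/* wants 64-byte alignment when compacted */
};

/* Standard format: every component sits at a fixed, enumerated offset. */
static uint32_t demo_standard_offset(const struct demo_xfeature *f, int nr)
{
	return f[nr].std_offset;
}

/*
 * Compacted format: enabled components are packed back to back behind
 * the header, so the offset of 'nr' depends on which lower-numbered
 * features are enabled. Returns 0 if 'nr' itself is not enabled.
 * Assumes DEMO_FIRST_EXT_FEATURE <= nr < 64.
 */
static uint32_t demo_compacted_offset(const struct demo_xfeature *f,
				      uint64_t enabled, int nr)
{
	uint32_t offset = DEMO_LEGACY_AREA_SIZE + DEMO_XSAVE_HDR_SIZE;
	int i;

	for (i = DEMO_FIRST_EXT_FEATURE; i <= nr; i++) {
		if (!(enabled & (1ULL << i)))
			continue;
		/* Align from the end of the previous feature */
		if (f[i].align64)
			offset = (offset + 63u) & ~63u;
		if (i == nr)
			return offset;
		offset += f[i].size;
	}
	return 0;
}
```

With a 256-byte feature 2 and an 8-byte feature 9 both enabled, feature 9
lands at 576 + 256 = 832 in the compacted layout, while its standard
offset stays wherever the enumeration put it.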



* [patch V2 19/52] x86/kvm: Avoid looking up PKRU in XSAVE buffer
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (17 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 18/52] x86/fpu: Get rid of using_compacted_format() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-17 12:09   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 20/52] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
                   ` (34 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

PKRU is being removed from the kernel XSAVE/FPU buffers.  This removal
will probably include warnings for code that looks up PKRU in those
buffers.

KVM currently looks up the location of PKRU but doesn't even use the
pointer that it gets back.  Rework the code to avoid calling
get_xsave_addr() except in cases where its result is actually used.

This makes the code clearer and also avoids the inevitable PKRU
warnings.

This is probably a good cleanup and could go upstream independently
of any PKRU rework.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kvm/x86.c |   41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4589,20 +4589,21 @@ static void fill_xsave(u8 *dest, struct
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *src = get_xsave_addr(xsave, xfeature_nr);
+		void *src;
 
-		if (src) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(dest + offset, &vcpu->arch.pkru,
-				       sizeof(vcpu->arch.pkru));
-			else
-				memcpy(dest + offset, src, size);
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
 
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(dest + offset, &vcpu->arch.pkru,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			src = get_xsave_addr(xsave, xfeature_nr);
+			if (src)
+				memcpy(dest + offset, src, size);
 		}
 
 		valid -= xfeature_mask;
@@ -4632,18 +4633,20 @@ static void load_xsave(struct kvm_vcpu *
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *dest = get_xsave_addr(xsave, xfeature_nr);
 
-		if (dest) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(&vcpu->arch.pkru, src + offset,
-				       sizeof(vcpu->arch.pkru));
-			else
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
+
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(&vcpu->arch.pkru, src + offset,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			void *dest = get_xsave_addr(xsave, xfeature_nr);
+
+			if (dest)
 				memcpy(dest, src + offset, size);
 		}
 



* [patch V2 20/52] x86/fpu: Cleanup arch_set_user_pkey_access()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (18 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 19/52] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-17 12:22   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 21/52] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
                   ` (33 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The function has a sanity check with a WARN_ON_ONCE() but happily
proceeds when the pkey argument is out of range.

Clean it up: return -EINVAL when the check triggers and compute the
shift only after the check.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/xstate.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -887,11 +887,10 @@ EXPORT_SYMBOL_GPL(get_xsave_addr);
  * rights for @pkey to @init_val.
  */
 int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
-		unsigned long init_val)
+			      unsigned long init_val)
 {
-	u32 old_pkru;
-	int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
-	u32 new_pkru_bits = 0;
+	u32 old_pkru, new_pkru_bits = 0;
+	int pkey_shift;
 
 	/*
 	 * This check implies XSAVE support.  OSPKE only gets
@@ -905,7 +904,8 @@ int arch_set_user_pkey_access(struct tas
 	 * values originating from in-kernel users.  Complain
 	 * if a bad value is observed.
 	 */
-	WARN_ON_ONCE(pkey >= arch_max_pkey());
+	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
+		return -EINVAL;
 
 	/* Set the bits we need in PKRU:  */
 	if (init_val & PKEY_DISABLE_ACCESS)
@@ -914,6 +914,7 @@ int arch_set_user_pkey_access(struct tas
 		new_pkru_bits |= PKRU_WD_BIT;
 
 	/* Shift the bits in to the correct place in PKRU for pkey: */
+	pkey_shift = pkey * PKRU_BITS_PER_PKEY;
 	new_pkru_bits <<= pkey_shift;
 
 	/* Get old PKRU and mask off any old bits in place: */
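
As a user-space illustration of the check-then-shift ordering, the
following sketch computes a new PKRU value for one key. The DEMO_*
constants mirror the usual two-bits-per-key layout with 16 keys, but the
function name, its signature and the additional "pkey < 0" test are
assumptions made for this example and are not part of the patch.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_PKRU_AD_BIT		0x1u	/* access disable */
#define DEMO_PKRU_WD_BIT		0x2u	/* write disable */
#define DEMO_PKRU_BITS_PER_PKEY		2
#define DEMO_MAX_PKEY			16
#define DEMO_PKEY_DISABLE_ACCESS	0x1ul
#define DEMO_PKEY_DISABLE_WRITE		0x2ul

/* Hypothetical helper: derive the new PKRU value for @pkey. */
static int demo_pkru_update(uint32_t old_pkru, int pkey,
			    unsigned long init_val, uint32_t *new_pkru)
{
	uint32_t new_bits = 0;
	int shift;

	/* Reject bad keys before they are turned into a shift count */
	if (pkey < 0 || pkey >= DEMO_MAX_PKEY)
		return -EINVAL;

	/* Set the bits we need in PKRU */
	if (init_val & DEMO_PKEY_DISABLE_ACCESS)
		new_bits |= DEMO_PKRU_AD_BIT;
	if (init_val & DEMO_PKEY_DISABLE_WRITE)
		new_bits |= DEMO_PKRU_WD_BIT;

	/* The shift is now at most 30, so it is well defined for 32 bits */
	shift = pkey * DEMO_PKRU_BITS_PER_PKEY;
	new_bits <<= shift;

	/* Mask off the old bits for this key and merge in the new ones */
	old_pkru &= ~((DEMO_PKRU_AD_BIT | DEMO_PKRU_WD_BIT) << shift);
	*new_pkru = old_pkru | new_bits;
	return 0;
}
```

Without the early return, a key of 16 or more would yield a shift count of
32 or more, which is undefined for a 32-bit operand in C.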



* [patch V2 21/52] x86/fpu: Get rid of copy_supervisor_to_kernel()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (19 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 20/52] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-17 12:41   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 22/52] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
                   ` (32 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

If the fast path of restoring the FPU state on sigreturn fails or is not
taken and the current task's FPU is active, then the FPU has to be
deactivated for the slow path to allow a safe update of the task's FPU
memory state.

With supervisor states enabled, this requires saving the supervisor state
in the memory state first. Supervisor states require XSAVES, so saving only
the supervisor state requires reshuffling the memory buffer because XSAVES
uses the compacted format and therefore stores the supervisor states at the
beginning of the memory state. That's just an overengineered optimization.

Get rid of it and save the full state for this case.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/fpu/xstate.h |    1 
 arch/x86/kernel/fpu/signal.c      |   13 +++++---
 arch/x86/kernel/fpu/xstate.c      |   55 --------------------------------------
 3 files changed, 8 insertions(+), 61 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -104,7 +104,6 @@ void *get_xsave_addr(struct xregs_state
 int xfeature_size(int xfeature_nr);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_supervisor_to_kernel(struct xregs_state *xsave);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -411,15 +411,18 @@ static int __fpu__restore_sig(void __use
 	 * the optimisation).
 	 */
 	fpregs_lock();
-
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-
 		/*
-		 * Supervisor states are not modified by user space input.  Save
-		 * current supervisor states first and invalidate the FPU regs.
+		 * If supervisor states are available then save the
+		 * hardware state in current's fpstate so that the
+		 * supervisor state is preserved. Save the full state for
+		 * simplicity. There is no point in optimizing this by only
+		 * saving the supervisor states and then shuffle them to
+		 * the right place in memory. This is the slow path and the
+		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
-			copy_supervisor_to_kernel(&fpu->state.xsave);
+			copy_xregs_to_kernel(&fpu->state.xsave);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1185,61 +1185,6 @@ int copy_user_to_xstate(struct xregs_sta
 	return 0;
 }
 
-/*
- * Save only supervisor states to the kernel buffer.  This blows away all
- * old states, and is intended to be used only in __fpu__restore_sig(), where
- * user states are restored from the user buffer.
- */
-void copy_supervisor_to_kernel(struct xregs_state *xstate)
-{
-	struct xstate_header *header;
-	u64 max_bit, min_bit;
-	u32 lmask, hmask;
-	int err, i;
-
-	if (WARN_ON(!boot_cpu_has(X86_FEATURE_XSAVES)))
-		return;
-
-	if (!xfeatures_mask_supervisor())
-		return;
-
-	max_bit = __fls(xfeatures_mask_supervisor());
-	min_bit = __ffs(xfeatures_mask_supervisor());
-
-	lmask = xfeatures_mask_supervisor();
-	hmask = xfeatures_mask_supervisor() >> 32;
-	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-
-	/* We should never fault when copying to a kernel buffer: */
-	if (WARN_ON_FPU(err))
-		return;
-
-	/*
-	 * At this point, the buffer has only supervisor states and must be
-	 * converted back to normal kernel format.
-	 */
-	header = &xstate->header;
-	header->xcomp_bv |= xfeatures_mask_all;
-
-	/*
-	 * This only moves states up in the buffer.  Start with
-	 * the last state and move backwards so that states are
-	 * not overwritten until after they are moved.  Note:
-	 * memmove() allows overlapping src/dst buffers.
-	 */
-	for (i = max_bit; i >= min_bit; i--) {
-		u8 *xbuf = (u8 *)xstate;
-
-		if (!((header->xfeatures >> i) & 1))
-			continue;
-
-		/* Move xfeature 'i' into its normal location */
-		memmove(xbuf + xstate_comp_offsets[i],
-			xbuf + xstate_supervisor_only_offsets[i],
-			xstate_sizes[i]);
-	}
-}
-
 /**
  * copy_dynamic_supervisor_to_kernel() - Save dynamic supervisor states to
  *                                       an xsave area
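
To make concrete what kind of reshuffling the removed helper had to do,
here is a small user-space sketch of moving packed components to their
final offsets inside one buffer. The function name, its parameters and
the offset tables are hypothetical; the sketch only demonstrates why the
loop walks from the highest feature down when components move up in the
buffer.

```c
#include <stdint.h>
#include <string.h>

/*
 * Move every present component from its packed offset to its final
 * offset within the same buffer. Components only move towards higher
 * addresses, so the highest one is moved first to keep lower ones from
 * overwriting data that has not been moved yet. memmove() is used
 * because source and destination of a single component may overlap.
 */
static void demo_unpack(uint8_t *buf, uint64_t present,
			const uint32_t *packed_off,
			const uint32_t *final_off,
			const uint32_t *size,
			int min_bit, int max_bit)
{
	int i;

	for (i = max_bit; i >= min_bit; i--) {
		if (!((present >> i) & 1))
			continue;

		memmove(buf + final_off[i], buf + packed_off[i], size[i]);
	}
}
```

Saving the full state with a single XSAVES avoids this bookkeeping, which
is the simplification the patch opts for.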



* [patch V2 22/52] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (20 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 21/52] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-17 12:48   ` Borislav Petkov
  2021-06-14 15:44 ` [patch V2 23/52] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
                   ` (31 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The functions for xsave[s]/xrstor[s] operations are horribly named and
a permanent source of confusion.

Rename:
	copy_xregs_to_kernel() to xsave_to_kernel()
	copy_kernel_to_xregs() to xrstor_from_kernel()

so it's entirely clear what this is about.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   10 +++++-----
 arch/x86/kernel/fpu/core.c          |    6 +++---
 arch/x86/kernel/fpu/signal.c        |   14 +++++++-------
 arch/x86/kernel/fpu/xstate.c        |    2 +-
 4 files changed, 16 insertions(+), 16 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -264,7 +264,7 @@ static inline void fxsave_to_kernel(stru
  * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
  */
-static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
+static inline void xrstor_from_kernel_booting(struct xregs_state *xstate)
 {
 	u64 mask = -1;
 	u32 lmask = mask;
@@ -288,7 +288,7 @@ static inline void copy_kernel_to_xregs_
 /*
  * Save processor xstate to xsave area.
  */
-static inline void copy_xregs_to_kernel(struct xregs_state *xstate)
+static inline void xsave_to_kernel(struct xregs_state *xstate)
 {
 	u64 mask = xfeatures_mask_all;
 	u32 lmask = mask;
@@ -306,7 +306,7 @@ static inline void copy_xregs_to_kernel(
 /*
  * Restore processor xstate from xsave area.
  */
-static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask)
+static inline void xrstor_from_kernel(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
@@ -367,7 +367,7 @@ static inline int copy_user_to_xregs(str
  * Restore xstate from kernel space xsave area, return an error code instead of
  * an exception.
  */
-static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
+static inline int xrstor_from_kernel_err(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
@@ -386,7 +386,7 @@ extern int copy_fpregs_to_fpstate(struct
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
-		copy_kernel_to_xregs(&fpstate->xsave, mask);
+		xrstor_from_kernel(&fpstate->xsave, mask);
 	} else {
 		if (use_fxsr())
 			copy_kernel_to_fxregs(&fpstate->fxsave);
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -95,7 +95,7 @@ EXPORT_SYMBOL(irq_fpu_usable);
 int copy_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
-		copy_xregs_to_kernel(&fpu->state.xsave);
+		xsave_to_kernel(&fpu->state.xsave);
 
 		/*
 		 * AVX512 state is tracked here because its use is
@@ -358,7 +358,7 @@ void fpu__drop(struct fpu *fpu)
 static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
 {
 	if (use_xsave())
-		copy_kernel_to_xregs(&init_fpstate.xsave, features_mask);
+		xrstor_from_kernel(&init_fpstate.xsave, features_mask);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
 		copy_kernel_to_fxregs(&init_fpstate.fxsave);
 	else
@@ -389,7 +389,7 @@ static void fpu__clear(struct fpu *fpu,
 	if (user_only) {
 		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
 		    xfeatures_mask_supervisor())
-			copy_kernel_to_xregs(&fpu->state.xsave,
+			xrstor_from_kernel(&fpu->state.xsave,
 					     xfeatures_mask_supervisor());
 		copy_init_fpstate_to_fpregs(xfeatures_mask_user());
 	} else {
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -271,14 +271,14 @@ static int copy_user_to_fpregs_zeroing(v
 
 			r = copy_user_to_fxregs(buf);
 			if (!r)
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
 			r = copy_user_to_xregs(buf, xbv);
 			if (!r && unlikely(init_bv))
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;
 		}
 	} else if (use_fxsr()) {
@@ -367,7 +367,7 @@ static int __fpu__restore_sig(void __use
 			 */
 			if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
 			    xfeatures_mask_supervisor())
-				copy_kernel_to_xregs(&fpu->state.xsave,
+				xrstor_from_kernel(&fpu->state.xsave,
 						     xfeatures_mask_supervisor());
 			fpregs_mark_activate();
 			fpregs_unlock();
@@ -422,7 +422,7 @@ static int __fpu__restore_sig(void __use
 		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
-			copy_xregs_to_kernel(&fpu->state.xsave);
+			xsave_to_kernel(&fpu->state.xsave);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
@@ -440,13 +440,13 @@ static int __fpu__restore_sig(void __use
 
 		fpregs_lock();
 		if (unlikely(init_bv))
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 
 		/*
 		 * Restore previously saved supervisor xstates along with
 		 * copied-in user xstates.
 		 */
-		ret = copy_kernel_to_xregs_err(&fpu->state.xsave,
+		ret = xrstor_from_kernel_err(&fpu->state.xsave,
 					       user_xfeatures | xfeatures_mask_supervisor());
 
 	} else if (use_fxsr()) {
@@ -464,7 +464,7 @@ static int __fpu__restore_sig(void __use
 			u64 init_bv;
 
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 		}
 
 		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -411,7 +411,7 @@ static void __init setup_init_fpu_buf(vo
 	/*
 	 * Init all the features state with header.xfeatures being 0x0
 	 */
-	copy_kernel_to_xregs_booting(&init_fpstate.xsave);
+	xrstor_from_kernel_booting(&init_fpstate.xsave);
 
 	/*
 	 * All components are now in init state. Read the state back so



* [patch V2 23/52] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (21 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 22/52] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 24/52] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
                   ` (30 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The functions for xsave[s]/xrstor[s] operations are horribly named and
a permanent source of confusion.

Rename:
	copy_xregs_to_user() to xsave_to_user_sigframe()
	copy_user_to_xregs() to xrstor_from_user_sigframe()

so it's entirely clear what this is about. This is also a clear indicator
of the potentially different storage format because this is user ABI and
cannot use compacted format.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/kernel/fpu/signal.c        |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -324,7 +324,7 @@ static inline void xrstor_from_kernel(st
  * backward compatibility for old applications which don't understand
  * compacted format of xsave area.
  */
-static inline int copy_xregs_to_user(struct xregs_state __user *buf)
+static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
 	u64 mask = xfeatures_mask_user();
 	u32 lmask = mask;
@@ -349,7 +349,7 @@ static inline int copy_xregs_to_user(str
 /*
  * Restore xstate from user space xsave area.
  */
-static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
+static inline int xrstor_from_user_sigframe(struct xregs_state __user *buf, u64 mask)
 {
 	struct xregs_state *xstate = ((__force struct xregs_state *)buf);
 	u32 lmask = mask;
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -129,7 +129,7 @@ static inline int copy_fpregs_to_sigfram
 	int err;
 
 	if (use_xsave())
-		err = copy_xregs_to_user(buf);
+		err = xsave_to_user_sigframe(buf);
 	else if (use_fxsr())
 		err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
 	else
@@ -276,7 +276,7 @@ static int copy_user_to_fpregs_zeroing(v
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
-			r = copy_user_to_xregs(buf, xbv);
+			r = xrstor_from_user_sigframe(buf, xbv);
 			if (!r && unlikely(init_bv))
 				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;



* [patch V2 24/52] x86/fpu: Rename fxregs related copy functions
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (22 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 23/52] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 25/52] x86/fpu: Rename fregs " Thomas Gleixner
                   ` (29 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The functions for fxsave/fxrstor operations are horribly named and
a permanent source of confusion.

Rename:
	copy_fxregs_to_kernel() to fxsave_to_kernel()
	copy_kernel_to_fxregs() to fxrstor_from_kernel()
	copy_fxregs_to_user() to fxsave_to_user_sigframe()
	copy_user_to_fxregs() to fxrstor_from_user_sigframe()

so it's entirely clear what this is about.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   18 +++++-------------
 arch/x86/kernel/fpu/core.c          |    4 ++--
 arch/x86/kernel/fpu/signal.c        |   10 +++++-----
 3 files changed, 12 insertions(+), 20 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -132,7 +132,7 @@ static inline int copy_fregs_to_user(str
 	return user_insn(fnsave %[fx]; fwait,  [fx] "=m" (*fx), "m" (*fx));
 }
 
-static inline int copy_fxregs_to_user(struct fxregs_state __user *fx)
+static inline int fxsave_to_user_sigframe(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return user_insn(fxsave %[fx], [fx] "=m" (*fx), "m" (*fx));
@@ -141,7 +141,7 @@ static inline int copy_fxregs_to_user(st
 
 }
 
-static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
+static inline void fxrstor_from_kernel(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		kernel_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -149,7 +149,7 @@ static inline void copy_kernel_to_fxregs
 		kernel_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_kernel_to_fxregs_err(struct fxregs_state *fx)
+static inline int fxrstor_from_kernel_err(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return kernel_insn_err(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -157,7 +157,7 @@ static inline int copy_kernel_to_fxregs_
 		return kernel_insn_err(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)
+static inline int fxrstor_from_user_sigframe(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return user_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -180,14 +180,6 @@ static inline int copy_user_to_fregs(str
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline void copy_fxregs_to_kernel(struct fpu *fpu)
-{
-	if (IS_ENABLED(CONFIG_X86_32))
-		asm volatile( "fxsave %[fx]" : [fx] "=m" (fpu->state.fxsave));
-	else
-		asm volatile("fxsaveq %[fx]" : [fx] "=m" (fpu->state.fxsave));
-}
-
 static inline void fxsave_to_kernel(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
@@ -389,7 +381,7 @@ static inline void __copy_kernel_to_fpre
 		xrstor_from_kernel(&fpstate->xsave, mask);
 	} else {
 		if (use_fxsr())
-			copy_kernel_to_fxregs(&fpstate->fxsave);
+			fxrstor_from_kernel(&fpstate->fxsave);
 		else
 			copy_kernel_to_fregs(&fpstate->fsave);
 	}
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -107,7 +107,7 @@ int copy_fpregs_to_fpstate(struct fpu *f
 	}
 
 	if (likely(use_fxsr())) {
-		copy_fxregs_to_kernel(fpu);
+		fxsave_to_kernel(&fpu->state.fxsave);
 		return 1;
 	}
 
@@ -360,7 +360,7 @@ static inline void copy_init_fpstate_to_
 	if (use_xsave())
 		xrstor_from_kernel(&init_fpstate.xsave, features_mask);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
-		copy_kernel_to_fxregs(&init_fpstate.fxsave);
+		fxrstor_from_kernel(&init_fpstate.fxsave);
 	else
 		copy_kernel_to_fregs(&init_fpstate.fsave);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -64,7 +64,7 @@ static inline int save_fsave_header(stru
 
 		fpregs_lock();
 		if (!test_thread_flag(TIF_NEED_FPU_LOAD))
-			copy_fxregs_to_kernel(&tsk->thread.fpu);
+			fxsave_to_kernel(&tsk->thread.fpu.state.fxsave);
 		fpregs_unlock();
 
 		convert_from_fxsr(&env, tsk);
@@ -131,7 +131,7 @@ static inline int copy_fpregs_to_sigfram
 	if (use_xsave())
 		err = xsave_to_user_sigframe(buf);
 	else if (use_fxsr())
-		err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
+		err = fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
 	else
 		err = copy_fregs_to_user((struct fregs_state __user *) buf);
 
@@ -269,7 +269,7 @@ static int copy_user_to_fpregs_zeroing(v
 		if (fx_only) {
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
 
-			r = copy_user_to_fxregs(buf);
+			r = fxrstor_from_user_sigframe(buf);
 			if (!r)
 				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;
@@ -282,7 +282,7 @@ static int copy_user_to_fpregs_zeroing(v
 			return r;
 		}
 	} else if (use_fxsr()) {
-		return copy_user_to_fxregs(buf);
+		return fxrstor_from_user_sigframe(buf);
 	} else
 		return copy_user_to_fregs(buf);
 }
@@ -467,7 +467,7 @@ static int __fpu__restore_sig(void __use
 			xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 		}
 
-		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
+		ret = fxrstor_from_kernel_err(&fpu->state.fxsave);
 	} else {
 		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)



* [patch V2 25/52] x86/fpu: Rename fregs related copy functions
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (23 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 24/52] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 26/52] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
                   ` (28 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The functions for fnsave/frstor operations are horribly named and
a permanent source of confusion.

Rename:
	copy_fregs_to_kernel() to fnsave_to_kernel()
	copy_kernel_to_fregs() to frstor_from_kernel()
	copy_fregs_to_user()   to fnsave_to_user_sigframe()
	copy_user_to_fregs()   to frstor_from_user_sigframe()

so it's entirely clear what this is about.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   10 +++++-----
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |    6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -127,7 +127,7 @@ static inline void fpstate_init_soft(str
 		     _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_fprestore)	\
 		     : output : input)
 
-static inline int copy_fregs_to_user(struct fregs_state __user *fx)
+static inline int fnsave_to_user_sigframe(struct fregs_state __user *fx)
 {
 	return user_insn(fnsave %[fx]; fwait,  [fx] "=m" (*fx), "m" (*fx));
 }
@@ -165,17 +165,17 @@ static inline int fxrstor_from_user_sigf
 		return user_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline void copy_kernel_to_fregs(struct fregs_state *fx)
+static inline void frstor_from_kernel(struct fregs_state *fx)
 {
 	kernel_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_kernel_to_fregs_err(struct fregs_state *fx)
+static inline int frstor_from_kernel_err(struct fregs_state *fx)
 {
 	return kernel_insn_err(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_user_to_fregs(struct fregs_state __user *fx)
+static inline int frstor_from_user_sigframe(struct fregs_state __user *fx)
 {
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
@@ -383,7 +383,7 @@ static inline void __copy_kernel_to_fpre
 		if (use_fxsr())
 			fxrstor_from_kernel(&fpstate->fxsave);
 		else
-			copy_kernel_to_fregs(&fpstate->fsave);
+			frstor_from_kernel(&fpstate->fsave);
 	}
 }
 
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -362,7 +362,7 @@ static inline void copy_init_fpstate_to_
 	else if (static_cpu_has(X86_FEATURE_FXSR))
 		fxrstor_from_kernel(&init_fpstate.fxsave);
 	else
-		copy_kernel_to_fregs(&init_fpstate.fsave);
+		frstor_from_kernel(&init_fpstate.fsave);
 
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -133,7 +133,7 @@ static inline int copy_fpregs_to_sigfram
 	else if (use_fxsr())
 		err = fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
 	else
-		err = copy_fregs_to_user((struct fregs_state __user *) buf);
+		err = fnsave_to_user_sigframe((struct fregs_state __user *) buf);
 
 	if (unlikely(err) && __clear_user(buf, fpu_user_xstate_size))
 		err = -EFAULT;
@@ -284,7 +284,7 @@ static int copy_user_to_fpregs_zeroing(v
 	} else if (use_fxsr()) {
 		return fxrstor_from_user_sigframe(buf);
 	} else
-		return copy_user_to_fregs(buf);
+		return frstor_from_user_sigframe(buf);
 }
 
 static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
@@ -474,7 +474,7 @@ static int __fpu__restore_sig(void __use
 			goto out;
 
 		fpregs_lock();
-		ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
+		ret = frstor_from_kernel_err(&fpu->state.fsave);
 	}
 	if (!ret)
 		fpregs_mark_activate();


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 26/52] x86/fpu: Rename xstate copy functions which are related to UABI
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (24 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 25/52] x86/fpu: Rename fregs " Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 27/52] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
                   ` (27 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Rename them to reflect that these functions deal with user space format
XSAVE buffers.

      copy_xstate_to_kernel() -> copy_uabi_xstate_to_membuf()
      copy_kernel_to_xstate() -> copy_uabi_from_kernel_to_xstate()
      copy_user_to_xstate()   -> copy_sigframe_from_user_to_xstate()

Again a clear statement that these functions deal with user space ABI.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/xstate.h |    4 ++--
 arch/x86/kernel/fpu/regset.c      |    2 +-
 arch/x86/kernel/fpu/signal.c      |    2 +-
 arch/x86/kernel/fpu/xstate.c      |    5 +++--
 4 files changed, 7 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -102,8 +102,8 @@ extern void __init update_regset_xstate_
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
+int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -132,7 +132,7 @@ int xstateregs_set(struct task_struct *t
 	}
 
 	fpu__prepare_write(fpu);
-	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
+	ret = copy_uabi_from_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
 out:
 	vfree(tmpbuf);
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -431,7 +431,7 @@ static int __fpu__restore_sig(void __use
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
 
-		ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
+		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
 			goto out;
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1080,7 +1080,7 @@ void copy_uabi_xstate_to_membuf(struct m
  * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
  */
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 {
 	unsigned int offset, size;
 	int i;
@@ -1140,7 +1140,8 @@ int copy_kernel_to_xstate(struct xregs_s
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
  */
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
+int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
+				      const void __user *ubuf)
 {
 	unsigned int offset, size;
 	int i;


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 27/52] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (25 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 26/52] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 28/52] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
                   ` (26 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

copy_sigframe_from_user_to_xstate() and copy_uabi_from_kernel_to_xstate()
are almost identical except for the copy function.

Unify them.
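
As a rough illustration of the unification pattern, here is a minimal
userspace sketch. It is not the kernel code: mock_copy_from_user() is a
hypothetical stand-in for copy_from_user(), and the toy_*() wrappers are
made up for this example. The idea is that a single worker takes both a
kernel pointer and a user pointer, and the non-NULL one selects the copy
primitive.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_from_user(): returns the number of
 * bytes NOT copied, so 0 means success. A NULL source models a
 * faulting user pointer.
 */
static unsigned long mock_copy_from_user(void *dst, const void *src, size_t n)
{
	if (!src)
		return n;
	memcpy(dst, src, n);
	return 0;
}

/* One worker for both sources: a non-NULL kbuf selects plain memcpy(). */
static int copy_from_buffer(void *dst, unsigned int offset, unsigned int size,
			    const void *kbuf, const void *ubuf)
{
	if (kbuf) {
		memcpy(dst, (const char *)kbuf + offset, size);
	} else {
		const char *usrc = ubuf ? (const char *)ubuf + offset : NULL;

		if (mock_copy_from_user(dst, usrc, size))
			return -EFAULT;
	}
	return 0;
}

/* Thin wrappers: each former duplicate collapses into a one-liner. */
static int toy_copy_from_kernel(void *dst, unsigned int size, const void *kbuf)
{
	return copy_from_buffer(dst, 0, size, kbuf, NULL);
}

static int toy_copy_from_user(void *dst, unsigned int size, const void *ubuf)
{
	return copy_from_buffer(dst, 0, size, NULL, ubuf);
}
```

The cost of this design is one extra branch per copy; the benefit is that
header validation and the per-feature loop exist only once.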

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/fpu/xstate.c |   84 +++++++++++++++----------------------------
 1 file changed, 30 insertions(+), 54 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1076,20 +1076,30 @@ void copy_uabi_xstate_to_membuf(struct m
 		membuf_zero(&to, to.left);
 }
 
-/*
- * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
- * and copy to the target thread. This is called from xstateregs_set().
- */
-int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+static int copy_from_buffer(void *dst, unsigned int offset, unsigned int size,
+			    const void *kbuf, const void __user *ubuf)
+{
+	if (kbuf) {
+		memcpy(dst, kbuf + offset, size);
+	} else {
+		if (copy_from_user(dst, ubuf + offset, size))
+			return -EFAULT;
+	}
+	return 0;
+}
+
+static int copy_uabi_to_xstate(struct xregs_state *xsave, const void *kbuf,
+			       const void __user *ubuf)
 {
 	unsigned int offset, size;
-	int i;
 	struct xstate_header hdr;
+	int i;
 
 	offset = offsetof(struct xregs_state, header);
 	size = sizeof(hdr);
 
-	memcpy(&hdr, kbuf + offset, size);
+	if (copy_from_buffer(&hdr, offset, size, kbuf, ubuf))
+		return -EFAULT;
 
 	if (validate_user_xstate_header(&hdr))
 		return -EINVAL;
@@ -1111,7 +1121,8 @@ int copy_uabi_from_kernel_to_xstate(stru
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
 
-			memcpy(dst, kbuf + offset, size);
+			if (copy_from_buffer(dst, offset, size, kbuf, ubuf))
+				return -EFAULT;
 		}
 	}
 
@@ -1136,6 +1147,16 @@ int copy_uabi_from_kernel_to_xstate(stru
 }
 
 /*
+ * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S]
+ * format and copy to the target thread. This is called from
+ * xstateregs_set().
+ */
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+{
+	return copy_uabi_to_xstate(xsave, kbuf, NULL);
+}
+
+/*
  * Convert from a sigreturn standard-format user-space buffer to kernel
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
@@ -1143,52 +1164,7 @@ int copy_uabi_from_kernel_to_xstate(stru
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
 				      const void __user *ubuf)
 {
-	unsigned int offset, size;
-	int i;
-	struct xstate_header hdr;
-
-	offset = offsetof(struct xregs_state, header);
-	size = sizeof(hdr);
-
-	if (copy_from_user(&hdr, ubuf + offset, size))
-		return -EFAULT;
-
-	if (validate_user_xstate_header(&hdr))
-		return -EINVAL;
-
-	for (i = 0; i < XFEATURE_MAX; i++) {
-		u64 mask = ((u64)1 << i);
-
-		if (hdr.xfeatures & mask) {
-			void *dst = __raw_xsave_addr(xsave, i);
-
-			offset = xstate_offsets[i];
-			size = xstate_sizes[i];
-
-			if (copy_from_user(dst, ubuf + offset, size))
-				return -EFAULT;
-		}
-	}
-
-	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
-		offset = offsetof(struct fxregs_state, mxcsr);
-		size = MXCSR_AND_FLAGS_SIZE;
-		if (copy_from_user(&xsave->i387.mxcsr, ubuf + offset, size))
-			return -EFAULT;
-	}
-
-	/*
-	 * The state that came in from userspace was user-state only.
-	 * Mask all the user states out of 'xfeatures':
-	 */
-	xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR_ALL;
-
-	/*
-	 * Add back in the features that came in from userspace:
-	 */
-	xsave->header.xfeatures |= hdr.xfeatures;
-
-	return 0;
+	return copy_uabi_to_xstate(xsave, NULL, ubuf);
 }
 
 /**


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 28/52] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (26 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 27/52] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 29/52] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_kernel() Thomas Gleixner
                   ` (25 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

A copy is guaranteed to leave the source intact, which is not the case when
FNSAVE is used as that reinitializes the registers.

Rename it to save_fpregs_to_fpstate() which does not make such guarantees.
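
To make the distinction concrete, here is a toy model in plain C. It is
only a sketch: struct toy_fpregs and all toy_*() functions are invented
for illustration and do not reflect the real x87 layout or the kernel
implementation. It models a preserving save (as with FXSAVE/XSAVE) next to
an FNSAVE-style save that resets the registers to their FNINIT defaults,
plus the caller pattern visible in the diff below: restore only when the
save reports that the registers were clobbered.

```c
#include <stdbool.h>

/* Toy register file, NOT the real x87 layout. */
struct toy_fpregs {
	unsigned short cwd;	/* control word */
	unsigned short twd;	/* tag word, 0xffff means all registers empty */
	double st0;		/* single stand-in data register */
};

#define TOY_INIT_CWD	0x037f	/* control word after FNINIT */
#define TOY_INIT_TWD	0xffff	/* tag word after FNINIT */

/* Models FXSAVE/XSAVE: registers stay intact. Returns true (still valid). */
static bool toy_save_preserving(struct toy_fpregs *regs, struct toy_fpregs *buf)
{
	*buf = *regs;
	return true;
}

/* Models FNSAVE: stores the state, then reinitializes the registers. */
static bool toy_save_fnsave(struct toy_fpregs *regs, struct toy_fpregs *buf)
{
	*buf = *regs;
	regs->cwd = TOY_INIT_CWD;
	regs->twd = TOY_INIT_TWD;
	regs->st0 = 0.0;
	return false;		/* registers no longer hold the saved state */
}

/* Caller pattern: reload from the buffer only if the save was destructive. */
static void toy_fpu_save(struct toy_fpregs *regs, struct toy_fpregs *buf,
			 bool have_fxsr)
{
	bool intact = have_fxsr ? toy_save_preserving(regs, buf)
				: toy_save_fnsave(regs, buf);

	if (!intact)
		*regs = *buf;
}
```

A function named "copy" would suggest that the second variant cannot
exist; "save" leaves room for both behaviours.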

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/kernel/fpu/core.c          |   10 +++++-----
 arch/x86/kvm/x86.c                  |    2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -373,7 +373,7 @@ static inline int xrstor_from_kernel_err
 	return err;
 }
 
-extern int copy_fpregs_to_fpstate(struct fpu *fpu);
+extern int save_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
@@ -505,7 +505,7 @@ static inline void __fpregs_load_activat
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!copy_fpregs_to_fpstate(old_fpu))
+		if (!save_fpregs_to_fpstate(old_fpu))
 			old_fpu->last_cpu = -1;
 		else
 			old_fpu->last_cpu = cpu;
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -92,7 +92,7 @@ EXPORT_SYMBOL(irq_fpu_usable);
  * Modern FPU state can be kept in registers, if there are
  * no pending FP exceptions.
  */
-int copy_fpregs_to_fpstate(struct fpu *fpu)
+int save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		xsave_to_kernel(&fpu->state.xsave);
@@ -119,7 +119,7 @@ int copy_fpregs_to_fpstate(struct fpu *f
 
 	return 0;
 }
-EXPORT_SYMBOL(copy_fpregs_to_fpstate);
+EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
@@ -137,7 +137,7 @@ void kernel_fpu_begin_mask(unsigned int
 		 * Ignore return value -- we don't care if reg state
 		 * is clobbered.
 		 */
-		copy_fpregs_to_fpstate(&current->thread.fpu);
+		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
 
@@ -172,7 +172,7 @@ void fpu__save(struct fpu *fpu)
 	trace_x86_fpu_before_save(fpu);
 
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!copy_fpregs_to_fpstate(fpu)) {
+		if (!save_fpregs_to_fpstate(fpu)) {
 			copy_kernel_to_fpregs(&fpu->state);
 		}
 	}
@@ -255,7 +255,7 @@ int fpu__copy(struct task_struct *dst, s
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
-	else if (!copy_fpregs_to_fpstate(dst_fpu))
+	else if (!save_fpregs_to_fpstate(dst_fpu))
 		copy_kernel_to_fpregs(&dst_fpu->state);
 
 	fpregs_unlock();
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9618,7 +9618,7 @@ static void kvm_save_current_fpu(struct
 		memcpy(&fpu->state, &current->thread.fpu.state,
 		       fpu_kernel_xstate_size);
 	else
-		copy_fpregs_to_fpstate(fpu);
+		save_fpregs_to_fpstate(fpu);
 }
 
 /* Swap (qemu) user FPU context for the guest FPU context. */


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 29/52] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_kernel()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (27 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 28/52] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 30/52] x86/fpu: Rename initstate copy functions Thomas Gleixner
                   ` (24 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

This is not a copy functionality. It restores the register state from the
supplied kernel buffer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    8 ++++----
 arch/x86/kernel/fpu/core.c          |    4 ++--
 arch/x86/kvm/x86.c                  |    4 ++--
 arch/x86/mm/extable.c               |    2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -375,7 +375,7 @@ static inline int xrstor_from_kernel_err
 
 extern int save_fpregs_to_fpstate(struct fpu *fpu);
 
-static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
+static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
 		xrstor_from_kernel(&fpstate->xsave, mask);
@@ -387,7 +387,7 @@ static inline void __copy_kernel_to_fpre
 	}
 }
 
-static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate)
+static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
 	/*
 	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
@@ -402,7 +402,7 @@ static inline void copy_kernel_to_fpregs
 			: : [addr] "m" (fpstate));
 	}
 
-	__copy_kernel_to_fpregs(fpstate, -1);
+	__restore_fpregs_from_fpstate(fpstate, -1);
 }
 
 extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
@@ -473,7 +473,7 @@ static inline void __fpregs_load_activat
 		return;
 
 	if (!fpregs_state_valid(fpu, cpu)) {
-		copy_kernel_to_fpregs(&fpu->state);
+		restore_fpregs_from_fpstate(&fpu->state);
 		fpregs_activate(fpu);
 		fpu->last_cpu = cpu;
 	}
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -173,7 +173,7 @@ void fpu__save(struct fpu *fpu)
 
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		if (!save_fpregs_to_fpstate(fpu)) {
-			copy_kernel_to_fpregs(&fpu->state);
+			restore_fpregs_from_fpstate(&fpu->state);
 		}
 	}
 
@@ -256,7 +256,7 @@ int fpu__copy(struct task_struct *dst, s
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
 	else if (!save_fpregs_to_fpstate(dst_fpu))
-		copy_kernel_to_fpregs(&dst_fpu->state);
+		restore_fpregs_from_fpstate(&dst_fpu->state);
 
 	fpregs_unlock();
 
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9634,7 +9634,7 @@ static void kvm_load_guest_fpu(struct kv
 	 */
 	if (vcpu->arch.guest_fpu)
 		/* PKRU is separately restored in kvm_x86_ops.run. */
-		__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu->state,
+		__restore_fpregs_from_fpstate(&vcpu->arch.guest_fpu->state,
 					~XFEATURE_MASK_PKRU);
 
 	fpregs_mark_activate();
@@ -9655,7 +9655,7 @@ static void kvm_put_guest_fpu(struct kvm
 	if (vcpu->arch.guest_fpu)
 		kvm_save_current_fpu(vcpu->arch.guest_fpu);
 
-	copy_kernel_to_fpregs(&vcpu->arch.user_fpu->state);
+	restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state);
 
 	fpregs_mark_activate();
 	fpregs_unlock();
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -65,7 +65,7 @@ EXPORT_SYMBOL_GPL(ex_handler_fault);
 	WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.",
 		  (void *)instruction_pointer(regs));
 
-	__copy_kernel_to_fpregs(&init_fpstate, -1);
+	__restore_fpregs_from_fpstate(&init_fpstate, -1);
 	return true;
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 30/52] x86/fpu: Rename initstate copy functions
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (28 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 29/52] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_kernel() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 31/52] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
                   ` (23 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Again this is not a copy. It's loading register state.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/core.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -355,7 +355,7 @@ void fpu__drop(struct fpu *fpu)
  * Clear FPU registers by setting them up from the init fpstate.
  * Caller must do fpregs_[un]lock() around it.
  */
-static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
+static inline void load_fpregs_from_init_fpstate(u64 features_mask)
 {
 	if (use_xsave())
 		xrstor_from_kernel(&init_fpstate.xsave, features_mask);
@@ -391,9 +391,9 @@ static void fpu__clear(struct fpu *fpu,
 		    xfeatures_mask_supervisor())
 			xrstor_from_kernel(&fpu->state.xsave,
 					     xfeatures_mask_supervisor());
-		copy_init_fpstate_to_fpregs(xfeatures_mask_user());
+		load_fpregs_from_init_fpstate(xfeatures_mask_user());
 	} else {
-		copy_init_fpstate_to_fpregs(xfeatures_mask_all);
+		load_fpregs_from_init_fpstate(xfeatures_mask_all);
 	}
 
 	fpregs_mark_activate();


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 31/52] x86/fpu: Rename "dynamic" XSTATEs to "independent"
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (29 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 30/52] x86/fpu: Rename initstate copy functions Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 32/52] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
                   ` (22 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Andy Lutomirski <luto@kernel.org>

The salient feature of "dynamic" XSTATEs is that they are not part of the
main task XSTATE buffer.  The fact that they are dynamically allocated is
irrelevant and will become quite confusing when user math XSTATEs start
being dynamically allocated.  Rename them to "independent" because they
are independent of the main XSTATE code.

This is just a search-and-replace with some whitespace updates to keep
things aligned.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1eecb0e4f3e07828ebe5d737ec77dc3b708fad2d.1623388344.git.luto@kernel.org
---
 arch/x86/events/intel/lbr.c       |    6 +--
 arch/x86/include/asm/fpu/xstate.h |   14 ++++----
 arch/x86/kernel/fpu/xstate.c      |   62 +++++++++++++++++++-------------------
 3 files changed, 41 insertions(+), 41 deletions(-)

--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -491,7 +491,7 @@ static void intel_pmu_arch_lbr_xrstors(v
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_kernel_to_dynamic_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	copy_kernel_to_independent_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static __always_inline bool lbr_is_reset_in_cstate(void *ctx)
@@ -576,7 +576,7 @@ static void intel_pmu_arch_lbr_xsaves(vo
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_dynamic_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	copy_independent_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static void __intel_pmu_lbr_save(void *ctx)
@@ -992,7 +992,7 @@ static void intel_pmu_arch_lbr_read_xsav
 		intel_pmu_store_lbr(cpuc, NULL);
 		return;
 	}
-	copy_dynamic_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
+	copy_independent_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
 
 	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
 }
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -56,7 +56,7 @@
  * - Don't set the bit corresponding to the dynamic supervisor feature in
  *   IA32_XSS at run time, since it has been set at boot time.
  */
-#define XFEATURE_MASK_DYNAMIC (XFEATURE_MASK_LBR)
+#define XFEATURE_MASK_INDEPENDENT (XFEATURE_MASK_LBR)
 
 /*
  * Unsupported supervisor features. When a supervisor feature in this mask is
@@ -66,7 +66,7 @@
 
 /* All supervisor states including supported and unsupported states. */
 #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
-				      XFEATURE_MASK_DYNAMIC | \
+				      XFEATURE_MASK_INDEPENDENT | \
 				      XFEATURE_MASK_SUPERVISOR_UNSUPPORTED)
 
 #ifdef CONFIG_X86_64
@@ -87,12 +87,12 @@ static inline u64 xfeatures_mask_user(vo
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
 
-static inline u64 xfeatures_mask_dynamic(void)
+static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
-		return XFEATURE_MASK_DYNAMIC & ~XFEATURE_MASK_LBR;
+		return XFEATURE_MASK_INDEPENDENT & ~XFEATURE_MASK_LBR;
 
-	return XFEATURE_MASK_DYNAMIC;
+	return XFEATURE_MASK_INDEPENDENT;
 }
 
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
@@ -104,8 +104,8 @@ void *get_xsave_addr(struct xregs_state
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
-void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
+void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
+void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask);
 
 enum xstate_copy_mode {
 	XSTATE_COPY_FP,
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -162,7 +162,7 @@ void fpu__init_cpu_xstate(void)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
 		wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor() |
-				     xfeatures_mask_dynamic());
+				     xfeatures_mask_independent());
 	}
 }
 
@@ -560,7 +560,7 @@ static void check_xstate_against_struct(
  * how large the XSAVE buffer needs to be.  We are recalculating
  * it to be safe.
  *
- * Dynamic XSAVE features allocate their own buffers and are not
+ * Independent XSAVE features allocate their own buffers and are not
  * covered by these checks. Only the size of the buffer for task->fpu
  * is checked here.
  */
@@ -626,18 +626,18 @@ static unsigned int __init get_xsaves_si
 }
 
 /*
- * Get the total size of the enabled xstates without the dynamic supervisor
+ * Get the total size of the enabled xstates without the independent supervisor
  * features.
  */
-static unsigned int __init get_xsaves_size_no_dynamic(void)
+static unsigned int __init get_xsaves_size_no_independent(void)
 {
-	u64 mask = xfeatures_mask_dynamic();
+	u64 mask = xfeatures_mask_independent();
 	unsigned int size;
 
 	if (!mask)
 		return get_xsaves_size();
 
-	/* Disable dynamic features. */
+	/* Disable independent features. */
 	wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor());
 
 	/*
@@ -646,7 +646,7 @@ static unsigned int __init get_xsaves_si
 	 */
 	size = get_xsaves_size();
 
-	/* Re-enable dynamic features so XSAVES will work on them again. */
+	/* Re-enable independent features so XSAVES will work on them again. */
 	wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor() | mask);
 
 	return size;
@@ -689,7 +689,7 @@ static int __init init_xstate_size(void)
 	xsave_size = get_xsave_size();
 
 	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		possible_xstate_size = get_xsaves_size_no_dynamic();
+		possible_xstate_size = get_xsaves_size_no_independent();
 	else
 		possible_xstate_size = xsave_size;
 
@@ -831,7 +831,7 @@ void fpu__resume_cpu(void)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
 		wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor()  |
-				     xfeatures_mask_dynamic());
+				     xfeatures_mask_independent());
 	}
 }
 
@@ -1163,34 +1163,34 @@ int copy_sigframe_from_user_to_xstate(st
 }
 
 /**
- * copy_dynamic_supervisor_to_kernel() - Save dynamic supervisor states to
- *                                       an xsave area
+ * copy_independent_supervisor_to_kernel() - Save independent supervisor states to
+ *                                           an xsave area
  * @xstate: A pointer to an xsave area
- * @mask: Represent the dynamic supervisor features saved into the xsave area
+ * @mask: Represent the independent supervisor features saved into the xsave area
  *
- * Only the dynamic supervisor states sets in the mask are saved into the xsave
- * area (See the comment in XFEATURE_MASK_DYNAMIC for the details of dynamic
- * supervisor feature). Besides the dynamic supervisor states, the legacy
+ * Only the independent supervisor states sets in the mask are saved into the xsave
+ * area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of independent
+ * supervisor feature). Besides the independent supervisor states, the legacy
  * region and XSAVE header are also saved into the xsave area. The supervisor
  * features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
  * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not saved.
  *
  * The xsave area must be 64-bytes aligned.
  */
-void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
+void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
 {
-	u64 dynamic_mask = xfeatures_mask_dynamic() & mask;
+	u64 independent_mask = xfeatures_mask_independent() & mask;
 	u32 lmask, hmask;
 	int err;
 
 	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
 		return;
 
-	if (WARN_ON_FPU(!dynamic_mask))
+	if (WARN_ON_FPU(!independent_mask))
 		return;
 
-	lmask = dynamic_mask;
-	hmask = dynamic_mask >> 32;
+	lmask = independent_mask;
+	hmask = independent_mask >> 32;
 
 	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
 
@@ -1199,34 +1199,34 @@ void copy_dynamic_supervisor_to_kernel(s
 }
 
 /**
- * copy_kernel_to_dynamic_supervisor() - Restore dynamic supervisor states from
- *                                       an xsave area
+ * copy_kernel_to_independent_supervisor() - Restore independent supervisor states from
+ *                                           an xsave area
  * @xstate: A pointer to an xsave area
- * @mask: Represent the dynamic supervisor features restored from the xsave area
+ * @mask: Represent the independent supervisor features restored from the xsave area
  *
- * Only the dynamic supervisor states sets in the mask are restored from the
- * xsave area (See the comment in XFEATURE_MASK_DYNAMIC for the details of
- * dynamic supervisor feature). Besides the dynamic supervisor states, the
+ * Only the independent supervisor states sets in the mask are restored from the
+ * xsave area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of
+ * independent supervisor feature). Besides the independent supervisor states, the
  * legacy region and XSAVE header are also restored from the xsave area. The
  * supervisor features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
  * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not restored.
  *
  * The xsave area must be 64-bytes aligned.
  */
-void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask)
+void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask)
 {
-	u64 dynamic_mask = xfeatures_mask_dynamic() & mask;
+	u64 independent_mask = xfeatures_mask_independent() & mask;
 	u32 lmask, hmask;
 	int err;
 
 	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
 		return;
 
-	if (WARN_ON_FPU(!dynamic_mask))
+	if (WARN_ON_FPU(!independent_mask))
 		return;
 
-	lmask = dynamic_mask;
-	hmask = dynamic_mask >> 32;
+	lmask = independent_mask;
+	hmask = independent_mask >> 32;
 
 	XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
 



* [patch V2 32/52] x86/fpu/xstate: Sanitize handling of independent features
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (30 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 31/52] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-16 20:04   ` Liang, Kan
  2021-06-14 15:44 ` [patch V2 33/52] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
                   ` (21 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The copy functions for the independent features are horribly named and the
supervisor and independent part is just overengineered.

The point is that the supplied mask has to be either a subset of the
independent features or a subset of the task->fpu.xstate managed features.

Rewrite it so it checks for invalid overlaps of these areas in the
caller supplied feature mask. Rename it so it follows the new naming
convention for these operations. Mop up the function documentation.

This allows that function to be used for other purposes as well.
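
The mask check described above can be sketched in plain user space C. This
is only an illustration of the subset rule, not the kernel code: the
feature bits, mask_task, mask_independent and xstate_mask_valid() are
hypothetical stand-ins for xfeatures_mask_all, xfeatures_mask_independent()
and the check inside xsaves_to_kernel()/xrstors_from_kernel().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, chosen only for this illustration. */
#define FEAT_FP		(1ULL << 0)
#define FEAT_SSE	(1ULL << 1)
#define FEAT_YMM	(1ULL << 2)
#define FEAT_LBR	(1ULL << 15)

/* Stand-ins for xfeatures_mask_all and xfeatures_mask_independent(). */
static const uint64_t mask_task        = FEAT_FP | FEAT_SSE | FEAT_YMM;
static const uint64_t mask_independent = FEAT_LBR;

/*
 * Returns true when @mask is non-empty and lies entirely inside one of
 * the two sets. A mask which touches the independent set must not contain
 * any bit outside of it; any other mask must stay within the task set.
 */
static bool xstate_mask_valid(uint64_t mask)
{
	uint64_t xchk;

	if (mask & mask_independent)
		xchk = ~mask_independent;
	else
		xchk = ~mask_task;

	return mask && !(mask & xchk);
}
```

With these assumed values, FEAT_LBR alone and FEAT_FP | FEAT_SSE pass the
check, while an empty mask or a mix such as FEAT_LBR | FEAT_FP is rejected.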

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/lbr.c       |    6 +-
 arch/x86/include/asm/fpu/xstate.h |    5 +-
 arch/x86/kernel/fpu/xstate.c      |   93 +++++++++++++++++++-------------------
 3 files changed, 53 insertions(+), 51 deletions(-)

--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -491,7 +491,7 @@ static void intel_pmu_arch_lbr_xrstors(v
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_kernel_to_independent_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	xrstors_from_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static __always_inline bool lbr_is_reset_in_cstate(void *ctx)
@@ -576,7 +576,7 @@ static void intel_pmu_arch_lbr_xsaves(vo
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_independent_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	xsaves_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static void __intel_pmu_lbr_save(void *ctx)
@@ -992,7 +992,7 @@ static void intel_pmu_arch_lbr_read_xsav
 		intel_pmu_store_lbr(cpuc, NULL);
 		return;
 	}
-	copy_independent_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
+	xsaves_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
 
 	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
 }
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -104,8 +104,9 @@ void *get_xsave_addr(struct xregs_state
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
-void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask);
+
+void xsaves_to_kernel(struct xregs_state *xsave, u64 mask);
+void xrstors_from_kernel(struct xregs_state *xsave, u64 mask);
 
 enum xstate_copy_mode {
 	XSTATE_COPY_FP,
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1163,75 +1163,76 @@ int copy_sigframe_from_user_to_xstate(st
 }
 
 /**
- * copy_independent_supervisor_to_kernel() - Save independent supervisor states to
- *                                           an xsave area
- * @xstate: A pointer to an xsave area
- * @mask: Represent the independent supervisor features saved into the xsave area
+ * xsaves_to_kernel - Save selected components to a kernel xstate buffer
+ * @xstate:	Pointer to the buffer
+ * @mask:	Feature mask to select the components to save
  *
- * Only the independent supervisor states sets in the mask are saved into the xsave
- * area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of independent
- * supervisor feature). Besides the independent supervisor states, the legacy
- * region and XSAVE header are also saved into the xsave area. The supervisor
- * features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
- * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not saved.
+ * The @xstate buffer must be 64 byte aligned and correctly initialized as
+ * XSAVES does not write the full xstate header. Before first use the
+ * buffer should be zeroed otherwise a consecutive XRSTORS from that buffer
+ * can #GP.
  *
- * The xsave area must be 64-bytes aligned.
+ * The feature mask must either be a subset of the independent features or
+ * a subset of the task->fpstate related features
  */
-void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
+void xsaves_to_kernel(struct xregs_state *xstate, u64 mask)
 {
-	u64 independent_mask = xfeatures_mask_independent() & mask;
-	u32 lmask, hmask;
+	u64 xchk;
 	int err;
 
-	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
+	if (WARN_ON_FPU(!cpu_feature_enabled(X86_FEATURE_XSAVES)))
 		return;
+	/*
+	 * Validate that this is either a task->fpstate related component
+	 * subset or an independent one.
+	 */
+	if (mask & xfeatures_mask_independent())
+		xchk = ~xfeatures_mask_independent();
+	else
+		xchk = ~xfeatures_mask_all;
 
-	if (WARN_ON_FPU(!independent_mask))
+	if (WARN_ON_ONCE(!mask || mask & xchk))
 		return;
 
-	lmask = independent_mask;
-	hmask = independent_mask >> 32;
-
-	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-
-	/* Should never fault when copying to a kernel buffer */
-	WARN_ON_FPU(err);
+	XSTATE_OP(XSAVES, xstate, (u32)mask, (u32)(mask >> 32), err);
+	WARN_ON_ONCE(err);
 }
 
 /**
- * copy_kernel_to_independent_supervisor() - Restore independent supervisor states from
- *                                           an xsave area
- * @xstate: A pointer to an xsave area
- * @mask: Represent the independent supervisor features restored from the xsave area
+ * xrstors_from_kernel - Restore selected components from a kernel xstate buffer
+ * @xstate:	Pointer to the buffer
+ * @mask:	Feature mask to select the components to restore
+ *
+ * The @xstate buffer must be 64 byte aligned and correctly initialized
+ * otherwise XRSTORS from that buffer can #GP.
  *
- * Only the independent supervisor states sets in the mask are restored from the
- * xsave area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of
- * independent supervisor feature). Besides the independent supervisor states, the
- * legacy region and XSAVE header are also restored from the xsave area. The
- * supervisor features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
- * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not restored.
+ * Proper usage is to restore the state which was saved with
+ * xsaves_to_kernel() into @xstate.
  *
- * The xsave area must be 64-bytes aligned.
+ * The feature mask must either be a subset of the independent features or
+ * a subset of the task->fpstate related features
  */
-void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask)
+void xrstors_from_kernel(struct xregs_state *xstate, u64 mask)
 {
-	u64 independent_mask = xfeatures_mask_independent() & mask;
-	u32 lmask, hmask;
+	u64 xchk;
 	int err;
 
-	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
+	if (WARN_ON_FPU(!cpu_feature_enabled(X86_FEATURE_XSAVES)))
 		return;
+	/*
+	 * Validate that this is either a task->fpstate related component
+	 * subset or an independent one.
+	 */
+	if (mask & xfeatures_mask_independent())
+		xchk = ~xfeatures_mask_independent();
+	else
+		xchk = ~xfeatures_mask_all;
 
-	if (WARN_ON_FPU(!independent_mask))
+	if (WARN_ON_ONCE(!mask || mask & xchk))
 		return;
 
-	lmask = independent_mask;
-	hmask = independent_mask >> 32;
-
-	XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
-
-	/* Should never fault when copying from a kernel buffer */
-	WARN_ON_FPU(err);
+	XSTATE_OP(XRSTORS, xstate, (u32)mask, (u32)(mask >> 32), err);
+	WARN_ON_ONCE(err);
 }
 
 #ifdef CONFIG_PROC_PID_ARCH_STATUS



* [patch V2 33/52] x86/pkeys: Move read_pkru() and write_pkru()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (31 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 32/52] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 34/52] x86/fpu: Differentiate "copy" versus "move" of fpregs Thomas Gleixner
                   ` (20 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

write_pkru() was originally used just to write to the PKRU register.  It
was mercifully short and sweet and was not out of place in pgtable.h with
some other pkey-related code.

But, later work included a requirement to also modify the task XSAVE
buffer when updating the register.  This really is more related to the
XSAVE architecture than to paging.

Move the read/write_pkru() to asm/pkru.h.  pgtable.h won't miss them.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/xstate.h |    1 
 arch/x86/include/asm/pgtable.h    |   57 -----------------------------------
 arch/x86/include/asm/pkru.h       |   61 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/process_64.c      |    1 
 arch/x86/kvm/svm/sev.c            |    1 
 arch/x86/kvm/x86.c                |    1 
 arch/x86/mm/pkeys.c               |    1 
 7 files changed, 67 insertions(+), 56 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -6,6 +6,7 @@
 #include <linux/types.h>
 
 #include <asm/processor.h>
+#include <asm/fpu/api.h>
 #include <asm/user.h>
 
 /* Bit 63 of XCR0 is reserved for future expansion */
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,7 +23,7 @@
 
 #ifndef __ASSEMBLY__
 #include <asm/x86_init.h>
-#include <asm/fpu/xstate.h>
+#include <asm/pkru.h>
 #include <asm/fpu/api.h>
 #include <asm-generic/pgtable_uffd.h>
 
@@ -126,35 +126,6 @@ static inline int pte_dirty(pte_t pte)
 	return pte_flags(pte) & _PAGE_DIRTY;
 }
 
-
-static inline u32 read_pkru(void)
-{
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		return rdpkru();
-	return 0;
-}
-
-static inline void write_pkru(u32 pkru)
-{
-	struct pkru_state *pk;
-
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
-		return;
-
-	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
-
-	/*
-	 * The PKRU value in xstate needs to be in sync with the value that is
-	 * written to the CPU. The FPU restore on return to userland would
-	 * otherwise load the previous value again.
-	 */
-	fpregs_lock();
-	if (pk)
-		pk->pkru = pkru;
-	__write_pkru(pkru);
-	fpregs_unlock();
-}
-
 static inline int pte_young(pte_t pte)
 {
 	return pte_flags(pte) & _PAGE_ACCESSED;
@@ -1360,32 +1331,6 @@ static inline pmd_t pmd_swp_clear_uffd_w
 }
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
-#define PKRU_AD_BIT 0x1
-#define PKRU_WD_BIT 0x2
-#define PKRU_BITS_PER_PKEY 2
-
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-extern u32 init_pkru_value;
-#else
-#define init_pkru_value	0
-#endif
-
-static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
-{
-	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
-}
-
-static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
-{
-	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-	/*
-	 * Access-disable disables writes too so we need to check
-	 * both bits here.
-	 */
-	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
-}
-
 static inline u16 pte_flags_pkey(unsigned long pte_flags)
 {
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
--- /dev/null
+++ b/arch/x86/include/asm/pkru.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_PKRU_H
+#define _ASM_X86_PKRU_H
+
+#include <asm/fpu/xstate.h>
+
+#define PKRU_AD_BIT 0x1
+#define PKRU_WD_BIT 0x2
+#define PKRU_BITS_PER_PKEY 2
+
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+extern u32 init_pkru_value;
+#else
+#define init_pkru_value	0
+#endif
+
+static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
+}
+
+static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+	/*
+	 * Access-disable disables writes too so we need to check
+	 * both bits here.
+	 */
+	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
+}
+
+static inline u32 read_pkru(void)
+{
+	if (boot_cpu_has(X86_FEATURE_OSPKE))
+		return rdpkru();
+	return 0;
+}
+
+static inline void write_pkru(u32 pkru)
+{
+	struct pkru_state *pk;
+
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return;
+
+	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
+
+	/*
+	 * The PKRU value in xstate needs to be in sync with the value that is
+	 * written to the CPU. The FPU restore on return to userland would
+	 * otherwise load the previous value again.
+	 */
+	fpregs_lock();
+	if (pk)
+		pk->pkru = pkru;
+	__write_pkru(pkru);
+	fpregs_unlock();
+}
+
+#endif
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -41,6 +41,7 @@
 #include <linux/syscalls.h>
 
 #include <asm/processor.h>
+#include <asm/pkru.h>
 #include <asm/fpu/internal.h>
 #include <asm/mmu_context.h>
 #include <asm/prctl.h>
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -19,6 +19,7 @@
 #include <linux/trace_events.h>
 #include <asm/fpu/internal.h>
 
+#include <asm/pkru.h>
 #include <asm/trapnr.h>
 
 #include "x86.h"
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -65,6 +65,7 @@
 #include <asm/msr.h>
 #include <asm/desc.h>
 #include <asm/mce.h>
+#include <asm/pkru.h>
 #include <linux/kernel_stat.h>
 #include <asm/fpu/internal.h> /* Ugh! */
 #include <asm/pvclock.h>
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,6 +10,7 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
+#include <asm/pkru.h>			/* read/write_pkru()		*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {



* [patch V2 34/52] x86/fpu: Differentiate "copy" versus "move" of fpregs
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (32 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 33/52] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-18 12:21   ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 35/52] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
                   ` (19 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

There are three ways in the ISA to bulk save FPU state:
 1. XSAVE, every CPU newer than 2008 does this
 2. FXSAVE, from ~2000->2007
 3. FNSAVE, pre-2000

XSAVE and FXSAVE are nice.  They just copy FPU state to memory.  FNSAVE is
nasty; it destroys the FPU state when it writes it to memory.  It is more
of a "move".

Currently, copy_fpregs_to_fpstate() returns a number to its caller to say
whether it used the nice, non-destructive XSAVE/FXSAVE or used the mean,
clobbering FNSAVE.  Some sites need special handling for the FNSAVE case to
restore any FNSAVE-clobbered state.  Others don't care, like when they are
about to load new state anyway.

The nasty part about the copy_fpregs_to_fpstate() interface is that it's
hard to tell if callers expect the "move" or the "copy" behavior.

Create a new, explicit "move" interface for callers that can handle
clobbering register state.  Make "copy" only do copies and never clobber
register state.
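
The difference between the two interfaces can be modelled with a toy
register file in user space. Everything below (struct toy_regs, hw_regs,
toy_destructive_save and the toy_*() helpers) is hypothetical and only
mirrors the shape of the idea: one low-level save which reports clobbering,
a "move" wrapper which ignores the report and a "copy" wrapper which
reloads the registers when they were lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model: a small "register file", purely for illustration. */
struct toy_regs {
	int st[4];
};

static struct toy_regs hw_regs;			/* simulated live registers */
static bool toy_destructive_save = true;	/* true models FNSAVE */

/* Low-level save: returns true if the live registers were clobbered. */
static bool toy_clobber_save(struct toy_regs *buf)
{
	*buf = hw_regs;
	if (toy_destructive_save) {
		/* Model the clobbering by resetting the registers to zero */
		memset(&hw_regs, 0, sizeof(hw_regs));
		return true;
	}
	return false;
}

/* "Move": the caller accepts that the live registers may be lost. */
static void toy_save(struct toy_regs *buf)
{
	toy_clobber_save(buf);
}

/* "Copy": the live registers are guaranteed to survive the call. */
static void toy_copy(struct toy_regs *buf)
{
	if (toy_clobber_save(buf))
		hw_regs = *buf;	/* Reload what the save destroyed */
}
```

In this model toy_copy() leaves hw_regs untouched from the caller's point
of view even with the destructive save enabled, while toy_save() leaves
the registers zeroed.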

== switch_fpu_prepare() optimization ==

switch_fpu_prepare() had a nice optimization for the FNSAVE case.  It can
handle either clobbering or preserving register state.  For the
XSAVE/FXSAVE case, it records that the fpregs state is still loaded, just
in case a later "restore" operation can be elided.  For the FNSAVE case, it
marks the fpregs as not loaded on the CPU, since they were clobbered.

Instead of having switch_fpu_prepare() modify its behavior based on whether
registers were clobbered or not, simply switch its behavior based on
whether FNSAVE is in use.  This makes it much more clear what is going on
and what the common path is.

It would be simpler to just remove this FNSAVE optimization: Always save
and restore in the FNSAVE case.  This may incur the cost of the restore
even in cases where the restored state is never used.  But, it would only
hurt painfully ancient (>20 years old) processors.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   14 ++++--
 arch/x86/kernel/fpu/core.c          |   83 ++++++++++++++++++++----------------
 2 files changed, 58 insertions(+), 39 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -378,7 +378,8 @@ static inline int xrstor_from_kernel_err
 	return err;
 }
 
-extern int save_fpregs_to_fpstate(struct fpu *fpu);
+extern void save_fpregs_to_fpstate(struct fpu *fpu);
+extern void copy_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
 {
@@ -510,10 +511,15 @@ static inline void __fpregs_load_activat
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!save_fpregs_to_fpstate(old_fpu))
-			old_fpu->last_cpu = -1;
-		else
+		/*
+		 * Avoid the expense of restoring fpregs with FNSAVE when it
+		 * might be unnecessary. XSAVE and FXSAVE preserve the FPU state.
+		 */
+		save_fpregs_to_fpstate(old_fpu);
+		if (likely(use_xsave() || use_fxsr()))
 			old_fpu->last_cpu = cpu;
+		else
+			old_fpu->last_cpu = -1;
 
 		/* But leave fpu_fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -83,16 +83,17 @@ bool irq_fpu_usable(void)
 EXPORT_SYMBOL(irq_fpu_usable);
 
 /*
- * These must be called with preempt disabled. Returns
- * 'true' if the FPU state is still intact and we can
- * keep registers active.
- *
- * The legacy FNSAVE instruction cleared all FPU state
- * unconditionally, so registers are essentially destroyed.
- * Modern FPU state can be kept in registers, if there are
- * no pending FP exceptions.
+ * Must be called with fpregs locked.
+ *
+ * Returns 'true' if the FPU state has been clobbered and the register
+ * contents are lost.
+ *
+ * The legacy FNSAVE instruction clobbers all FPU state unconditionally, so
+ * registers are essentially destroyed.
+ *
+ * XSAVE and FXSAVE preserve register contents.
  */
-int save_fpregs_to_fpstate(struct fpu *fpu)
+static bool __clobber_save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		xsave_to_kernel(&fpu->state.xsave);
@@ -103,23 +104,45 @@ int save_fpregs_to_fpstate(struct fpu *f
 		 */
 		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
 			fpu->avx512_timestamp = jiffies;
-		return 1;
+		return false;
 	}
 
 	if (likely(use_fxsr())) {
 		fxsave_to_kernel(&fpu->state.fxsave);
-		return 1;
+		return false;
 	}
 
-	/*
-	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
-	 * so we have to mark them inactive:
-	 */
+	/* Legacy FPU register saving, FNSAVE always clears FPU registers. */
 	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
+	return true;
+}
+
+/**
+ * save_fpregs_to_fpstate - Save fpregs in fpstate
+ * @fpu:	Pointer to FPU context
+ *
+ * Hardware register state might be clobbered when the
+ * function returns.
+ */
+void save_fpregs_to_fpstate(struct fpu *fpu)
+{
+	__clobber_save_fpregs_to_fpstate(fpu);
+}
+EXPORT_SYMBOL_GPL(save_fpregs_to_fpstate);
+
+/**
+ * copy_fpregs_to_fpstate - Copy fpregs to fpstate
+ * @fpu:	Pointer to FPU context
+ *
+ * Guarantees that the hardware register state is preserved.
+ */
+void copy_fpregs_to_fpstate(struct fpu *fpu)
+{
+	bool clobbered = __clobber_save_fpregs_to_fpstate(fpu);
 
-	return 0;
+	if (clobbered)
+		restore_fpregs_from_fpstate(&fpu->state);
 }
-EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
@@ -133,10 +156,6 @@ void kernel_fpu_begin_mask(unsigned int
 	if (!(current->flags & PF_KTHREAD) &&
 	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		set_thread_flag(TIF_NEED_FPU_LOAD);
-		/*
-		 * Ignore return value -- we don't care if reg state
-		 * is clobbered.
-		 */
 		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
@@ -160,7 +179,8 @@ void kernel_fpu_end(void)
 EXPORT_SYMBOL_GPL(kernel_fpu_end);
 
 /*
- * Save the FPU state (mark it for reload if necessary):
+ * Save the FPU register state. If the registers are active then they are
+ * preserved.
  *
  * This only ever gets called for the current task.
  */
@@ -171,11 +191,8 @@ void fpu__save(struct fpu *fpu)
 	fpregs_lock();
 	trace_x86_fpu_before_save(fpu);
 
-	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!save_fpregs_to_fpstate(fpu)) {
-			restore_fpregs_from_fpstate(&fpu->state);
-		}
-	}
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		copy_fpregs_to_fpstate(fpu);
 
 	trace_x86_fpu_after_save(fpu);
 	fpregs_unlock();
@@ -245,18 +262,14 @@ int fpu__copy(struct task_struct *dst, s
 
 	/*
 	 * If the FPU registers are not current just memcpy() the state.
-	 * Otherwise save current FPU registers directly into the child's FPU
-	 * context, without any memory-to-memory copying.
-	 *
-	 * ( The function 'fails' in the FNSAVE case, which destroys
-	 *   register contents so we have to load them back. )
+	 * Otherwise copy current FPU registers directly into the child's
+	 * FPU context.
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
-
-	else if (!save_fpregs_to_fpstate(dst_fpu))
-		restore_fpregs_from_fpstate(&dst_fpu->state);
+	else
+		copy_fpregs_to_fpstate(dst_fpu);
 
 	fpregs_unlock();
 



* [patch V2 35/52] x86/cpu: Sanitize X86_FEATURE_OSPKE
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (33 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 34/52] x86/fpu: Differentiate "copy" versus "move" of fpregs Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 36/52] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
                   ` (18 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

X86_FEATURE_OSPKE is enabled first on the boot CPU and the feature flag is
set. Secondary CPUs have to enable CR4.PKE as well and set their per CPU
feature flag. That's ineffective because all call sites have checks for
boot_cpu_data.

Make it smarter and force the feature flag when PKU is enabled on the boot
CPU, which then allows cpu_feature_enabled(X86_FEATURE_OSPKE) to be used all
over the place. That either compiles the code out when PKEY support is
disabled in Kconfig or uses a static_cpu_has() for the feature check, which
makes a significant difference in hot paths, e.g. context switch.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pkeys.h |    8 ++++----
 arch/x86/include/asm/pkru.h  |    4 ++--
 arch/x86/kernel/cpu/common.c |   24 +++++++++++-------------
 arch/x86/kernel/fpu/core.c   |    2 +-
 arch/x86/kernel/fpu/xstate.c |    2 +-
 arch/x86/kernel/process_64.c |    2 +-
 arch/x86/mm/fault.c          |    2 +-
 7 files changed, 21 insertions(+), 23 deletions(-)

--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -9,14 +9,14 @@
  * will be necessary to ensure that the types that store key
  * numbers and masks have sufficient capacity.
  */
-#define arch_max_pkey() (boot_cpu_has(X86_FEATURE_OSPKE) ? 16 : 1)
+#define arch_max_pkey() (cpu_feature_enabled(X86_FEATURE_OSPKE) ? 16 : 1)
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
 
 static inline bool arch_pkeys_enabled(void)
 {
-	return boot_cpu_has(X86_FEATURE_OSPKE);
+	return cpu_feature_enabled(X86_FEATURE_OSPKE);
 }
 
 /*
@@ -26,7 +26,7 @@ static inline bool arch_pkeys_enabled(vo
 extern int __execute_only_pkey(struct mm_struct *mm);
 static inline int execute_only_pkey(struct mm_struct *mm)
 {
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return ARCH_DEFAULT_PKEY;
 
 	return __execute_only_pkey(mm);
@@ -37,7 +37,7 @@ extern int __arch_override_mprotect_pkey
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 		int prot, int pkey)
 {
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return 0;
 
 	return __arch_override_mprotect_pkey(vma, prot, pkey);
--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -32,7 +32,7 @@ static inline bool __pkru_allows_write(u
 
 static inline u32 read_pkru(void)
 {
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return rdpkru();
 	return 0;
 }
@@ -41,7 +41,7 @@ static inline void write_pkru(u32 pkru)
 {
 	struct pkru_state *pk;
 
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
 
 	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,22 +466,20 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
-	/* check the boot processor, plus compile options for PKU: */
-	if (!cpu_feature_enabled(X86_FEATURE_PKU))
-		return;
-	/* checks the actual processor's cpuid bits: */
-	if (!cpu_has(c, X86_FEATURE_PKU))
-		return;
-	if (pku_disabled)
+	if (c == &boot_cpu_data) {
+		if (pku_disabled || !cpu_feature_enabled(X86_FEATURE_PKU))
+			return;
+		/*
+		 * Setting CR4.PKE will cause the X86_FEATURE_OSPKE cpuid
+		 * bit to be set.  Enforce it.
+		 */
+		setup_force_cpu_cap(X86_FEATURE_OSPKE);
+
+	} else if (!cpu_feature_enabled(X86_FEATURE_OSPKE)) {
 		return;
+	}
 
 	cr4_set_bits(X86_CR4_PKE);
-	/*
-	 * Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
-	 * cpuid bit to be set.  We need to ensure that we
-	 * update that bit in this CPU's "cpu_info".
-	 */
-	set_cpu_cap(c, X86_FEATURE_OSPKE);
 }
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -377,7 +377,7 @@ static inline void load_fpregs_from_init
 	else
 		frstor_from_kernel(&init_fpstate.fsave);
 
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -915,7 +915,7 @@ int arch_set_user_pkey_access(struct tas
 	 * This check implies XSAVE support.  OSPKE only gets
 	 * set if we enable XSAVE and we enable PKU in XCR0.
 	 */
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return -EINVAL;
 
 	/*
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -137,7 +137,7 @@ void __show_regs(struct pt_regs *regs, e
 		       log_lvl, d3, d6, d7);
 	}
 
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		printk("%sPKRU: %08x\n", log_lvl, read_pkru());
 }
 
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -875,7 +875,7 @@ static inline bool bad_area_access_from_
 	/* This code is always called on the current mm */
 	bool foreign = false;
 
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return false;
 	if (error_code & X86_PF_PK)
 		return true;



* [patch V2 36/52] x86/pkru: Provide pkru_get_init_value()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (34 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 35/52] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 37/52] x86/pkru: Provide pkru_write_default() Thomas Gleixner
                   ` (17 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

When CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS is disabled then the following
code fails to compile:

     if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
     	u32 pkru = READ_ONCE(init_pkru_value);
	..
     }

because init_pkru_value is defined as '0' which makes READ_ONCE() upset.

Provide an accessor macro to avoid #ifdeffery all over the place.
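
A user space sketch of the pattern, under loud assumptions: TOY_READ_ONCE()
is a simplified stand-in and not the kernel's READ_ONCE() definition,
TOY_HAVE_PKEYS models the Kconfig option, and the toy_* names and the
initial value are made up for this example. It shows why a macro which
takes the address of its argument cannot be applied to a literal 0, and
how an accessor hides the difference from the callers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-in for READ_ONCE(): it takes the address of its
 * argument, so the argument must be an lvalue. __typeof__ is a GNU C
 * extension supported by gcc and clang.
 */
#define TOY_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))

/* Define TOY_HAVE_PKEYS to model the enabled Kconfig option. */
#ifdef TOY_HAVE_PKEYS
static uint32_t toy_init_pkru_value = 0x55555554;	/* illustrative value */
#define toy_pkru_get_init_value()	TOY_READ_ONCE(toy_init_pkru_value)
#else
/* TOY_READ_ONCE(0) would not compile here: '&(0)' is not valid C. */
#define toy_pkru_get_init_value()	0
#endif

/* Callers use the accessor and need no #ifdef of their own. */
static uint32_t toy_default_pkru(void)
{
	return toy_pkru_get_init_value();
}
```

Built without TOY_HAVE_PKEYS, toy_default_pkru() simply evaluates to 0 and
the variable is never referenced.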

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pkru.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -10,8 +10,10 @@
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
 extern u32 init_pkru_value;
+#define pkru_get_init_value()	READ_ONCE(init_pkru_value)
 #else
 #define init_pkru_value	0
+#define pkru_get_init_value()	0
 #endif
 
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)



* [patch V2 37/52] x86/pkru: Provide pkru_write_default()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (35 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 36/52] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 38/52] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
                   ` (16 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Provide a simple and trivial helper which just writes the PKRU default
value without trying to fiddle with the task's xsave buffer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pkru.h |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -60,4 +60,12 @@ static inline void write_pkru(u32 pkru)
 	fpregs_unlock();
 }
 
+static inline void pkru_write_default(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	wrpkru(pkru_get_init_value());
+}
+
 #endif



* [patch V2 38/52] x86/cpu: Write the default PKRU value when enabling PKE
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (36 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 37/52] x86/pkru: Provide pkru_write_default() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 39/52] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
                   ` (15 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

In preparation for making the PKRU management more independent from XSTATES,
write the default PKRU value into the hardware right after enabling PKRU in
CR4. This ensures that switch_to() and copy_thread() have the correct
setting for the init task and the per-CPU idle threads right away.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/common.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -480,6 +480,8 @@ static __always_inline void setup_pku(st
 	}
 
 	cr4_set_bits(X86_CR4_PKE);
+	/* Load the default PKRU value */
+	pkru_write_default();
 }
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS



* [patch V2 39/52] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (37 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 38/52] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 40/52] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
                   ` (14 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

There is no point in using copy_init_pkru_to_fpregs() which in turn calls
write_pkru(). write_pkru() tries to fiddle with the task's xstate buffer
for nothing because the XRSTOR[S](init_fpstate) just cleared the xfeature
flag in the xstate header which makes get_xsave_addr() fail.

It's a useless exercise anyway because the reinitialization activates the
FPU, so before the task's xstate buffer can be used again an XRSTOR[S] must
happen, which in turn dumps the PKRU value.

Get rid of the now unused copy_init_pkru_to_fpregs().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pkeys.h |    1 -
 arch/x86/kernel/fpu/core.c   |    3 +--
 arch/x86/mm/pkeys.c          |   17 -----------------
 include/linux/pkeys.h        |    4 ----
 4 files changed, 1 insertion(+), 24 deletions(-)

--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -124,7 +124,6 @@ extern int arch_set_user_pkey_access(str
 		unsigned long init_val);
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
-extern void copy_init_pkru_to_fpregs(void);
 
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -377,8 +377,7 @@ static inline void load_fpregs_from_init
 	else
 		frstor_from_kernel(&init_fpstate.fsave);
 
-	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
-		copy_init_pkru_to_fpregs();
+	pkru_write_default();
 }
 
 /*
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,7 +10,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/pkru.h>			/* read/write_pkru()		*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -125,22 +124,6 @@ u32 init_pkru_value = PKRU_AD_KEY( 1) |
 		      PKRU_AD_KEY(10) | PKRU_AD_KEY(11) | PKRU_AD_KEY(12) |
 		      PKRU_AD_KEY(13) | PKRU_AD_KEY(14) | PKRU_AD_KEY(15);
 
-/*
- * Called from the FPU code when creating a fresh set of FPU
- * registers.  This is called from a very specific context where
- * we know the FPU registers are safe for use and we can use PKRU
- * directly.
- */
-void copy_init_pkru_to_fpregs(void)
-{
-	u32 init_pkru_value_snapshot = READ_ONCE(init_pkru_value);
-	/*
-	 * Override the PKRU state that came from 'init_fpstate'
-	 * with the baseline from the process.
-	 */
-	write_pkru(init_pkru_value_snapshot);
-}
-
 static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf,
 			     size_t count, loff_t *ppos)
 {
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -44,10 +44,6 @@ static inline bool arch_pkeys_enabled(vo
 	return false;
 }
 
-static inline void copy_init_pkru_to_fpregs(void)
-{
-}
-
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 #endif /* _LINUX_PKEYS_H */



* [patch V2 40/52] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (38 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 39/52] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 41/52] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
                   ` (13 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Make it clear what the function is about.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    3 ++-
 arch/x86/kernel/fpu/core.c          |    4 ++--
 arch/x86/kernel/process.c           |    2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -33,9 +33,10 @@ extern int  fpu__restore_sig(void __user
 extern void fpu__drop(struct fpu *fpu);
 extern int  fpu__copy(struct task_struct *dst, struct task_struct *src);
 extern void fpu__clear_user_states(struct fpu *fpu);
-extern void fpu__clear_all(struct fpu *fpu);
 extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);
 
+extern void fpu_flush_thread(void);
+
 /*
  * Boot time FPU initialization functions:
  */
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -417,9 +417,9 @@ void fpu__clear_user_states(struct fpu *
 	fpu__clear(fpu, true);
 }
 
-void fpu__clear_all(struct fpu *fpu)
+void fpu_flush_thread(void)
 {
-	fpu__clear(fpu, false);
+	fpu__clear(&current->thread.fpu, false);
 }
 
 /*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -206,7 +206,7 @@ void flush_thread(void)
 	flush_ptrace_hw_breakpoint(tsk);
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
-	fpu__clear_all(&tsk->thread.fpu);
+	fpu_flush_thread();
 }
 
 void disable_TSC(void)



* [patch V2 41/52] x86/fpu: Clean up the fpu__clear() variants
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (39 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 40/52] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 42/52] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
                   ` (12 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Andy Lutomirski <luto@kernel.org>

fpu__clear() currently resets both register state and kernel XSAVE buffer
state.  It has two modes: one for all state (supervisor and user) and
another for user state only.  fpu__clear_all() uses the "all state"
(user_only=0) mode, while a number of signal paths use the user_only=1
mode.

Make fpu__clear() work only for user state (user_only=1) and remove the
"all state" (user_only=0) code.  Rename it to match so it can be used by
the signal paths.

Replace the "all state" (user_only=0) fpu__clear() functionality.  Use the
TIF_NEED_FPU_LOAD functionality instead of making any actual hardware
register changes in this path.

Instead of invoking fpu__initialize() just memcpy() init_fpstate into the
task's FPU state because that already has the correct format and in case of
PKRU also contains the default PKRU value. Move the actual PKRU write out
into flush_thread() where it belongs and where it will end up anyway when
PKRU and XSTATE have been disentangled.

For bisectability a workaround is required which stores the PKRU value in
the xstate memory until PKRU is disentangled from XSTATE for context
switching and return to user.
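
Why the workaround has to set the feature bit in the header can be seen
from a small model of XRSTOR semantics. This is a userspace sketch under
stated assumptions, not kernel code: struct xsave_sketch and both helpers
are hypothetical and model only the PKRU component. XRSTOR puts a requested
component into its init state (0 for PKRU) when the component's bit is
clear in the header, and loads it from the buffer only when the bit is set.

```c
#include <assert.h>
#include <stdint.h>

#define PKRU_BIT_SKETCH	(1ULL << 9)	/* XFEATURE_PKRU is feature number 9 */

/* Hypothetical, heavily reduced model of an XSAVE buffer */
struct xsave_sketch {
	uint64_t xfeatures;	/* header: components present in the buffer */
	uint32_t pkru;		/* PKRU component payload */
};

/* Returns the PKRU register value after a modelled XRSTOR with mask rfbm */
static uint32_t xrstor_pkru_sketch(const struct xsave_sketch *buf,
				   uint64_t rfbm, uint32_t hw_val)
{
	if (!(rfbm & PKRU_BIT_SKETCH))
		return hw_val;		/* not requested: register untouched */
	if (!(buf->xfeatures & PKRU_BIT_SKETCH))
		return 0;		/* header bit clear: init state */
	return buf->pkru;		/* header bit set: load from buffer */
}

/* Models the workaround: force the header bit, then store the default */
static void pkru_set_default_in_xstate_sketch(struct xsave_sketch *buf,
					      uint32_t def)
{
	buf->xfeatures |= PKRU_BIT_SKETCH;
	buf->pkru = def;
}

/* Small self check: without the header bit the default would be lost */
static void xrstor_demo(void)
{
	struct xsave_sketch buf = { .xfeatures = 0, .pkru = 0x4 };

	assert(xrstor_pkru_sketch(&buf, PKRU_BIT_SKETCH, 0x4) == 0);
}
```

Storing the value alone is not enough in this model. As long as the header
bit is clear, the restore yields 0 instead of the default.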

[ Dave Hansen: Polished changelog ]
[ tglx: Fixed the PKRU fallout ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/core.c |  113 ++++++++++++++++++++++++++++++---------------
 arch/x86/kernel/process.c  |   10 +++
 2 files changed, 87 insertions(+), 36 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -283,19 +283,6 @@ int fpu__copy(struct task_struct *dst, s
 }
 
 /*
- * Activate the current task's in-memory FPU context,
- * if it has not been used before:
- */
-static void fpu__initialize(struct fpu *fpu)
-{
-	WARN_ON_FPU(fpu != &current->thread.fpu);
-
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-	fpstate_init(&fpu->state);
-	trace_x86_fpu_init_state(fpu);
-}
-
-/*
  * This function must be called before we read a task's fpstate.
  *
  * There's two cases where this gets called:
@@ -381,46 +368,100 @@ static inline void load_fpregs_from_init
 	pkru_write_default();
 }
 
+static inline unsigned int init_fpstate_copy_size(void)
+{
+	if (!use_xsave())
+		return fpu_kernel_xstate_size;
+
+	/* XSAVE(S) just needs the legacy and the xstate header part */
+	return sizeof(init_fpstate.xsave);
+}
+
+/* Temporary workaround. Will be removed once PKRU and XSTATE are distangled. */
+static inline void pkru_set_default_in_xstate(struct xregs_state *xsave)
+{
+	struct pkru_state *pk;
+
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+	/*
+	 * Force XFEATURE_PKRU to be set in the header otherwise
+	 * get_xsave_addr() does not work and it also needs to be set to
+	 * make XRSTOR(S) load it.
+	 */
+	xsave->header.xfeatures |= XFEATURE_MASK_PKRU;
+	pk = get_xsave_addr(xsave, XFEATURE_PKRU);
+	pk->pkru = pkru_get_init_value();
+}
+
 /*
- * Clear the FPU state back to init state.
- *
- * Called by sys_execve(), by the signal handler code and by various
- * error paths.
+ * Reset current->fpu memory state to the init values.
  */
-static void fpu__clear(struct fpu *fpu, bool user_only)
+static void fpu_reset_fpstate(void)
+{
+	struct fpu *fpu= &current->thread.fpu;
+
+	fpregs_lock();
+	fpu__drop(fpu);
+	/*
+	 * This does not change the actual hardware registers. It just
+	 * resets the memory image and sets TIF_NEED_FPU_LOAD so a
+	 * subsequent return to usermode will reload the registers from the
+	 * tasks memory image.
+	 *
+	 * Do not use fpstate_init() here. Just copy init_fpstate which has
+	 * the correct content already except for PKRU.
+	 */
+	memcpy(&fpu->state, &init_fpstate, init_fpstate_copy_size());
+	pkru_set_default_in_xstate(&fpu->state.xsave);
+	set_thread_flag(TIF_NEED_FPU_LOAD);
+	fpregs_unlock();
+}
+
+/*
+ * Reset current's user FPU states to the init states.  current's
+ * supervisor states, if any, are not modified by this function.  The
+ * caller guarantees that the XSTATE header in memory is intact.
+ */
+void fpu__clear_user_states(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
+	fpregs_lock();
 	if (!static_cpu_has(X86_FEATURE_FPU)) {
-		fpu__drop(fpu);
-		fpu__initialize(fpu);
+		fpu_reset_fpstate();
+		fpregs_unlock();
 		return;
 	}
 
-	fpregs_lock();
-
-	if (user_only) {
-		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
-		    xfeatures_mask_supervisor())
-			xrstor_from_kernel(&fpu->state.xsave,
-					     xfeatures_mask_supervisor());
-		load_fpregs_from_init_fpstate(xfeatures_mask_user());
-	} else {
-		load_fpregs_from_init_fpstate(xfeatures_mask_all);
+	/*
+	 * Ensure that current's supervisor states are loaded into their
+	 * corresponding registers.
+	 */
+	if (xfeatures_mask_supervisor() &&
+	    !fpregs_state_valid(fpu, smp_processor_id())) {
+		xrstor_from_kernel(&fpu->state.xsave,
+				   xfeatures_mask_supervisor());
 	}
 
+	/* Reset user states in registers. */
+	load_fpregs_from_init_fpstate(xfeatures_mask_user());
+
+	/*
+	 * Now all FPU registers have their desired values.  Inform the FPU
+	 * state machine that current's FPU registers are in the hardware
+	 * registers. The memory image does not need to be updated because
+	 * any operation relying on it has to save the registers first when
+	 * currents FPU is marked active.
+	 */
 	fpregs_mark_activate();
-	fpregs_unlock();
-}
 
-void fpu__clear_user_states(struct fpu *fpu)
-{
-	fpu__clear(fpu, true);
+	fpregs_unlock();
 }
 
 void fpu_flush_thread(void)
 {
-	fpu__clear(&current->thread.fpu, false);
+	fpu_reset_fpstate();
 }
 
 /*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -199,6 +199,15 @@ int copy_thread(unsigned long clone_flag
 	return ret;
 }
 
+static void pkru_flush_thread(void)
+{
+	/*
+	 * If PKRU is enabled the default PKRU value has to be loaded into
+	 * the hardware right here (similar to context switch).
+	 */
+	pkru_write_default();
+}
+
 void flush_thread(void)
 {
 	struct task_struct *tsk = current;
@@ -207,6 +216,7 @@ void flush_thread(void)
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
 	fpu_flush_thread();
+	pkru_flush_thread();
 }
 
 void disable_TSC(void)



* [patch V2 42/52] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (40 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 41/52] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 43/52] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() Thomas Gleixner
                   ` (11 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Rename it so that it becomes entirely clear what this function is
about. Its purpose is to restore the FPU registers to the state which was
saved in the task's FPU memory state either at context switch or by an
in-kernel FPU user.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    6 ++----
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |    2 +-
 3 files changed, 4 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -485,10 +485,8 @@ static inline void fpregs_activate(struc
 	trace_x86_fpu_regs_activated(fpu);
 }
 
-/*
- * Internal helper, do not use directly. Use switch_fpu_return() instead.
- */
-static inline void __fpregs_load_activate(void)
+/* Internal helper for switch_fpu_return() and signal frame setup */
+static inline void fpregs_restore_userregs(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 	int cpu = smp_processor_id();
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -472,7 +472,7 @@ void switch_fpu_return(void)
 	if (!static_cpu_has(X86_FEATURE_FPU))
 		return;
 
-	__fpregs_load_activate();
+	fpregs_restore_userregs();
 }
 EXPORT_SYMBOL_GPL(switch_fpu_return);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -188,7 +188,7 @@ int copy_fpstate_to_sigframe(void __user
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
-		__fpregs_load_activate();
+		fpregs_restore_userregs();
 
 	pagefault_disable();
 	ret = copy_fpregs_to_sigframe(buf_fx);



* [patch V2 43/52] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (41 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 42/52] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 44/52] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
                   ` (10 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

copy_kernel_to_fpregs() restores all xfeatures but it is also the place
where the AMD FXSAVE_LEAK bug is handled.

That prevents fpregs_restore_userregs() from limiting the restored features,
which is required to disentangle PKRU and XSTATE handling and also for the
upcoming supervisor state management.

Move the FXSAVE_LEAK quirk into __copy_kernel_to_fpregs() and deinline that
function which has become rather fat.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   25 +------------------------
 arch/x86/kernel/fpu/core.c          |   26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 24 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -399,33 +399,10 @@ static inline int xrstor_from_kernel_err
 extern void save_fpregs_to_fpstate(struct fpu *fpu);
 extern void copy_fpregs_to_fpstate(struct fpu *fpu);
 
-static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
-{
-	if (use_xsave()) {
-		xrstor_from_kernel(&fpstate->xsave, mask);
-	} else {
-		if (use_fxsr())
-			fxrstor_from_kernel(&fpstate->fxsave);
-		else
-			frstor_from_kernel(&fpstate->fsave);
-	}
-}
+extern void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask);
 
 static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
-	/*
-	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
-	 * pending. Clear the x87 state here by setting it to fixed values.
-	 * "m" is a random variable that should be in L1.
-	 */
-	if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) {
-		asm volatile(
-			"fnclex\n\t"
-			"emms\n\t"
-			"fildl %P[addr]"	/* set F?P to defined value */
-			: : [addr] "m" (fpstate));
-	}
-
 	__restore_fpregs_from_fpstate(fpstate, -1);
 }
 
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -145,6 +145,32 @@ void copy_fpregs_to_fpstate(struct fpu *
 		restore_fpregs_from_fpstate(&fpu->state);
 }
 
+void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
+{
+	/*
+	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
+	 * pending. Clear the x87 state here by setting it to fixed values.
+	 * "m" is a random variable that should be in L1.
+	 */
+	if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) {
+		asm volatile(
+			"fnclex\n\t"
+			"emms\n\t"
+			"fildl %P[addr]"	/* set F?P to defined value */
+			: : [addr] "m" (fpstate));
+	}
+
+	if (use_xsave()) {
+		xrstor_from_kernel(&fpstate->xsave, mask);
+	} else {
+		if (use_fxsr())
+			fxrstor_from_kernel(&fpstate->fxsave);
+		else
+			frstor_from_kernel(&fpstate->fsave);
+	}
+}
+EXPORT_SYMBOL_GPL(__restore_fpregs_from_fpstate);
+
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
 	preempt_disable();



* [patch V2 44/52] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (42 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 43/52] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 45/52] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
                   ` (9 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Rename it so it's clear that this is about user ABI features which can
differ from the feature set which the kernel saves and restores because the
kernel handles e.g. PKRU differently. But the user ABI (ptrace, signal
frame) expects it to be there.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    7 ++++++-
 arch/x86/include/asm/fpu/xstate.h   |    6 +++++-
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |   10 +++++-----
 arch/x86/kernel/fpu/xstate.c        |   14 +++++++-------
 5 files changed, 24 insertions(+), 15 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -319,7 +319,12 @@ static inline void xrstor_from_kernel(st
  */
 static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
-	u64 mask = xfeatures_mask_user();
+	/*
+	 * Include the features which are not xsaved/rstored by the kernel
+	 * internally, e.g. PKRU. That's user space ABI and also required
+	 * to allow the signal handler to modify PKRU.
+	 */
+	u64 mask = xfeatures_mask_uabi();
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -83,7 +83,11 @@ static inline u64 xfeatures_mask_supervi
 	return xfeatures_mask_all & XFEATURE_MASK_SUPERVISOR_SUPPORTED;
 }
 
-static inline u64 xfeatures_mask_user(void)
+/*
+ * The xfeatures which are enabled in XCR0 and expected to be in ptrace
+ * buffers and signal frames.
+ */
+static inline u64 xfeatures_mask_uabi(void)
 {
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -470,7 +470,7 @@ void fpu__clear_user_states(struct fpu *
 	}
 
 	/* Reset user states in registers. */
-	load_fpregs_from_init_fpstate(xfeatures_mask_user());
+	load_fpregs_from_init_fpstate(xfeatures_mask_uabi());
 
 	/*
 	 * Now all FPU registers have their desired values.  Inform the FPU
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -267,14 +267,14 @@ static int copy_user_to_fpregs_zeroing(v
 
 	if (use_xsave()) {
 		if (fx_only) {
-			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
 
 			r = fxrstor_from_user_sigframe(buf);
 			if (!r)
 				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;
 		} else {
-			init_bv = xfeatures_mask_user() & ~xbv;
+			init_bv = xfeatures_mask_uabi() & ~xbv;
 
 			r = xrstor_from_user_sigframe(buf, xbv);
 			if (!r && unlikely(init_bv))
@@ -429,7 +429,7 @@ static int __fpu__restore_sig(void __use
 	fpregs_unlock();
 
 	if (use_xsave() && !fx_only) {
-		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
+		u64 init_bv = xfeatures_mask_uabi() & ~user_xfeatures;
 
 		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
@@ -463,7 +463,7 @@ static int __fpu__restore_sig(void __use
 		if (use_xsave()) {
 			u64 init_bv;
 
-			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
 			xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 		}
 
@@ -558,7 +558,7 @@ void fpu__init_prepare_fx_sw_frame(void)
 
 	fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
 	fx_sw_reserved.extended_size = size;
-	fx_sw_reserved.xfeatures = xfeatures_mask_user();
+	fx_sw_reserved.xfeatures = xfeatures_mask_uabi();
 	fx_sw_reserved.xstate_size = fpu_user_xstate_size;
 
 	if (IS_ENABLED(CONFIG_IA32_EMULATION) ||
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -155,7 +155,7 @@ void fpu__init_cpu_xstate(void)
 	 * managed by XSAVE{C, OPT, S} and XRSTOR{S}.  Only XSAVE user
 	 * states can be set here.
 	 */
-	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_uabi());
 
 	/*
 	 * MSR_IA32_XSS sets supervisor states managed by XSAVES.
@@ -462,7 +462,7 @@ int xfeature_size(int xfeature_nr)
 static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or supervisor features may be set */
-	if (hdr->xfeatures & ~xfeatures_mask_user())
+	if (hdr->xfeatures & ~xfeatures_mask_uabi())
 		return -EINVAL;
 
 	/* Userspace must use the uncompacted format */
@@ -764,7 +764,7 @@ void __init fpu__init_system_xstate(void
 	cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
 	xfeatures_mask_all |= ecx + ((u64)edx << 32);
 
-	if ((xfeatures_mask_user() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
+	if ((xfeatures_mask_uabi() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
 		/*
 		 * This indicates that something really unexpected happened
 		 * with the enumeration.  Disable XSAVE and try to continue
@@ -795,7 +795,7 @@ void __init fpu__init_system_xstate(void
 	 * Update info used for ptrace frames; use standard-format size and no
 	 * supervisor xstates:
 	 */
-	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_user());
+	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_uabi());
 
 	fpu__init_prepare_fx_sw_frame();
 	setup_init_fpu_buf();
@@ -823,7 +823,7 @@ void fpu__resume_cpu(void)
 	 * Restore XCR0 on xsave capable CPUs:
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVE))
-		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_uabi());
 
 	/*
 	 * Restore IA32_XSS. The same CPUID bit enumerates support
@@ -1004,7 +1004,7 @@ void copy_uabi_xstate_to_membuf(struct m
 		break;
 
 	case XSTATE_COPY_XSAVE:
-		header.xfeatures &= xfeatures_mask_user();
+		header.xfeatures &= xfeatures_mask_uabi();
 		break;
 	}
 
@@ -1049,7 +1049,7 @@ void copy_uabi_xstate_to_membuf(struct m
 		 * compacted init_fpstate. The gap tracking will zero this
 		 * later.
 		 */
-		if (!(xfeatures_mask_user() & BIT_ULL(i)))
+		if (!(xfeatures_mask_uabi() & BIT_ULL(i)))
 			continue;
 
 		/*



* [patch V2 45/52] x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (43 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 44/52] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-16  0:52   ` Yu, Yu-cheng
  2021-06-14 15:44 ` [patch V2 46/52] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
                   ` (8 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

switch_to() and flush_thread() write the task's PKRU value eagerly so the
PKRU value of current is always valid in the hardware.

That means there is no point in restoring PKRU on exit to user or when
reactivating the task's FPU registers in the signal frame setup path.

This allows removing all the xstate buffer updates with PKRU values once
the PKRU state is stored in the thread struct while a task is scheduled out.
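
The relation between the user ABI mask and the restore mask is plain bit
arithmetic. The following userspace sketch assumes a reduced feature set
(FP, SSE, YMM, PKRU) for brevity. The *_SKETCH names and the helpers taking
the enabled feature set as an argument are hypothetical; the kernel helpers
read xfeatures_mask_all instead.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_ULL(n)		(1ULL << (n))

#define XFEATURE_MASK_FP	BIT_ULL(0)
#define XFEATURE_MASK_SSE	BIT_ULL(1)
#define XFEATURE_MASK_YMM	BIT_ULL(2)
#define XFEATURE_MASK_PKRU	BIT_ULL(9)

/* Reduced stand-in for XFEATURE_MASK_USER_SUPPORTED */
#define USER_SUPPORTED_SKETCH	(XFEATURE_MASK_FP | XFEATURE_MASK_SSE | \
				 XFEATURE_MASK_YMM | XFEATURE_MASK_PKRU)

/* Same construction as in the patch: everything except PKRU */
#define USER_RESTORE_SKETCH	(USER_SUPPORTED_SKETCH & ~XFEATURE_MASK_PKRU)

/* Features visible in ptrace buffers and signal frames */
static uint64_t mask_uabi_sketch(uint64_t enabled)
{
	return enabled & USER_SUPPORTED_SKETCH;
}

/* Features restored via XRSTOR on return to user space */
static uint64_t mask_restore_user_sketch(uint64_t enabled)
{
	return enabled & USER_RESTORE_SKETCH;
}

/* Small self check: the two masks differ by exactly the PKRU bit */
static void mask_demo(void)
{
	uint64_t enabled = USER_SUPPORTED_SKETCH;

	assert((mask_uabi_sketch(enabled) ^
		mask_restore_user_sketch(enabled)) == XFEATURE_MASK_PKRU);
}
```

On a CPU without PKRU both masks come out identical, so the change is a
no-op there.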

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   12 +++++++++++-
 arch/x86/include/asm/fpu/xstate.h   |   19 +++++++++++++++++++
 arch/x86/kernel/fpu/core.c          |    2 +-
 3 files changed, 31 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -455,7 +455,17 @@ static inline void fpregs_restore_userre
 		return;
 
 	if (!fpregs_state_valid(fpu, cpu)) {
-		restore_fpregs_from_fpstate(&fpu->state);
+		/*
+		 * This restores _all_ xstate which has not been
+		 * established yet.
+		 *
+		 * If PKRU is enabled, then the PKRU value is already
+		 * correct because it was either set in switch_to() or in
+		 * flush_thread(). So it is excluded because it might be
+		 * not up to date in current->thread.fpu.xsave state.
+		 */
+		__restore_fpregs_from_fpstate(&fpu->state,
+					      xfeatures_mask_restore_user());
 		fpregs_activate(fpu);
 		fpu->last_cpu = cpu;
 	}
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -35,6 +35,14 @@
 				      XFEATURE_MASK_BNDREGS | \
 				      XFEATURE_MASK_BNDCSR)
 
+/*
+ * Features which are restored when returning to user space.
+ * PKRU is not restored on return to user space because PKRU
+ * is switched eagerly in switch_to() and flush_thread()
+ */
+#define XFEATURE_MASK_USER_RESTORE	\
+	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
 
@@ -92,6 +100,17 @@ static inline u64 xfeatures_mask_uabi(vo
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
 
+/*
+ * The xfeatures which are restored by the kernel when returning to user
+ * mode. This is not necessarily the same as xfeatures_mask_uabi() as the
+ * kernel does not manage all XCR0 enabled features via xsave/xrstor as
+ * some of them have to be switched eagerly on context switch and exec().
+ */
+static inline u64 xfeatures_mask_restore_user(void)
+{
+	return xfeatures_mask_all & XFEATURE_MASK_USER_RESTORE;
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -470,7 +470,7 @@ void fpu__clear_user_states(struct fpu *
 	}
 
 	/* Reset user states in registers. */
-	load_fpregs_from_init_fpstate(xfeatures_mask_uabi());
+	load_fpregs_from_init_fpstate(xfeatures_mask_restore_user());
 
 	/*
 	 * Now all FPU registers have their desired values.  Inform the FPU



* [patch V2 46/52] x86/fpu: Add PKRU storage outside of task XSAVE buffer
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (44 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 45/52] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 47/52] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
                   ` (7 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

PKRU is currently partly XSAVE-managed and partly not.  It has space in the
task XSAVE buffer and is context-switched by XSAVE/XRSTOR.  However, it is
switched more eagerly than FPU because there may be a need for PKRU to be
up-to-date for things like copy_to/from_user() since PKRU affects
user-permission memory accesses, not just accesses from userspace itself.

This leaves PKRU in a very odd position.  XSAVE brings very little value to
the table for how Linux uses PKRU except for signal related XSTATE
handling.

Prepare to move PKRU away from being XSAVE-managed.  Allocate space in the
thread_struct for it and save/restore it in the context-switch path
separately from the XSAVE-managed features. task->thread_struct.pkru is
only valid when the task is scheduled out. For the current task the
authoritative source is the hardware, i.e. it has to be retrieved via
rdpkru().
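
For illustration only, the save/restore scheme described above can be modelled in user space. This is a rough sketch under stated assumptions: hw_pkru, model_rdpkru(), model_wrpkru(), struct thread_model and model_pkru_switch() are hypothetical stand-ins and not kernel interfaces, and the RDPKRU/WRPKRU instructions are replaced by a plain variable so that the logic runs on any machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the PKRU hardware register. */
static uint32_t hw_pkru;
/* Counts simulated WRPKRU executions to show when a write is skipped. */
static unsigned int wrpkru_count;

static uint32_t model_rdpkru(void)
{
	return hw_pkru;
}

static void model_wrpkru(uint32_t val)
{
	hw_pkru = val;
	wrpkru_count++;
}

/* Only the field that matters here; not the kernel's thread_struct. */
struct thread_model {
	uint32_t pkru;	/* valid only while the task is scheduled out */
};

/*
 * Stash the outgoing task's value from "hardware", then load the
 * incoming task's value unless the register already holds it.
 */
static void model_pkru_switch(struct thread_model *prev,
			      struct thread_model *next)
{
	/* A context switch always involves two distinct tasks. */
	assert(prev != next);

	prev->pkru = model_rdpkru();

	if (prev->pkru != next->pkru)
		model_wrpkru(next->pkru);
}
```

In this model the stored value of the running task is never consulted; it is overwritten from the simulated register at switch-out, which matches the rule that the hardware is authoritative for 'current'.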

Leave the XSAVE code in place for now to ensure bisectability.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/processor.h |    9 +++++++++
 arch/x86/kernel/process.c        |    7 +++++++
 arch/x86/kernel/process_64.c     |   25 +++++++++++++++++++++++++
 3 files changed, 41 insertions(+)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -518,6 +518,15 @@ struct thread_struct {
 
 	unsigned int		sig_on_uaccess_err:1;
 
+	/*
+	 * Protection Keys Register for Userspace.  Loaded immediately on
+	 * context switch. Store it in thread_struct to avoid a lookup in
+	 * the task's FPU xstate buffer. This value is only valid when a
+	 * task is scheduled out. For 'current' the authoritative source of
+	 * PKRU is the hardware itself.
+	 */
+	u32			pkru;
+
 	/* Floating point and extended processor state */
 	struct fpu		fpu;
 	/*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -157,11 +157,18 @@ int copy_thread(unsigned long clone_flag
 
 	/* Kernel thread ? */
 	if (unlikely(p->flags & PF_KTHREAD)) {
+		p->thread.pkru = pkru_get_init_value();
 		memset(childregs, 0, sizeof(struct pt_regs));
 		kthread_frame_init(frame, sp, arg);
 		return 0;
 	}
 
+	/*
+	 * Clone current's PKRU value from hardware. tsk->thread.pkru
+	 * is only valid when scheduled out.
+	 */
+	p->thread.pkru = rdpkru();
+
 	frame->bx = 0;
 	*childregs = *current_pt_regs();
 	childregs->ax = 0;
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -340,6 +340,29 @@ static __always_inline void load_seg_leg
 	}
 }
 
+/*
+ * Store prev's PKRU value and load next's PKRU value if they differ. PKRU
+ * is not XSTATE managed on context switch because that would require a
+ * lookup in the task's FPU xsave buffer and require to keep that updated
+ * in various places.
+ */
+static __always_inline void x86_pkru_load(struct thread_struct *prev,
+					  struct thread_struct *next)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	/* Stash the prev task's value: */
+	prev->pkru = rdpkru();
+
+	/*
+	 * PKRU writes are slightly expensive.  Avoid them when not
+	 * strictly necessary:
+	 */
+	if (prev->pkru != next->pkru)
+		wrpkru(next->pkru);
+}
+
 static __always_inline void x86_fsgsbase_load(struct thread_struct *prev,
 					      struct thread_struct *next)
 {
@@ -589,6 +612,8 @@ void compat_start_thread(struct pt_regs
 
 	x86_fsgsbase_load(prev, next);
 
+	x86_pkru_load(prev, next);
+
 	/*
 	 * Switch the PDA and FPU contexts.
 	 */


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 47/52] x86/fpu: Hook up PKRU into ptrace()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (45 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 46/52] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 19:29   ` [patch V2-A " Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 48/52] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
                   ` (6 subsequent siblings)
  53 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

One nice thing about having PKRU be XSAVE-managed is that it gets naturally
exposed into the XSAVE-using ABIs.  Now that XSAVE will not be used to
manage PKRU, these ABIs need to be manually enabled to deal with PKRU.

ptrace() uses copy_uabi_xstate_to_membuf() to collect the tracee's
XSTATE. As PKRU is not in the task's XSTATE buffer, use task->thread.pkru
to fill in the ptrace buffer.
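
A minimal user-space sketch of that special case, assuming a simplified layout: struct pkru_state_model, struct task_model and model_copy_pkru() are hypothetical names, and the two-field structure only loosely mirrors the kernel's struct pkru_state. The point is that the PKRU slot of the XSAVE buffer is ignored and the value is taken from the per-thread storage.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Loosely mirrors the kernel's struct pkru_state: value plus padding. */
struct pkru_state_model {
	uint32_t pkru;
	uint32_t pad;
};

/* Hypothetical reduced task: just the two places a PKRU value can live. */
struct task_model {
	struct pkru_state_model xsave_pkru;	/* XSAVE buffer slot, may be stale */
	uint32_t thread_pkru;			/* valid while scheduled out */
};

/*
 * Fill the PKRU component of a user-visible buffer with at most @left
 * bytes. The XSAVE slot is deliberately not read; the padding is zeroed.
 * Returns the number of bytes written.
 */
static size_t model_copy_pkru(void *dst, size_t left,
			      const struct task_model *tsk)
{
	struct pkru_state_model pkru = { 0 };
	size_t n = sizeof(pkru) < left ? sizeof(pkru) : left;

	assert(dst && tsk);

	pkru.pkru = tsk->thread_pkru;
	memcpy(dst, &pkru, n);
	return n;
}
```

A tracee is stopped, hence scheduled out, when ptrace() reads its state, so the per-thread value is the valid one in this model.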

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/xstate.h |    2 +-
 arch/x86/kernel/fpu/regset.c      |    6 ++----
 arch/x86/kernel/fpu/xstate.c      |   25 ++++++++++++++++++-------
 3 files changed, 21 insertions(+), 12 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -139,7 +139,7 @@ enum xstate_copy_mode {
 };
 
 struct membuf;
-void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave,
+void copy_uabi_xstate_to_membuf(struct membuf to, struct task_struct *tsk,
 				enum xstate_copy_mode mode);
 
 #endif
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -93,14 +93,12 @@ int xfpregs_set(struct task_struct *targ
 int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
-	struct fpu *fpu = &target->thread.fpu;
-
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	fpu__prepare_read(fpu);
+	fpu__prepare_read(&target->thread.fpu);
 
-	copy_uabi_xstate_to_membuf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
+	copy_uabi_xstate_to_membuf(to, target, XSTATE_COPY_XSAVE);
 	return 0;
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -973,7 +973,7 @@ static void copy_feature(bool from_xstat
 /**
  * copy_uabi_xstate_to_membuf - Copy kernel saved xstate to a UABI buffer
  * @to:		membuf descriptor
- * @xsave:	The kernel xstate buffer to copy from
+ * @tsk:	The task from which to copy the saved xstate
  * @copy_mode:	The requested copy mode
  *
  * Converts from kernel XSAVE or XSAVES compacted format to UABI conforming
@@ -982,10 +982,11 @@ static void copy_feature(bool from_xstat
  *
  * It supports partial copy but @to.pos always starts from zero.
  */
-void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave,
+void copy_uabi_xstate_to_membuf(struct membuf to, struct task_struct *tsk,
 				enum xstate_copy_mode copy_mode)
 {
 	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
+	struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
 	struct xregs_state *xinit = &init_fpstate.xsave;
 	struct xstate_header header;
 	unsigned int zerofrom;
@@ -1059,11 +1060,21 @@ void copy_uabi_xstate_to_membuf(struct m
 		if (zerofrom < xstate_offsets[i])
 			membuf_zero(&to, xstate_offsets[i] - zerofrom);
 
-		copy_feature(header.xfeatures & BIT_ULL(i), &to,
-			     __raw_xsave_addr(xsave, i),
-			     __raw_xsave_addr(xinit, i),
-			     xstate_sizes[i]);
-
+		if (i == XFEATURE_PKRU) {
+			struct pkru_state pkru = {0};
+			/*
+			 * PKRU is not necessarily up to date in the
+			 * thread's XSAVE buffer.  Fill this part from the
+			 * per-thread storage.
+			 */
+			pkru.pkru = target->thread.pkru;
+			membuf_write(&to, &pkru, sizeof(pkru));
+		} else {
+			copy_feature(header.xfeatures & BIT_ULL(i), &to,
+				     __raw_xsave_addr(xsave, i),
+				     __raw_xsave_addr(xinit, i),
+				     xstate_sizes[i]);
+		}
 		/*
 		 * Keep track of the last copied state in the non-compacted
 		 * target buffer for gap zeroing.


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 48/52] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (46 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 47/52] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 49/52] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
                   ` (5 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

As the PKRU state is managed separately, restoring it from the xstate buffer
would be counterproductive as it might either restore a stale value or
reinit the PKRU state to 0.
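
The effect on the request mask can be sketched with plain bit arithmetic. This is an illustrative reduction, not the kernel's definitions: the MODEL_* macros and model_mask_fpstate() are hypothetical, and only a handful of features are listed. The bit numbers used (FP 0, SSE 1, YMM 2, PKRU 9, PASID 10) are the architectural XSAVE component numbers.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_ULL(n)		(1ULL << (n))

/* Architectural XSAVE component bits used in this sketch. */
#define MASK_FP			BIT_ULL(0)
#define MASK_SSE		BIT_ULL(1)
#define MASK_YMM		BIT_ULL(2)
#define MASK_PKRU		BIT_ULL(9)
#define MASK_PASID		BIT_ULL(10)	/* supervisor state */

/* Reduced, illustrative versions of the mask definitions. */
#define MODEL_USER_SUPPORTED	(MASK_FP | MASK_SSE | MASK_YMM | MASK_PKRU)
#define MODEL_USER_RESTORE	(MODEL_USER_SUPPORTED & ~MASK_PKRU)
#define MODEL_SUPERVISOR	(MASK_PASID)

/*
 * Features restored from a task's fpstate buffer, given the set of
 * features enabled on the machine: user features minus PKRU, plus the
 * supported supervisor features.
 */
static uint64_t model_mask_fpstate(uint64_t enabled)
{
	uint64_t mask = enabled & (MODEL_USER_RESTORE | MODEL_SUPERVISOR);

	/* PKRU must never end up in the restore request. */
	assert(!(mask & MASK_PKRU));
	return mask;
}
```

Compared with a request mask of -1, which selects every enabled component, the computed mask leaves PKRU untouched by the restore, so neither a stale buffer value nor the init value of 0 can overwrite the register.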

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/include/asm/fpu/xstate.h   |   10 ++++++++++
 arch/x86/kernel/fpu/xstate.c        |    1 +
 arch/x86/mm/extable.c               |    2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -281,7 +281,7 @@ static inline void xsave_to_kernel_booti
  */
 static inline void xrstor_from_kernel_booting(struct xregs_state *xstate)
 {
-	u64 mask = -1;
+	u64 mask = xfeatures_mask_fpstate();
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
@@ -408,7 +408,7 @@ extern void __restore_fpregs_from_fpstat
 
 static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
-	__restore_fpregs_from_fpstate(fpstate, -1);
+	__restore_fpregs_from_fpstate(fpstate, xfeatures_mask_fpstate());
 }
 
 extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -111,6 +111,16 @@ static inline u64 xfeatures_mask_restore
 	return xfeatures_mask_all & XFEATURE_MASK_USER_RESTORE;
 }
 
+/*
+ * Like xfeatures_mask_restore_user() but additionally restores the
+ * supported supervisor states.
+ */
+static inline u64 xfeatures_mask_fpstate(void)
+{
+	return xfeatures_mask_all & \
+		(XFEATURE_MASK_USER_RESTORE | XFEATURE_MASK_SUPERVISOR_SUPPORTED);
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -60,6 +60,7 @@ static short xsave_cpuid_features[] __in
  * XSAVE buffer, both supervisor and user xstates.
  */
 u64 xfeatures_mask_all __ro_after_init;
+EXPORT_SYMBOL_GPL(xfeatures_mask_all);
 
 static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -65,7 +65,7 @@ EXPORT_SYMBOL_GPL(ex_handler_fault);
 	WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.",
 		  (void *)instruction_pointer(regs));
 
-	__restore_fpregs_from_fpstate(&init_fpstate, -1);
+	__restore_fpregs_from_fpstate(&init_fpstate, xfeatures_mask_fpstate());
 	return true;
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 49/52] x86/fpu: Remove PKRU handling from switch_fpu_finish()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (47 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 48/52] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 50/52] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
                   ` (4 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

PKRU is already updated and the xstate is no longer the proper source of
information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   34 ++++------------------------------
 1 file changed, 4 insertions(+), 30 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -517,39 +517,13 @@ static inline void switch_fpu_prepare(st
  */
 
 /*
- * Load PKRU from the FPU context if available. Delay loading of the
- * complete FPU state until the return to userland.
+ * Delay loading of the complete FPU state until the return to userland.
+ * PKRU is handled separately.
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu)
 {
-	u32 pkru_val = init_pkru_value;
-	struct pkru_state *pk;
-
-	if (!static_cpu_has(X86_FEATURE_FPU))
-		return;
-
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-
-	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
-		return;
-
-	/*
-	 * PKRU state is switched eagerly because it needs to be valid before we
-	 * return to userland e.g. for a copy_to_user() operation.
-	 */
-	if (!(current->flags & PF_KTHREAD)) {
-		/*
-		 * If the PKRU bit in xsave.header.xfeatures is not set,
-		 * then the PKRU component was in init state, which means
-		 * XRSTOR will set PKRU to 0. If the bit is not set then
-		 * get_xsave_addr() will return NULL because the PKRU value
-		 * in memory is not valid. This means pkru_val has to be
-		 * set to 0 and not to init_pkru_value.
-		 */
-		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
-		pkru_val = pk ? pk->pkru : 0;
-	}
-	__write_pkru(pkru_val);
+	if (static_cpu_has(X86_FEATURE_FPU))
+		set_thread_flag(TIF_NEED_FPU_LOAD);
 }
 
 #endif /* _ASM_X86_FPU_INTERNAL_H */


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 50/52] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (48 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 49/52] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:44 ` [patch V2 51/52] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
                   ` (3 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

PKRU for a task is stored in task->thread.pkru when the task is scheduled
out. For 'current' the authoritative source of PKRU is the hardware.

fpu_reset_fpstate() has two callers:

  1) fpu__clear_user_states() for !FPU systems. For those PKRU is irrelevant

  2) fpu_flush_thread() which is invoked from flush_thread(). flush_thread()
     resets the hardware to the kernel restrictive default value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/core.c |   22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -403,23 +403,6 @@ static inline unsigned int init_fpstate_
 	return sizeof(init_fpstate.xsave);
 }
 
-/* Temporary workaround. Will be removed once PKRU and XSTATE are distangled. */
-static inline void pkru_set_default_in_xstate(struct xregs_state *xsave)
-{
-	struct pkru_state *pk;
-
-	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
-		return;
-	/*
-	 * Force XFEATURE_PKRU to be set in the header otherwise
-	 * get_xsave_addr() does not work and it also needs to be set to
-	 * make XRSTOR(S) load it.
-	 */
-	xsave->header.xfeatures |= XFEATURE_MASK_PKRU;
-	pk = get_xsave_addr(xsave, XFEATURE_PKRU);
-	pk->pkru = pkru_get_init_value();
-}
-
 /*
  * Reset current->fpu memory state to the init values.
  */
@@ -437,9 +420,12 @@ static void fpu_reset_fpstate(void)
 	 *
 	 * Do not use fpstate_init() here. Just copy init_fpstate which has
 	 * the correct content already except for PKRU.
+	 *
+	 * PKRU handling does not rely on the xstate when restoring for
+	 * user space as PKRU is eagerly written in switch_to() and
+	 * flush_thread().
 	 */
 	memcpy(&fpu->state, &init_fpstate, init_fpstate_copy_size());
-	pkru_set_default_in_xstate(&fpu->state.xsave);
 	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpregs_unlock();
 }


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 51/52] x86/pkru: Remove xstate fiddling from write_pkru()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (49 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 50/52] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
@ 2021-06-14 15:44 ` Thomas Gleixner
  2021-06-14 15:45 ` [patch V2 52/52] x86/fpu: Mark init_fpstate __ro_after_init Thomas Gleixner
                   ` (2 subsequent siblings)
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:44 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The PKRU value of a task is stored in task->thread.pkru when the task is
scheduled out. PKRU is restored on schedule in from there. So keeping the
XSAVE buffer up to date is a pointless exercise.

Remove the xstate fiddling and cleanup all related functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pkru.h          |   17 ++++-------------
 arch/x86/include/asm/special_insns.h |   14 +-------------
 arch/x86/kvm/x86.c                   |    4 ++--
 3 files changed, 7 insertions(+), 28 deletions(-)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -41,23 +41,14 @@ static inline u32 read_pkru(void)
 
 static inline void write_pkru(u32 pkru)
 {
-	struct pkru_state *pk;
-
 	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
-
-	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
-
 	/*
-	 * The PKRU value in xstate needs to be in sync with the value that is
-	 * written to the CPU. The FPU restore on return to userland would
-	 * otherwise load the previous value again.
+	 * WRPKRU is relatively expensive compared to RDPKRU.
+	 * Avoid WRPKRU when it would not change the value.
 	 */
-	fpregs_lock();
-	if (pk)
-		pk->pkru = pkru;
-	__write_pkru(pkru);
-	fpregs_unlock();
+	if (pkru != rdpkru())
+		wrpkru(pkru);
 }
 
 static inline void pkru_write_default(void)
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -104,25 +104,13 @@ static inline void wrpkru(u32 pkru)
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
 
-static inline void __write_pkru(u32 pkru)
-{
-	/*
-	 * WRPKRU is relatively expensive compared to RDPKRU.
-	 * Avoid WRPKRU when it would not change the value.
-	 */
-	if (pkru == rdpkru())
-		return;
-
-	wrpkru(pkru);
-}
-
 #else
 static inline u32 rdpkru(void)
 {
 	return 0;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void wrpkru(u32 pkru)
 {
 }
 #endif
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -943,7 +943,7 @@ void kvm_load_guest_xsave_state(struct k
 	    (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
 	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
 	    vcpu->arch.pkru != vcpu->arch.host_pkru)
-		__write_pkru(vcpu->arch.pkru);
+		write_pkru(vcpu->arch.pkru);
 }
 EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
 
@@ -957,7 +957,7 @@ void kvm_load_host_xsave_state(struct kv
 	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
 		vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vcpu->arch.host_pkru)
-			__write_pkru(vcpu->arch.host_pkru);
+			write_pkru(vcpu->arch.host_pkru);
 	}
 
 	if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {


^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2 52/52] x86/fpu: Mark init_fpstate __ro_after_init
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (50 preceding siblings ...)
  2021-06-14 15:44 ` [patch V2 51/52] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
@ 2021-06-14 15:45 ` Thomas Gleixner
  2021-06-14 20:15 ` [patch] x86/fpu: x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate() Thomas Gleixner
  2021-06-16  0:50 ` [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Yu, Yu-cheng
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 15:45 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Nothing has to write into that state after init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/kernel/fpu/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -23,7 +23,7 @@
  * Represents the initial FPU state. It's mostly (but not completely) zeroes,
  * depending on the FPU hardware format:
  */
-union fpregs_state init_fpstate __read_mostly;
+union fpregs_state init_fpstate __ro_after_init;
 
 /*
  * Track whether the kernel is using the FPU state


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 01/52] x86/fpu: Make init_fpstate correct with optimized XSAVE
  2021-06-14 15:44 ` [patch V2 01/52] x86/fpu: Make init_fpstate correct with optimized XSAVE Thomas Gleixner
@ 2021-06-14 19:15   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-14 19:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:09PM +0200, Thomas Gleixner wrote:
> @@ -466,10 +489,20 @@ static void __init setup_init_fpu_buf(vo
>  	copy_kernel_to_xregs_booting(&init_fpstate.xsave);
>  
>  	/*
> -	 * Dump the init state again. This is to identify the init state
> -	 * of any feature which is not represented by all zero's.
> +	 * All components are now in init state. Read the state back so
> +	 * that init_fpstate contains all non-zero init state. This is only
> +	 * working with XSAVE,

"This only works with XSAVE, ... "


> but not with XSAVEOPT and XSAVES because
> +	 * those use the init optimization which skips writing data for
> +	 * components in init state.

<--- Add a newline in the comment here so that it is not as dense.

> So XSAVE could be used, but that would
> +	 * require to reshuffle the data when XSAVES is available because
> +	 * XSAVES uses xstate compaction. But doing so is a pointless
> +	 * exercise because most components have an all zeros init state
> +	 * except for the legacy ones (FP and SSE). Those can be saved with
> +	 * FXSAVE into the legacy area. Adding new features requires to
> +	 * ensure that init state is all zeroes or if not to add the
> +	 * necessary handling here.
>  	 */
> -	copy_xregs_to_kernel_booting(&init_fpstate.xsave);
> +	fxsave_to_kernel(&init_fpstate.fxsave);

With those fixed:

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch V2-A 47/52] x86/fpu: Hook up PKRU into ptrace()
  2021-06-14 15:44 ` [patch V2 47/52] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
@ 2021-06-14 19:29   ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 19:29 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

One nice thing about having PKRU be XSAVE-managed is that it gets naturally
exposed into the XSAVE-using ABIs.  Now that XSAVE will not be used to
manage PKRU, these ABIs need to be manually enabled to deal with PKRU.

ptrace() uses copy_uabi_xstate_to_membuf() to collect the tracee's
XSTATE. As PKRU is not in the task's XSTATE buffer, use task->thread.pkru
to fill in the ptrace buffer.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2-A: Managed to post the unfixed variant of this in V2.  Seems the
      script magic to catch unrefreshed patches before posting still has
      a weak spot. Other than that I blame the heat.
---
 arch/x86/include/asm/fpu/xstate.h |    2 +-
 arch/x86/kernel/fpu/regset.c      |   10 ++++------
 arch/x86/kernel/fpu/xstate.c      |   25 ++++++++++++++++++-------
 3 files changed, 23 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -139,7 +139,7 @@ enum xstate_copy_mode {
 };
 
 struct membuf;
-void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave,
+void copy_uabi_xstate_to_membuf(struct membuf to, struct task_struct *tsk,
 				enum xstate_copy_mode mode);
 
 #endif
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -43,7 +43,7 @@ int xfpregs_get(struct task_struct *targ
 				    sizeof(fpu->state.fxsave));
 	}
 
-	copy_uabi_xstate_to_membuf(to, &fpu->state.xsave, XSTATE_COPY_FX);
+	copy_uabi_xstate_to_membuf(to, target, XSTATE_COPY_FX);
 	return 0;
 }
 
@@ -92,14 +92,12 @@ int xfpregs_set(struct task_struct *targ
 int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 		struct membuf to)
 {
-	struct fpu *fpu = &target->thread.fpu;
-
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	fpu__prepare_read(fpu);
+	fpu__prepare_read(&target->thread.fpu);
 
-	copy_uabi_xstate_to_membuf(to, &fpu->state.xsave, XSTATE_COPY_XSAVE);
+	copy_uabi_xstate_to_membuf(to, target, XSTATE_COPY_XSAVE);
 	return 0;
 }
 
@@ -302,7 +300,7 @@ int fpregs_get(struct task_struct *targe
 		struct membuf mb = { .p = &fxsave, .left = sizeof(fxsave) };
 
 		/* Handle init state optimized xstate correctly */
-		copy_uabi_xstate_to_membuf(mb, &fpu->state.xsave, XSTATE_COPY_FP);
+		copy_uabi_xstate_to_membuf(mb, target, XSTATE_COPY_FP);
 		fx = &fxsave;
 	} else {
 		fx = &fpu->state.fxsave;
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -973,7 +973,7 @@ static void copy_feature(bool from_xstat
 /**
  * copy_uabi_xstate_to_membuf - Copy kernel saved xstate to a UABI buffer
  * @to:		membuf descriptor
- * @xsave:	The kernel xstate buffer to copy from
+ * @tsk:	The task from which to copy the saved xstate
  * @copy_mode:	The requested copy mode
  *
  * Converts from kernel XSAVE or XSAVES compacted format to UABI conforming
@@ -982,10 +982,11 @@ static void copy_feature(bool from_xstat
  *
  * It supports partial copy but @to.pos always starts from zero.
  */
-void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave,
+void copy_uabi_xstate_to_membuf(struct membuf to, struct task_struct *tsk,
 				enum xstate_copy_mode copy_mode)
 {
 	const unsigned int off_mxcsr = offsetof(struct fxregs_state, mxcsr);
+	struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
 	struct xregs_state *xinit = &init_fpstate.xsave;
 	struct xstate_header header;
 	unsigned int zerofrom;
@@ -1059,11 +1060,21 @@ void copy_uabi_xstate_to_membuf(struct m
 		if (zerofrom < xstate_offsets[i])
 			membuf_zero(&to, xstate_offsets[i] - zerofrom);
 
-		copy_feature(header.xfeatures & BIT_ULL(i), &to,
-			     __raw_xsave_addr(xsave, i),
-			     __raw_xsave_addr(xinit, i),
-			     xstate_sizes[i]);
-
+		if (i == XFEATURE_PKRU) {
+			struct pkru_state pkru = {0};
+			/*
+			 * PKRU is not necessarily up to date in the
+			 * thread's XSAVE buffer.  Fill this part from the
+			 * per-thread storage.
+			 */
+			pkru.pkru = tsk->thread.pkru;
+			membuf_write(&to, &pkru, sizeof(pkru));
+		} else {
+			copy_feature(header.xfeatures & BIT_ULL(i), &to,
+				     __raw_xsave_addr(xsave, i),
+				     __raw_xsave_addr(xinit, i),
+				     xstate_sizes[i]);
+		}
 		/*
 		 * Keep track of the last copied state in the non-compacted
 		 * target buffer for gap zeroing.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* [patch] x86/fpu: x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (51 preceding siblings ...)
  2021-06-14 15:45 ` [patch V2 52/52] x86/fpu: Mark init_fpstate __ro_after_init Thomas Gleixner
@ 2021-06-14 20:15 ` Thomas Gleixner
  2021-06-16  0:50 ` [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Yu, Yu-cheng
  53 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-14 20:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra

sanitize_restored_user_xstate() preserves the supervisor states only
when the fx_only argument is zero, which allows unprivileged user space
to put supervisor states back into init state.

Preserve them unconditionally.
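
The difference can be shown on the header bits alone. The following is a sketch, not the kernel function: sanitize_old() and sanitize_new() are hypothetical reductions that operate on a bare xfeatures value, and MASK_SUPERVISOR uses bit 10 (PASID) as a stand-in for whatever xfeatures_mask_supervisor() returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_FPSSE	0x3ULL		/* x87 (bit 0) and SSE (bit 1) */
#define MASK_SUPERVISOR	(1ULL << 10)	/* stand-in: PASID */

/* Behaviour before the fix: fx_only discards the supervisor bits. */
static uint64_t sanitize_old(uint64_t xfeatures, uint64_t user_xfeatures,
			     bool fx_only)
{
	if (fx_only)
		return MASK_FPSSE;

	return xfeatures & (user_xfeatures | MASK_SUPERVISOR);
}

/* Behaviour after the fix: supervisor bits survive in both modes. */
static uint64_t sanitize_new(uint64_t xfeatures, uint64_t user_xfeatures,
			     bool fx_only)
{
	uint64_t mask = fx_only ? MASK_FPSSE : user_xfeatures;

	/* Assumed in this model: the sigframe only describes user features. */
	assert(!(mask & MASK_SUPERVISOR));

	return xfeatures & (mask | MASK_SUPERVISOR);
}
```

With a header value that has FP, SSE, YMM and the supervisor bit set, the old fx_only path returns only FP and SSE, so the supervisor component would be treated as being in init state on the next restore; the new path keeps the supervisor bit.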

Fixes: 5d6b6a6f9b5c ("x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/fpu/signal.c |   26 ++++++++------------------
 1 file changed, 8 insertions(+), 18 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -221,28 +221,18 @@ sanitize_restored_user_xstate(union fpre
 
 	if (use_xsave()) {
 		/*
-		 * Note: we don't need to zero the reserved bits in the
-		 * xstate_header here because we either didn't copy them at all,
-		 * or we checked earlier that they aren't set.
+		 * Clear all feature bits which are not set in
+		 * user_xfeatures and clear all extended features
+		 * for fx_only mode.
 		 */
+		u64 mask = fx_only ? XFEATURE_MASK_FPSSE : user_xfeatures;
 
 		/*
-		 * 'user_xfeatures' might have bits clear which are
-		 * set in header->xfeatures. This represents features that
-		 * were in init state prior to a signal delivery, and need
-		 * to be reset back to the init state.  Clear any user
-		 * feature bits which are set in the kernel buffer to get
-		 * them back to the init state.
-		 *
-		 * Supervisor state is unchanged by input from userspace.
-		 * Ensure supervisor state bits stay set and supervisor
-		 * state is not modified.
+		 * Supervisor state has to be preserved. The sigframe
+		 * restore can only modify user features, i.e. @mask
+		 * cannot contain them.
 		 */
-		if (fx_only)
-			header->xfeatures = XFEATURE_MASK_FPSSE;
-		else
-			header->xfeatures &= user_xfeatures |
-					     xfeatures_mask_supervisor();
+		header->xfeatures &= mask | xfeatures_mask_supervisor();
 	}
 
 	if (use_fxsr()) {

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling
  2021-06-14 15:44 ` [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
@ 2021-06-15 11:07   ` Borislav Petkov
  2021-06-15 12:47     ` Thomas Gleixner
  2021-06-16 22:02   ` Thomas Gleixner
  1 sibling, 1 reply; 87+ messages in thread
From: Borislav Petkov @ 2021-06-15 11:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:10PM +0200, Thomas Gleixner wrote:
>   2) Keeping track of the last copied state in the target buffer and
>      explicitely zero it when there is a feature or alignment gap.

WARNING: 'explicitely' may be misspelled - perhaps 'explicitly'?
#93: 
     explicitely zero it when there is a feature or alignment gap.
     ^^^^^^^^^^^

> -static void fill_gap(struct membuf *to, unsigned *last, unsigned offset)
> +static void copy_feature(bool from_xstate, struct membuf *to, void *xstate,
> +			 void *init_xstate, unsigned int size)
>  {
> -	if (*last >= offset)
> -		return;
> -	membuf_write(to, (void *)&init_fpstate.xsave + *last, offset - *last);
> -	*last = offset;
> -}
> -
> -static void copy_part(struct membuf *to, unsigned *last, unsigned offset,
> -		      unsigned size, void *from)
> -{
> -	fill_gap(to, last, offset);
> -	membuf_write(to, from, size);
> -	*last = offset + size;
> +	membuf_write(to, from_xstate ? xstate : init_xstate, size);

I wonder - since we're making this code more robust anyway - whether
we should add an additional assertion here to check whether that
membuf.left is < size and warn.

It is cheap and having an additional check here would probably catch
some ptrace insanity or so, who knows...

> @@ -1120,41 +1110,68 @@ void copy_xstate_to_kernel(struct membuf
>  	header.xfeatures = xsave->header.xfeatures;
>  	header.xfeatures &= xfeatures_mask_user();
>  
> -	if (header.xfeatures & XFEATURE_MASK_FP)
> -		copy_part(&to, &last, 0, off_mxcsr, &xsave->i387);
> -	if (header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM))
> -		copy_part(&to, &last, off_mxcsr,
> -			  MXCSR_AND_FLAGS_SIZE, &xsave->i387.mxcsr);
> -	if (header.xfeatures & XFEATURE_MASK_FP)
> -		copy_part(&to, &last, offsetof(struct fxregs_state, st_space),
> -			  128, &xsave->i387.st_space);
> -	if (header.xfeatures & XFEATURE_MASK_SSE)
> -		copy_part(&to, &last, xstate_offsets[XFEATURE_SSE],
> -			  256, &xsave->i387.xmm_space);
> -	/*
> -	 * Fill xsave->i387.sw_reserved value for ptrace frame:
> -	 */
> -	copy_part(&to, &last, offsetof(struct fxregs_state, sw_reserved),
> -		  48, xstate_fx_sw_bytes);
> -	/*
> -	 * Copy xregs_state->header:
> -	 */
> -	copy_part(&to, &last, offsetof(struct xregs_state, header),
> -		  sizeof(header), &header);
> +	/* Copy FP state up to MXCSR */
> +	copy_feature(header.xfeatures & XFEATURE_MASK_FP, &to, &xsave->i387,
> +		     &xinit->i387, off_mxcsr);
> +
> +	/* Copy MXCSR when SSE or YMM are set in the feature mask */
> +	copy_feature(header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM),
> +		     &to, &xsave->i387.mxcsr, &xinit->i387.mxcsr,
> +		     MXCSR_AND_FLAGS_SIZE);

Yah, this copies a whopping 8 bytes:

        u32                     mxcsr;          /* MXCSR Register State */
        u32                     mxcsr_mask;     /* MXCSR Mask           */

I know, I know, it was like that before but dammit, that's obscure.

> +	/* Copy the remaining FP state */
> +	copy_feature(header.xfeatures & XFEATURE_MASK_FP,
> +		     &to, &xsave->i387.st_space, &xinit->i387.st_space,
> +		     sizeof(xsave->i387.st_space));
> +
> +	/* Copy the SSE state - shared with YMM */
> +	copy_feature(header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM),
> +		     &to, &xsave->i387.xmm_space, &xinit->i387.xmm_space,
> +		     16 * 16);

Why not

	sizeof(xsave->i387.xmm_space)

?

> +
> +	/* Zero the padding area */
> +	membuf_zero(&to, sizeof(xsave->i387.padding));
> +
> +	/* Copy xsave->i387.sw_reserved */
> +	membuf_write(&to, xstate_fx_sw_bytes, sizeof(xsave->i387.sw_reserved));
> +
> +	/* Copy the user space relevant state of @xsave->header */
> +	membuf_write(&to, &header, sizeof(header));
> +
> +	zerofrom = offsetof(struct xregs_state, extended_state_area);
>  
>  	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
>  		/*
> -		 * Copy only in-use xstates:
> +		 * The ptrace buffer is XSAVE format which is non-compacted.

... "is in XSAVE, non-compacted format."

> +		 * In non-compacted format disabled features still occupy
		     ^			  ^
		    the			  ,

> +		 * state space, but there is no state to copy from in the
> +		 * compacted init_fpstate. The gap tracking will zero this
> +		 * later.
> +		 */
> +		if (!(xfeatures_mask_user() & BIT_ULL(i)))
> +			continue;
> +
> +		/*
> +		 * If there was a feature or alignment gap, zero the space
> +		 * in the destination buffer.
>  		 */
> -		if ((header.xfeatures >> i) & 1) {
> -			void *src = __raw_xsave_addr(xsave, i);
> +		if (zerofrom < xstate_offsets[i])
> +			membuf_zero(&to, xstate_offsets[i] - zerofrom);
>  
> -			copy_part(&to, &last, xstate_offsets[i],
> -				  xstate_sizes[i], src);
> -		}
> +		copy_feature(header.xfeatures & BIT_ULL(i), &to,
> +			     __raw_xsave_addr(xsave, i),
> +			     __raw_xsave_addr(xinit, i),
> +			     xstate_sizes[i]);
>  
> +		/*
> +		 * Keep track of the last copied state in the non-compacted
> +		 * target buffer for gap zeroing.
> +		 */
> +		zerofrom = xstate_offsets[i] + xstate_sizes[i];
>  	}
> -	fill_gap(&to, &last, size);
> +
> +	if (to.left)
> +		membuf_zero(&to, to.left);
>  }

Yah, I can certainly follow what's going on here but, mapping that
compacted buffer to the uncompacted, XSAVE one is certainly making my
head spin.

Yah, FPU state handling is nasty.
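
For readers following along, the gap tracking in the loop above can be
sketched in userspace. This is a simplified model, not the kernel code: it
writes at absolute offsets instead of advancing a membuf cursor, it assumes
every component fits inside the destination, and struct demo_feature and
demo_copy_features() are made-up names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one state component. */
struct demo_feature {
	size_t offset;		/* offset in the non-compacted destination */
	size_t size;		/* size of the component */
	int enabled;		/* component is supported at all */
	const void *src;	/* saved state, or init state if not in use */
};

/*
 * Fill a non-compacted destination of dst_size bytes, starting at 'start'.
 * Gaps between components and the tail of the buffer are zeroed explicitly.
 */
static void demo_copy_features(uint8_t *dst, size_t dst_size, size_t start,
			       const struct demo_feature *feat, int nr)
{
	size_t zerofrom = start;
	int i;

	for (i = 0; i < nr; i++) {
		/* Disabled components are covered by the gap zeroing. */
		if (!feat[i].enabled)
			continue;

		/* Zero a feature or alignment gap before this component. */
		if (zerofrom < feat[i].offset)
			memset(dst + zerofrom, 0, feat[i].offset - zerofrom);

		memcpy(dst + feat[i].offset, feat[i].src, feat[i].size);

		/* Remember where the last copied component ended. */
		zerofrom = feat[i].offset + feat[i].size;
	}

	/* Zero whatever is left after the last component. */
	if (zerofrom < dst_size)
		memset(dst + zerofrom, 0, dst_size - zerofrom);
}
```

A disabled component in the middle simply ends up inside the next gap, so
its slot in the destination reads back as zeroes.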

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling
  2021-06-15 11:07   ` Borislav Petkov
@ 2021-06-15 12:47     ` Thomas Gleixner
  2021-06-15 12:59       ` Borislav Petkov
  0 siblings, 1 reply; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-15 12:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Tue, Jun 15 2021 at 13:07, Borislav Petkov wrote:
> On Mon, Jun 14, 2021 at 05:44:10PM +0200, Thomas Gleixner wrote:
>>   2) Keeping track of the last copied state in the target buffer and
>>      explicitely zero it when there is a feature or alignment gap.
>
> WARNING: 'explicitely' may be misspelled - perhaps 'explicitly'?
> #93: 
>      explicitely zero it when there is a feature or alignment gap.
>       ^^^^^^^^^^^

I'll never learn that. /me goes to write some elisp.

>> +	membuf_write(to, from_xstate ? xstate : init_xstate, size);
>
> I wonder - since we're making this code more robust anyway - whether
> we should add an additional assertion here to check whether that
> membuf.left is < size and warn.

Nah. The wonder of membug_write() is that it does not write behind the
end of the buffer which is designed to allow partial reads w/o checking
a gazillion times for return values etc.
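
As an illustration of that clamping behaviour, here is a minimal userspace
model. The demo_membuf names are invented for the example, and the body is
an approximation of how such a bounded cursor behaves rather than a copy of
the kernel helper.

```c
#include <stddef.h>
#include <string.h>

/* Userspace model of a bounded output cursor (names are illustrative). */
struct demo_membuf {
	char *p;	/* next byte to write */
	size_t left;	/* bytes remaining in the buffer */
};

/* Copy at most 'left' bytes; an oversized request is silently truncated. */
static size_t demo_membuf_write(struct demo_membuf *s, const void *v,
				size_t size)
{
	if (s->left) {
		if (size > s->left)
			size = s->left;
		memcpy(s->p, v, size);
		s->p += size;
		s->left -= size;
	}
	/* Callers may ignore this; a full buffer turns writes into no-ops. */
	return s->left;
}
```

A caller can issue a fixed sequence of writes and look at the remaining
space once at the end, which is what makes partial reads cheap to support.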

>> +
>> +	/* Copy MXCSR when SSE or YMM are set in the feature mask */
>> +	copy_feature(header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM),
>> +		     &to, &xsave->i387.mxcsr, &xinit->i387.mxcsr,
>> +		     MXCSR_AND_FLAGS_SIZE);
>
> Yah, this copies a whopping 8 bytes:
>
>         u32                     mxcsr;          /* MXCSR Register State */
>         u32                     mxcsr_mask;     /* MXCSR Mask           */
>
> I know, I know, it was like that before but dammit, that's obscure.

The point is that this gives us the proper init.mxcsr value when SSE and
YMM are not set.

>> +	/* Copy the remaining FP state */
>> +	copy_feature(header.xfeatures & XFEATURE_MASK_FP,
>> +		     &to, &xsave->i387.st_space, &xinit->i387.st_space,
>> +		     sizeof(xsave->i387.st_space));
>> +
>> +	/* Copy the SSE state - shared with YMM */
>> +	copy_feature(header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM),
>> +		     &to, &xsave->i387.xmm_space, &xinit->i387.xmm_space,
>> +		     16 * 16);
>
> Why not
> 	sizeof(xsave->i387.xmm_space)

because I missed that.

Thanks,

        tglx




* Re: [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling
  2021-06-15 12:47     ` Thomas Gleixner
@ 2021-06-15 12:59       ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-15 12:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Tue, Jun 15, 2021 at 02:47:14PM +0200, Thomas Gleixner wrote:
> I'll never learn that. /me goes to write some elisp.

You could also say, this is a way to keep your reviewers awake. :-P

> Nah. The wonder of membug_write() is that it does not write behind the

membug - Freudian slip, yeah, I know which is the word you've been
writing the most, lately. :-P

> end of the buffer which is designed to allow partial reads w/o checking
> a gazillion times for return values etc.

Yeah, right. I'm just being overly paranoid here, as most of the time.

> The point is that this gives us the proper init.mxcsr value when SSE and
> YMM are not set.

Oh sure - what I mean is, this could be a simple assignment into those
mxcsr and mxcsr_mask things but if you add the whole conditional code
around it, it'll become hard to read too and it will be the only copy
which doesn't call copy_feature() and that would throw off people
looking for the same pattern of calling copy_feature() in that whole
function.

And the destination @to would need casting...

Fget about it. :-)

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch V2 03/52] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")
  2021-06-14 15:44 ` [patch V2 03/52] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
@ 2021-06-15 13:15   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-15 13:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:11PM +0200, Thomas Gleixner wrote:
> This cannot work and it's unclear how that ever made a difference.
> 
> init_fpstate.xsave.header.xfeatures is always 0 so get_xsave_addr() will
> always return a NULL pointer, which will prevent storing the default PKRU
> value in initfp_state.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: Fix subject
> ---
>  arch/x86/kernel/cpu/common.c |    5 -----
>  arch/x86/mm/pkeys.c          |    6 ------
>  2 files changed, 11 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch V2 08/52] x86/fpu: Sanitize xstateregs_set()
  2021-06-14 15:44 ` [patch V2 08/52] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
@ 2021-06-15 17:40   ` Borislav Petkov
  2021-06-15 21:32     ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Borislav Petkov @ 2021-06-15 17:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:16PM +0200, Thomas Gleixner wrote:
> @@ -108,10 +110,10 @@ int xstateregs_set(struct task_struct *t
>  		  const void *kbuf, const void __user *ubuf)
>  {
>  	struct fpu *fpu = &target->thread.fpu;
> -	struct xregs_state *xsave;
> +	struct xregs_state *tmpbuf = NULL;
>  	int ret;
>  
> -	if (!boot_cpu_has(X86_FEATURE_XSAVE))
> +	if (!static_cpu_has(X86_FEATURE_XSAVE))

cpu_feature_enabled() - we're going to use only that thing from now on
for simplicity.

> +	if (!kbuf) {
> +		tmpbuf = vmalloc(count);
> +		if (!tmpbuf)
> +			return -ENOMEM;
> +
> +		if (copy_from_user(tmpbuf, ubuf, count)) {
> +			ret = -EFAULT;
> +			goto out;
> +		}
>  	}
>  
> -	/*
> -	 * mxcsr reserved bits must be masked to zero for security reasons.
> -	 */
> -	xsave->i387.mxcsr &= mxcsr_feature_mask;
> -
> -	/*
> -	 * In case of failure, mark all states as init:
> -	 */
> -	if (ret)
> -		fpstate_init(&fpu->state);
> +	fpu__prepare_write(fpu);

Yikes, why isn't this function called

fpu_invalidate_state(fpu)

?!

As in, what it does...

> @@ -1196,14 +1196,16 @@ int copy_kernel_to_xstate(struct xregs_s
>  	 */
>  	xsave->header.xfeatures |= hdr.xfeatures;
>  
> +	/* mxcsr reserved bits must be masked to zero for historical reasons. */

Wasn't that comment supposed to get some love?

https://lkml.kernel.org/r/87k0n0w3p8.ffs@nanos.tec.linutronix.de

> +	xsave->i387.mxcsr &= mxcsr_feature_mask;
> +

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch V2 08/52] x86/fpu: Sanitize xstateregs_set()
  2021-06-15 17:40   ` Borislav Petkov
@ 2021-06-15 21:32     ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-15 21:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Tue, Jun 15 2021 at 19:40, Borislav Petkov wrote:
> On Mon, Jun 14, 2021 at 05:44:16PM +0200, Thomas Gleixner wrote:
>> @@ -108,10 +110,10 @@ int xstateregs_set(struct task_struct *t
>>  		  const void *kbuf, const void __user *ubuf)
>>  {
>>  	struct fpu *fpu = &target->thread.fpu;
>> -	struct xregs_state *xsave;
>> +	struct xregs_state *tmpbuf = NULL;
>>  	int ret;
>>  
>> -	if (!boot_cpu_has(X86_FEATURE_XSAVE))
>> +	if (!static_cpu_has(X86_FEATURE_XSAVE))
>
> cpu_feature_enabled() - we're going to use only that thing from now on
> for simplicity.

Sure, I just run sed over the set.

>> +	fpu__prepare_write(fpu);
>
> Yikes, why isn't this function called
>
> fpu_invalidate_state(fpu)

Because...

>> +	/* mxcsr reserved bits must be masked to zero for historical reasons. */
>
> Wasn't that comment supposed to get some love?

See the next patch ...


* Re: [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing
  2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (52 preceding siblings ...)
  2021-06-14 20:15 ` [patch] x86/fpu: x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate() Thomas Gleixner
@ 2021-06-16  0:50 ` Yu, Yu-cheng
  53 siblings, 0 replies; 87+ messages in thread
From: Yu, Yu-cheng @ 2021-06-16  0:50 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/14/2021 8:44 AM, Thomas Gleixner wrote:
> The main parts of this series are:
> 
>    - Yet more bug fixes
> 
>    - Simplification and removal/replacement of redundant and/or
>      overengineered code.
> 
>    - Name space cleanup as the existing names were just a permanent source
>      of confusion.
> 
>    - Clear seperation of user ABI and kernel internal state handling.
> 
>    - Removal of PKRU from being XSTATE managed in the kernel because PKRU
>      has to be eagerly restored on context switch and keeping it in sync
>      in the xstate buffer is just pointless overhead and fragile.
> 
>      The kernel still XSAVEs PKRU on context switch but the value in the
>      buffer is not longer used and never restored from the buffer.
> 
>      This still needs to be cleaned up, but the series is already 40+
>      patches large and the cleanup of this is not a functional problem.
> 
>      The functional issues of PKRU management are fully addressed with the
>      series as is.
> 
> It applies on top of
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master
> 
> and is also available via git:
> 
>    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu
> 
> This is a follow up to V1 which can be found here:
> 
>       https://lore.kernel.org/r/20210611161523.508908024@linutronix.de
> 
> Changes vs. V1:
> 
>    - Fix the broken init_fpstate initialization
> 
>    - Make xstate copy to ptrace work correctly
> 
>    - Sanitize the regset functions more and get rid of
>      fpstate_sanitize_xstate().
> 
>    - Addressed review comments
> 
>    - Picked up tags
> 
> Thanks,
> 
> 	tglx
> ---
>   arch/x86/events/intel/lbr.c          |    6
>   arch/x86/include/asm/fpu/internal.h  |  179 +++-------
>   arch/x86/include/asm/fpu/xstate.h    |   70 ++-
>   arch/x86/include/asm/pgtable.h       |   57 ---
>   arch/x86/include/asm/pkeys.h         |    9
>   arch/x86/include/asm/pkru.h          |   62 +++
>   arch/x86/include/asm/processor.h     |    9
>   arch/x86/include/asm/special_insns.h |   14
>   arch/x86/kernel/cpu/common.c         |   29 -
>   arch/x86/kernel/fpu/core.c           |  242 +++++++++----
>   arch/x86/kernel/fpu/init.c           |    4
>   arch/x86/kernel/fpu/regset.c         |  177 ++++-----
>   arch/x86/kernel/fpu/signal.c         |   59 +--
>   arch/x86/kernel/fpu/xstate.c         |  620 ++++++++++++++---------------------
>   arch/x86/kernel/process.c            |   19 +
>   arch/x86/kernel/process_64.c         |   28 +
>   arch/x86/kvm/svm/sev.c               |    1
>   arch/x86/kvm/x86.c                   |   56 +--
>   arch/x86/mm/extable.c                |    2
>   arch/x86/mm/fault.c                  |    2
>   arch/x86/mm/pkeys.c                  |   22 -
>   include/linux/pkeys.h                |    4
>   22 files changed, 818 insertions(+), 853 deletions(-)
> 
> 

I applied shadow stack, IBT on top of this series, and ran routine 
tests.  All passed with one small change to patch #45 (see reply to that 
one).

Thanks,
Yu-cheng


* Re: [patch V2 45/52] x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
  2021-06-14 15:44 ` [patch V2 45/52] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
@ 2021-06-16  0:52   ` Yu, Yu-cheng
  2021-06-16  8:56     ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Yu, Yu-cheng @ 2021-06-16  0:52 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/14/2021 8:44 AM, Thomas Gleixner wrote:
> switch_to(), flush_thread() write the task's PKRU value eagerly so the PKRU
> value of current is always valid in the hardware.
> 
> That means there is no point in restoring PKRU on exit to user or when
> reactivating the task's FPU registers in the signal frame setup path.
> 
> This allows to remove all the xstate buffer updates with PKRU values once
> the PKRU state is stored in thread struct while a task is scheduled out.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>   arch/x86/include/asm/fpu/internal.h |   12 +++++++++++-
>   arch/x86/include/asm/fpu/xstate.h   |   19 +++++++++++++++++++
>   arch/x86/kernel/fpu/core.c          |    2 +-
>   3 files changed, 31 insertions(+), 2 deletions(-)
> 
> --- a/arch/x86/include/asm/fpu/internal.h
> +++ b/arch/x86/include/asm/fpu/internal.h
> @@ -455,7 +455,17 @@ static inline void fpregs_restore_userre
>   		return;
>   
>   	if (!fpregs_state_valid(fpu, cpu)) {
> -		restore_fpregs_from_fpstate(&fpu->state);
> +		/*
> +		 * This restores _all_ xstate which has not been
> +		 * established yet.
> +		 *
> +		 * If PKRU is enabled, then the PKRU value is already
> +		 * correct because it was either set in switch_to() or in
> +		 * flush_thread(). So it is excluded because it might be
> +		 * not up to date in current->thread.fpu.xsave state.
> +		 */
> +		__restore_fpregs_from_fpstate(&fpu->state,
> +					      xfeatures_mask_restore_user());

This needs to be xfeatures_mask_restore_user() | 
xfeatures_mask_supervisor().

>   		fpregs_activate(fpu);
>   		fpu->last_cpu = cpu;
>   	}

[...]


* Re: [patch V2 45/52] x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
  2021-06-16  0:52   ` Yu, Yu-cheng
@ 2021-06-16  8:56     ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-16  8:56 UTC (permalink / raw)
  To: Yu, Yu-cheng, LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Tue, Jun 15 2021 at 17:52, Yu-cheng Yu wrote:
> On 6/14/2021 8:44 AM, Thomas Gleixner wrote:
>> +		 * If PKRU is enabled, then the PKRU value is already
>> +		 * correct because it was either set in switch_to() or in
>> +		 * flush_thread(). So it is excluded because it might be
>> +		 * not up to date in current->thread.fpu.xsave state.
>> +		 */
>> +		__restore_fpregs_from_fpstate(&fpu->state,
>> +					      xfeatures_mask_restore_user());
>
> This needs to be xfeatures_mask_restore_user() | 
> xfeatures_mask_supervisor().

Indeed. Good catch!
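
The mask composition under discussion can be sketched as follows. This is a
userspace illustration under the assumption, taken from the quoted comment,
that the user restore mask excludes PKRU. The demo_ names and bit values are
hypothetical.

```c
#include <stdint.h>

#define DEMO_FEATURE_PKRU	(1ULL << 9)	/* illustrative bit position */

/* User features restored on exit to user: everything except PKRU. */
static uint64_t demo_mask_restore_user(uint64_t user_features)
{
	return user_features & ~DEMO_FEATURE_PKRU;
}

/* Full restore mask: the user part plus all supervisor features. */
static uint64_t demo_mask_restore_all(uint64_t user_features,
				      uint64_t supervisor_features)
{
	return demo_mask_restore_user(user_features) | supervisor_features;
}
```

Passing only the user mask to the restore would leave supervisor components
out of it, which is the omission pointed out above.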

Thanks,

        tglx


* Re: [patch V2 09/52] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate()
  2021-06-14 15:44 ` [patch V2 09/52] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate() Thomas Gleixner
@ 2021-06-16 15:02   ` Borislav Petkov
  2021-06-16 23:51     ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Borislav Petkov @ 2021-06-16 15:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:17PM +0200, Thomas Gleixner wrote:
> Instead of masking out reserved bits, check them and reject the provided
> state as invalid if not zero.
> 
> Suggested-by: Andy Lutomirski <luto@kernel.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: New patch
> ---
>  arch/x86/kernel/fpu/xstate.c |   11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -1166,6 +1166,14 @@ int copy_kernel_to_xstate(struct xregs_s
>  	if (validate_user_xstate_header(&hdr))
>  		return -EINVAL;
>  
> +	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {

Since we're cleaning up this FPU stinking pile - that function needs to
have a verb in the name, something like:

	if (xfeatures_mxcsr_quirk_needed(...))

but that's unrelated to this patch; it's just a note for whoever gets to
it first.

> +		const u32 *mxcsr = kbuf + offsetof(struct fxregs_state, mxcsr);
> +
> +		/* Reserved bits in MXCSR must be zero. */
> +		if (*mxcsr & ~mxcsr_feature_mask)
> +			return -EINVAL;
> +	}

Btw, that function has another

	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {

branch already below the loop.

Should we merge both? Diff ontop of yours:

---
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 5d032c48f39d..30022d3fcd4a 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1172,6 +1172,11 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 		/* Reserved bits in MXCSR must be zero. */
 		if (*mxcsr & ~mxcsr_feature_mask)
 			return -EINVAL;
+
+		offset = offsetof(struct fxregs_state, mxcsr);
+		size = MXCSR_AND_FLAGS_SIZE;
+
+		memcpy(&xsave->i387.mxcsr, kbuf + offset, size);
 	}
 
 	for (i = 0; i < XFEATURE_MAX; i++) {
@@ -1187,12 +1192,6 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 		}
 	}
 
-	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
-		offset = offsetof(struct fxregs_state, mxcsr);
-		size = MXCSR_AND_FLAGS_SIZE;
-		memcpy(&xsave->i387.mxcsr, kbuf + offset, size);
-	}
-
 	/*
 	 * The state that came in from userspace was user-state only.
 	 * Mask all the user states out of 'xfeatures':

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch V2 11/52] x86/fpu: Rewrite xfpregs_set()
  2021-06-14 15:44 ` [patch V2 11/52] x86/fpu: Rewrite xfpregs_set() Thomas Gleixner
@ 2021-06-16 15:22   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-16 15:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:19PM +0200, Thomas Gleixner wrote:
> From: Andy Lutomirski <luto@kernel.org>
> 
> xfpregs_set() was incomprehensible.  Almost all of the complexity was due
> to trying to support nonsensically sized writes or -EFAULT errors that
> would have partially or completely overwritten the destination before
> failing.  Nonsensically sized input would only have been possible using
> PTRACE_SETREGSET on REGSET_XFP.  Fortunately, it appears (based on Debian
> code search results) that no one uses that API at all, let alone with the
> wrong sized buffer.  Failed user access can be handled more cleanly by
> first copying to kernel memory.
> 
> Just rewrite it to require sensible input.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: New patch picked up from Andy
> ---
>  arch/x86/kernel/fpu/regset.c |   40 +++++++++++++++++++++++++---------------
>  1 file changed, 25 insertions(+), 15 deletions(-)
> 
> --- a/arch/x86/kernel/fpu/regset.c
> +++ b/arch/x86/kernel/fpu/regset.c
> @@ -47,30 +47,40 @@ int xfpregs_set(struct task_struct *targ
>  		const void *kbuf, const void __user *ubuf)
>  {
>  	struct fpu *fpu = &target->thread.fpu;
> +	struct user32_fxsr_struct newstate;
>  	int ret;
>  
> -	if (!boot_cpu_has(X86_FEATURE_FXSR))
> +	BUILD_BUG_ON(sizeof(newstate) != sizeof(struct fxregs_state));
> +
> +	if (!static_cpu_has(X86_FEATURE_FXSR))

cpu_feature_enabled

>  		return -ENODEV;
>  
> -	fpu__prepare_write(fpu);
> -	fpstate_sanitize_xstate(fpu);
> +	/* No funny business with partial or oversized writes is permitted. */
> +	if (pos != 0 || count != sizeof(newstate))
> +		return -EINVAL;
>  
>  	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
> -				 &fpu->state.fxsave, 0, -1);
> +				 &newstate, 0, -1);

Let that line stick out.

> +	if (ret)
> +		return ret;
> +
> +	/* Mask invalid MXCSR bits (for historical reasons). */

security reasons became historical reasons huh? :-)

With those fixed:

Reviewed-by: Borislav Petkov <bp@suse.de>

Thx.
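
The "require sensible input, stage in kernel memory, then commit" pattern
described in the commit message can be sketched in userspace like this.
struct demo_regs, demo_copyin() and demo_regs_set() are placeholders for
illustration and do not correspond to real kernel interfaces.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct demo_regs {
	unsigned int words[4];	/* stand-in for the real register layout */
};

/* Hypothetical copy-in helper that can fail, like a faulting user access. */
static int demo_copyin(void *dst, const void *src, size_t count)
{
	if (!src)
		return -EFAULT;
	memcpy(dst, src, count);
	return 0;
}

static int demo_regs_set(struct demo_regs *target, size_t pos, size_t count,
			 const void *src)
{
	struct demo_regs newstate;
	int ret;

	/* Reject partial, offset, or oversized writes up front. */
	if (pos != 0 || count != sizeof(newstate))
		return -EINVAL;

	/* Stage the input; the target is untouched if this fails. */
	ret = demo_copyin(&newstate, src, count);
	if (ret)
		return ret;

	/* Only a complete, successfully copied state is committed. */
	*target = newstate;
	return 0;
}
```

Because the target is written in a single step at the end, a failed copy
cannot leave it partially overwritten.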

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch V2 12/52] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values
  2021-06-14 15:44 ` [patch V2 12/52] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values Thomas Gleixner
@ 2021-06-16 15:31   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-16 15:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:20PM +0200, Thomas Gleixner wrote:
> From: Andy Lutomirski <luto@kernel.org>
> 
> We're not doing anyone any favors by accepting and silently changing an
> invalid MXCSR value supplied via ptrace().  Instead, return -EINVAL on
> invalid input.
> 
> If this breaks something, we can revert it.

Please use passive voice in your commit message: no "we" or "I", etc.

> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: New patch. Picked up from Andy.
> ---
>  arch/x86/kernel/fpu/regset.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> ---
> --- a/arch/x86/kernel/fpu/regset.c
> +++ b/arch/x86/kernel/fpu/regset.c
> @@ -65,8 +65,9 @@ int xfpregs_set(struct task_struct *targ
>  	if (ret)
>  		return ret;
>  
> -	/* Mask invalid MXCSR bits (for historical reasons). */
> -	newstate.mxcsr &= mxcsr_feature_mask;
> +	/* Do not allow an invalid MXCSR value. */
> +	if (newstate.mxcsr & ~mxcsr_feature_mask)
> +		ret = -EINVAL;
>  
>  	fpu__prepare_write(fpu);

With that addressed:

Reviewed-by: Borislav Petkov <bp@suse.de>

Thx.
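
The difference between the old masking and the new rejection can be shown
with two small helpers. They are a userspace sketch with hypothetical names;
the feature mask is passed in as a parameter instead of being read from the
kernel's mxcsr_feature_mask.

```c
#include <errno.h>
#include <stdint.h>

/* Old behaviour: silently clear bits outside the supported mask. */
static uint32_t demo_mxcsr_mask(uint32_t mxcsr, uint32_t feature_mask)
{
	return mxcsr & feature_mask;
}

/* New behaviour: refuse the value if any unsupported bit is set. */
static int demo_mxcsr_check(uint32_t mxcsr, uint32_t feature_mask)
{
	return (mxcsr & ~feature_mask) ? -EINVAL : 0;
}
```

With an example mask of 0xffbf, a value of 0x1f80 passes the check, while a
value with bit 6 or any bit above 15 set is rejected instead of being
quietly altered.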

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch V2 13/52] x86/fpu: Clean up fpregs_set()
  2021-06-14 15:44 ` [patch V2 13/52] x86/fpu: Clean up fpregs_set() Thomas Gleixner
@ 2021-06-16 15:42   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-16 15:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:21PM +0200, Thomas Gleixner wrote:
> From: Andy Lutomirski <luto@kernel.org>
> 
> fpregs_set() had unnecessary complexity to support short or nonzero-offset
> writes and to handle the case in which a copy from userspace overwrites
> some of the target buffer and then fails.  Support for partial writes is
> useless -- just require that the write have offset 0 and the correct size,
> and copy into a temporary kernel buffer to avoid clobbering the state if
> the user access fails.
> 
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: New patch. Picked up from Andy
> ---
>  arch/x86/kernel/fpu/regset.c |   27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
> ---
> --- a/arch/x86/kernel/fpu/regset.c
> +++ b/arch/x86/kernel/fpu/regset.c
> @@ -305,31 +305,32 @@ int fpregs_set(struct task_struct *targe
>  	struct user_i387_ia32_struct env;
>  	int ret;
>  
> -	fpu__prepare_write(fpu);
> -	fpstate_sanitize_xstate(fpu);
> +	/* No funny business with partial or oversized writes is permitted. */
> +	if (pos != 0 || count != sizeof(struct user_i387_ia32_struct))
> +		return -EINVAL;
>  
>  	if (!boot_cpu_has(X86_FEATURE_FPU))

cpu_feature_enabled(), and below too, while you're at it.

>  		return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
>  
> -	if (!boot_cpu_has(X86_FEATURE_FXSR))
> -		return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
> -					  &fpu->state.fsave, 0,
> -					  -1);
> +	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
> +	if (ret)
> +		return ret;
>  
> -	if (pos > 0 || count < sizeof(env))
> -		convert_from_fxsr(&env, target);
> +	fpu__prepare_write(fpu);
>  
> -	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
> -	if (!ret)
> +	if (static_cpu_has(X86_FEATURE_FXSR))
>  		convert_to_fxsr(&target->thread.fpu.state.fxsave, &env);
> +	else
> +		memcpy(&target->thread.fpu.state.fsave, &env, sizeof(env));
>  
>  	/*
> -	 * update the header bit in the xsave header, indicating the
> +	 * Update the header bit in the xsave header, indicating the
>  	 * presence of FP.
>  	 */
> -	if (boot_cpu_has(X86_FEATURE_XSAVE))
> +	if (static_cpu_has(X86_FEATURE_XSAVE))
>  		fpu->state.xsave.header.xfeatures |= XFEATURE_MASK_FP;
> -	return ret;
> +
> +	return 0;
>  }

With that addressed:

Reviewed-by: Borislav Petkov <bp@suse.de>

Thx.
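
For illustration only, the "validate first, copy into a temporary, then
commit" pattern from the commit message can be sketched in plain userspace C.
The names env_state, fake_copyin() and set_state() are hypothetical stand-ins
invented for this sketch; they are not kernel interfaces.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the register block written by the caller. */
struct env_state {
	unsigned int words[27];
};

/*
 * Hypothetical stand-in for a copy from userspace that can fail halfway:
 * it copies 'fail_after' bytes and then reports -EFAULT. A negative
 * 'fail_after' means the copy succeeds completely.
 */
static int fake_copyin(void *dst, const void *src, size_t len, long fail_after)
{
	if (fail_after >= 0 && (size_t)fail_after < len) {
		memcpy(dst, src, (size_t)fail_after);
		return -EFAULT;
	}
	memcpy(dst, src, len);
	return 0;
}

static int set_state(struct env_state *target, size_t pos, size_t count,
		     const void *src, long fail_after)
{
	struct env_state tmp;
	int ret;

	/* Only full-size writes at offset 0 are accepted. */
	if (pos != 0 || count != sizeof(tmp))
		return -EINVAL;

	/* Copy into a temporary first so a fault leaves *target intact. */
	ret = fake_copyin(&tmp, src, count, fail_after);
	if (ret)
		return ret;

	/* Commit only after the whole copy succeeded. */
	*target = tmp;
	return 0;
}
```

With this shape a failing copy can never leave the target half-updated,
which is the property the patch is after.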

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 14/52] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()
  2021-06-14 15:44 ` [patch V2 14/52] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
@ 2021-06-16 16:13   ` Borislav Petkov
  2021-06-17 12:42     ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Borislav Petkov @ 2021-06-16 16:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:22PM +0200, Thomas Gleixner wrote:
> When xsave with init state optimiziation is used then a component's state

				optimization

> @@ -1062,14 +1062,20 @@ static void copy_feature(bool from_xstat
>  	membuf_write(to, from_xstate ? xstate : init_xstate, size);
>  }
>  
> -/*
> - * Convert from kernel XSAVE or XSAVES compacted format to UABI
> - * non-compacted format and copy to a kernel-space ptrace buffer.
> +/**
> + * copy_uabi_xstate_to_membuf - Copy kernel saved xstate to a UABI buffer

If this is what it does, then the function should be called:

copy_xstate_to_uabi_buf()

or so.

"membuf" is only an implementation detail anyway. IMHO.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 32/52] x86/fpu/xstate: Sanitize handling of independent features
  2021-06-14 15:44 ` [patch V2 32/52] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
@ 2021-06-16 20:04   ` Liang, Kan
  2021-06-17  7:15     ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Liang, Kan @ 2021-06-16 20:04 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra



On 6/14/2021 11:44 AM, Thomas Gleixner wrote:
> The copy functions for the independent features are horribly named and the
> supervisor and independent part is just overengineered.
> 
> The point is that the supplied mask has either to be a subset of the
> independent feature or a subset of the task->fpu.xstate managed features.
> 
> Rewrite it so it checks for invalid overlaps of these areas in the
> caller supplied feature mask. Rename it so it follows the new naming
> convention for these operations. Mop up the function documentation.
> 
> This allows using that function for other purposes as well.
> 
> Suggested-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> ---
>   arch/x86/events/intel/lbr.c       |    6 +-
>   arch/x86/include/asm/fpu/xstate.h |    5 +-
>   arch/x86/kernel/fpu/xstate.c      |   93 +++++++++++++++++++-------------------
>   3 files changed, 53 insertions(+), 51 deletions(-)
> 
> --- a/arch/x86/events/intel/lbr.c
> +++ b/arch/x86/events/intel/lbr.c
> @@ -491,7 +491,7 @@ static void intel_pmu_arch_lbr_xrstors(v
>   {
>   	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
>   
> -	copy_kernel_to_independent_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
> +	xrstors_from_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
>   }
>   
>   static __always_inline bool lbr_is_reset_in_cstate(void *ctx)
> @@ -576,7 +576,7 @@ static void intel_pmu_arch_lbr_xsaves(vo
>   {
>   	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
>   
> -	copy_independent_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
> +	xsaves_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
>   }
>   
>   static void __intel_pmu_lbr_save(void *ctx)
> @@ -992,7 +992,7 @@ static void intel_pmu_arch_lbr_read_xsav
>   		intel_pmu_store_lbr(cpuc, NULL);
>   		return;
>   	}
> -	copy_independent_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
> +	xsaves_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
>   
>   	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
>   }

I tested the LBR Xsave feature on an Alder Lake machine. It looks good.

However, when I ran another CPU hotplug test, it gave me an Oops.

$ sudo bash -c 'echo 0 > /sys/devices/system/cpu/cpu1/online'
$ sudo bash -c 'echo 1 > /sys/devices/system/cpu/cpu1/online'

[  108.912963] IRQ 132: no longer affine to CPU1
[  108.913010] IRQ148: set affinity failed(-22).
[  108.913038] IRQ149: set affinity failed(-22).
[  108.913050] IRQ150: set affinity failed(-22).
[  108.917436] smpboot: CPU 1 is now offline
[  111.191655] x86: Booting SMP configuration:
[  111.191661] smpboot: Booting Node 0 Processor 1 APIC 0x1
[  111.200452] BUG: unable to handle page fault for address: ffffffff996a96a0
[  111.207312] #PF: supervisor write access in kernel mode
[  111.207325] #PF: error_code(0x0003) - permissions violation
[  111.207335] PGD 40a02b067 P4D 40a02b067 PUD 40a02c063 PMD 106bdb063 PTE 8000000409ea9161
[  111.218116] Oops: 0003 [#1] PREEMPT SMP NOPTI
[  111.218137] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.13.0-rc5-perf+ #123
[  111.218156] RIP: 0010:fpu__init_cpu_xstate+0x3e/0x130
[  111.218184] Code: 00 00 00 48 8b 05 02 08 66 01 48 85 c0 0f 84 a7 00 00 00 55 48 89 c6 48 89 e5 53 81 e6 00 01 00 00 0f 85 b5 00 00 00 80 e4 fe <48> 89 05 db 07 66 01 9c 58 0f 1f 44 00 00 48 89 c3 fa 66 0f 1f 44
[  111.218195] RSP: 0000:ffffa466401abec0 EFLAGS: 00010002
[  111.273756] RAX: 0000000000000207 RBX: 0000000000000008 RCX: 0000000000000000
[  111.273764] RDX: 0000000000310800 RSI: 0000000000000000 RDI: 0000000080050033
[  111.273772] RBP: ffffa466401abec8 R08: 00000000fffffe00 R09: ffff98729f686078
[  111.273778] R10: ffffffff99826000 R11: ffffa466401abdde R12: 0000000000000001
[  111.273783] R13: 0000000000000000 R14: ffff986f00c20000 R15: 000000000000b000
[  111.273789] FS:  0000000000000000(0000) GS:ffff98729f680000(0000) knlGS:0000000000000000
[  111.273798] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  111.273805] CR2: ffffffff996a96a0 CR3: 000000040a026001 CR4: 0000000000330ea0
[  111.273813] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  111.273817] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[  111.273834] invalid opcode: 0000 [#2] PREEMPT SMP NOPTI
[  111.273843] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.13.0-rc5-perf+ #123
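
For readers decoding the log: the error_code(0x0003) value follows the
architectural x86 page fault error code layout, which is why the kernel
prints "supervisor write access" and "permissions violation". A minimal
decoding sketch in userspace C follows; struct pf_info and
decode_pf_error() are names made up for this illustration, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded low bits of the error code. */
struct pf_info {
	bool protection;	/* bit 0: 1 = protection violation, 0 = page not present */
	bool write;		/* bit 1: 1 = write access, 0 = read access */
	bool user;		/* bit 2: 1 = user mode access, 0 = supervisor */
	bool rsvd;		/* bit 3: reserved bit set in a paging structure */
	bool instr;		/* bit 4: instruction fetch */
};

static struct pf_info decode_pf_error(uint32_t code)
{
	struct pf_info info = {
		.protection = code & (1u << 0),
		.write      = code & (1u << 1),
		.user       = code & (1u << 2),
		.rsvd       = code & (1u << 3),
		.instr      = code & (1u << 4),
	};

	return info;
}
```

For 0x0003 this yields protection and write set with user clear, i.e. a
supervisor mode write to a page that is mapped but not writable.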

Thanks,
Kan

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling
  2021-06-14 15:44 ` [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
  2021-06-15 11:07   ` Borislav Petkov
@ 2021-06-16 22:02   ` Thomas Gleixner
  1 sibling, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-16 22:02 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14 2021 at 17:44, Thomas Gleixner wrote:
> The gap handling in copy_xstate_to_kernel() is wrong in two aspects when
> XSAVES is in use.
>
>   1) xstate.i387.xmm_space is only copied when the SSE feature bit is
>      set. This is not correct because YMM (AVX) shares the XMM space and
>      that state must also be copied if only the YMM feature bit is set,
>      as is already done for MXCSR.

Thinking more about it. That'd be broken in hardware. When YMM is not in
init state then SSE cannot be in init state.

Of course you can use xsave, then clear the SSE bit and XRSTOR which
blows away the SSE state. Or clear the bit in the sigframe. So copying
it over is silly as XRSTOR will ignore it anyway. If so, user space can
keep the pieces. Let me take that out.

This stuff drives me nuts.
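
A simplified model of the XRSTOR behaviour referred to above, as a sketch
only: for each component selected by the requested feature bitmap, the
hardware loads it from the buffer when its header bit is set and puts it
into init state otherwise. The enum and xrstor_component_action() are
hypothetical names for this illustration, and the model deliberately
ignores the MXCSR special case and the compacted format.

```c
#include <stdint.h>

#define XFEATURE_MASK_FP	(1ULL << 0)
#define XFEATURE_MASK_SSE	(1ULL << 1)
#define XFEATURE_MASK_YMM	(1ULL << 2)

enum xrstor_action {
	XRSTOR_UNTOUCHED,	/* component not requested, registers keep their value */
	XRSTOR_INIT,		/* requested but header bit clear: init state */
	XRSTOR_LOAD,		/* requested and header bit set: loaded from buffer */
};

/*
 * rfbm:      requested feature bitmap for this XRSTOR
 * xstate_bv: feature bits found in the buffer's header
 * feature:   the single component being asked about
 */
static enum xrstor_action xrstor_component_action(uint64_t rfbm,
						  uint64_t xstate_bv,
						  uint64_t feature)
{
	if (!(rfbm & feature))
		return XRSTOR_UNTOUCHED;

	return (xstate_bv & feature) ? XRSTOR_LOAD : XRSTOR_INIT;
}
```

With only the YMM bit set in the header, the SSE component ends up in init
state regardless of what the buffer holds for it, so copying it is pointless.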

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 09/52] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate()
  2021-06-16 15:02   ` Borislav Petkov
@ 2021-06-16 23:51     ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-16 23:51 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Wed, Jun 16 2021 at 17:02, Borislav Petkov wrote:
> On Mon, Jun 14, 2021 at 05:44:17PM +0200, Thomas Gleixner wrote:
>> Instead of masking out reserved bits, check them and reject the provided
>> state as invalid if not zero.
>> 
>> Suggested-by: Andy Lutomirski <luto@kernel.org>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>> V2: New patch
>> ---
>>  arch/x86/kernel/fpu/xstate.c |   11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>> 
>> --- a/arch/x86/kernel/fpu/xstate.c
>> +++ b/arch/x86/kernel/fpu/xstate.c
>> @@ -1166,6 +1166,14 @@ int copy_kernel_to_xstate(struct xregs_s
>>  	if (validate_user_xstate_header(&hdr))
>>  		return -EINVAL;
>>  
>> +	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
>
> Since we're cleaning up this FPU stinking pile - that function needs to
> have a verb in the name, something like:
>
> 	if (xfeatures_mxcsr_quirk_needed(...))
>
> but that's unrelated to here and as a note to whoever gets to get to it
> first.
>
>> +		const u32 *mxcsr = kbuf + offsetof(struct fxregs_state, mxcsr);
>> +
>> +		/* Reserved bits in MXCSR must be zero. */
>> +		if (*mxcsr & ~mxcsr_feature_mask)
>> +			return -EINVAL;
>> +	}
>
> Btw, that function has another
>
> 	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
>
> branch already below the loop.
>
> Should we merge both? Diff ontop of yours:

No, because the first usage is wrong. I found that while looking through
this stuff again. Sigh...
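
For illustration, the difference between masking and rejecting reserved
MXCSR bits can be sketched in userspace C. The mask value below is only an
assumed example (the kernel determines mxcsr_feature_mask at boot), and both
helper names are made up for this sketch.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed example value; the real mask is probed from the hardware. */
#define EXAMPLE_MXCSR_FEATURE_MASK	0x0000ffbfu

/* Masking approach: silently drop reserved bits and carry on. */
static uint32_t mxcsr_mask_reserved(uint32_t mxcsr, uint32_t feature_mask)
{
	return mxcsr & feature_mask;
}

/* Rejecting approach: refuse state that has any reserved bit set. */
static int mxcsr_validate(uint32_t mxcsr, uint32_t feature_mask)
{
	if (mxcsr & ~feature_mask)
		return -EINVAL;

	return 0;
}
```

The rejecting variant makes invalid input visible to the caller instead of
quietly turning it into different, valid-looking state.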

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 32/52] x86/fpu/xstate: Sanitize handling of independent features
  2021-06-16 20:04   ` Liang, Kan
@ 2021-06-17  7:15     ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-17  7:15 UTC (permalink / raw)
  To: Liang, Kan, LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra

On Wed, Jun 16 2021 at 16:04, Kan Liang wrote:
> On 6/14/2021 11:44 AM, Thomas Gleixner wrote:
>>   	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
>>   }
>
> I tested the LBR Xsave feature on an Alder Lake machine. It looks good.
>
> However, when I ran another CPU hotplug test, it gave me an Oops.

Sigh. Yes. I know where this comes from. Brilliant crap that.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 15/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in xfpregs_get()
  2021-06-14 15:44 ` [patch V2 15/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in xfpregs_get() Thomas Gleixner
@ 2021-06-17  8:59   ` Borislav Petkov
  2021-06-18 11:19     ` Borislav Petkov
  0 siblings, 1 reply; 87+ messages in thread
From: Borislav Petkov @ 2021-06-17  8:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:23PM +0200, Thomas Gleixner wrote:
> Use the new functionality of copy_uabi_xstate_to_membuf() to retrieve the
> FX state when XSAVE* is in use. This avoids to overwrite the FPU state

					avoids overwriting...

> buffer with fpstate_sanitize_xstate() which is error prone and duplicated
> code.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: New patch
> ---
>  arch/x86/kernel/fpu/regset.c |   11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> --- a/arch/x86/kernel/fpu/regset.c
> +++ b/arch/x86/kernel/fpu/regset.c
> @@ -33,13 +33,18 @@ int xfpregs_get(struct task_struct *targ

So AFAICT, this thing is called by PTRACE_GETFPREGS but looking at ltp:

$ git grep PTRACE_GETFPREGS
$

so this is used - if at all used - by some super duper old binaries
somewhere.

manpage says "PTRACE_GETREGS and PTRACE_GETFPREGS are not present on all
architectures." which could explain why. I wonder if we should add some
stupid test cases so that we can at least exercise this...

>  	struct fpu *fpu = &target->thread.fpu;
>  
> -	if (!boot_cpu_has(X86_FEATURE_FXSR))
> +	if (!static_cpu_has(X86_FEATURE_FXSR))

cpu_feature_enabled

>  		return -ENODEV;
>  
>  	fpu__prepare_read(fpu);
> -	fpstate_sanitize_xstate(fpu);
>  
> -	return membuf_write(&to, &fpu->state.fxsave, sizeof(struct fxregs_state));
> +	if (!use_xsave()) {
> +		return membuf_write(&to, &fpu->state.fxsave,
> +				    sizeof(fpu->state.fxsave));
> +	}
> +
> +	copy_uabi_xstate_to_membuf(to, &fpu->state.xsave, XSTATE_COPY_FX);
> +	return 0;

With the above nitpicks addressed:

Reviewed-by: Borislav Petkov <bp@suse.de>

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 16/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in fpregs_get()
  2021-06-14 15:44 ` [patch V2 16/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in fpregs_get() Thomas Gleixner
@ 2021-06-17 11:50   ` Borislav Petkov
  2021-06-17 12:43     ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Borislav Petkov @ 2021-06-17 11:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:24PM +0200, Thomas Gleixner wrote:
> Use the new functionality of copy_uabi_xstate_to_membuf() to retrieve the
> FX state when XSAVE* is in use. This avoids overwriting the FPU state
> buffer with fpstate_sanitize_xstate() which is error prone and duplicated
> code.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: New patch
> ---
>  arch/x86/kernel/fpu/regset.c |   30 ++++++++++++++++++++----------
>  1 file changed, 20 insertions(+), 10 deletions(-)
> 
> --- a/arch/x86/kernel/fpu/regset.c
> +++ b/arch/x86/kernel/fpu/regset.c
> @@ -211,10 +211,10 @@ static inline u32 twd_fxsr_to_i387(struc
>   * FXSR floating point environment conversions.
>   */
>  
> -void
> -convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
> +static void __convert_from_fxsr(struct user_i387_ia32_struct *env,
> +				struct task_struct *tsk,
> +				struct fxregs_state *fxsave)
>  {
> -	struct fxregs_state *fxsave = &tsk->thread.fpu.state.fxsave;
>  	struct _fpreg *to = (struct _fpreg *) &env->st_space[0];
>  	struct _fpxreg *from = (struct _fpxreg *) &fxsave->st_space[0];
>  	int i;
> @@ -248,6 +248,12 @@ convert_from_fxsr(struct user_i387_ia32_
>  		memcpy(&to[i], &from[i], sizeof(to[0]));
>  }
>  
> +void
> +convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
> +{
> +	__convert_from_fxsr(env, tsk, &tsk->thread.fpu.state.fxsave);
> +}
> +
>  void convert_to_fxsr(struct fxregs_state *fxsave,
>  		     const struct user_i387_ia32_struct *env)
>  
> @@ -280,25 +286,29 @@ int fpregs_get(struct task_struct *targe
>  {
>  	struct fpu *fpu = &target->thread.fpu;
>  	struct user_i387_ia32_struct env;
> +	struct fxregs_state fxsave, *fx;
>  
>  	fpu__prepare_read(fpu);
>  
> -	if (!boot_cpu_has(X86_FEATURE_FPU))
> +	if (!static_cpu_has(X86_FEATURE_FPU))
>  		return fpregs_soft_get(target, regset, to);
>  
> -	if (!boot_cpu_has(X86_FEATURE_FXSR)) {
> +	if (!static_cpu_has(X86_FEATURE_FXSR)) {

both: cpu_feature_enabled

With that:

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 18/52] x86/fpu: Get rid of using_compacted_format()
  2021-06-14 15:44 ` [patch V2 18/52] x86/fpu: Get rid of using_compacted_format() Thomas Gleixner
@ 2021-06-17 11:59   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-17 11:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:26PM +0200, Thomas Gleixner wrote:
> @@ -590,9 +576,9 @@ static void do_extra_xstate_size_checks(
>  		check_xstate_against_struct(i);
>  		/*
>  		 * Supervisor state components can be managed only by
> -		 * XSAVES, which is compacted-format only.
> +		 * XSAVES.
>  		 */
> -		if (!using_compacted_format())
> +		if (!static_cpu_has(X86_FEATURE_XSAVES))
>  			XSTATE_WARN_ON(xfeature_is_supervisor(i));
>  
>  		/* Align from the end of the previous feature */
> @@ -602,9 +588,9 @@ static void do_extra_xstate_size_checks(
>  		 * The offset of a given state in the non-compacted
>  		 * format is given to us in a CPUID leaf.  We check
>  		 * them for being ordered (increasing offsets) in
> -		 * setup_xstate_features().
> +		 * setup_xstate_features(). XSAVES uses compacted format.
>  		 */
> -		if (!using_compacted_format())
> +		if (!static_cpu_has(X86_FEATURE_XSAVES))

both: cpu_feature_enabled()

and yes, I have complained about that one in the past so good riddance.

Reviewed-by: Borislav Petkov <bp@suse.de>

Btw, that patch looks like it could be moved to the beginning of the
patchset, right after the urgent fixes as it is an independent cleanup.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 19/52] x86/kvm: Avoid looking up PKRU in XSAVE buffer
  2021-06-14 15:44 ` [patch V2 19/52] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
@ 2021-06-17 12:09   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-17 12:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:27PM +0200, Thomas Gleixner wrote:
> @@ -4632,18 +4633,20 @@ static void load_xsave(struct kvm_vcpu *
>  	 */
>  	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
>  	while (valid) {
> +		u32 size, offset, ecx, edx;
>  		u64 xfeature_mask = valid & -valid;
>  		int xfeature_nr = fls64(xfeature_mask) - 1;
> -		void *dest = get_xsave_addr(xsave, xfeature_nr);
>  
> -		if (dest) {
> -			u32 size, offset, ecx, edx;
> -			cpuid_count(XSTATE_CPUID, xfeature_nr,
> -				    &size, &offset, &ecx, &edx);
> -			if (xfeature_nr == XFEATURE_PKRU)
> -				memcpy(&vcpu->arch.pkru, src + offset,
> -				       sizeof(vcpu->arch.pkru));
> -			else
> +		cpuid_count(XSTATE_CPUID, xfeature_nr,
> +			    &size, &offset, &ecx, &edx);
> +
> +		if (xfeature_nr == XFEATURE_PKRU) {
> +			memcpy(&vcpu->arch.pkru, src + offset,
> +			       sizeof(vcpu->arch.pkru));
> +		} else {
> +			void *dest = get_xsave_addr(xsave, xfeature_nr);
> +

With that superfluous newline removed:

Reviewed-by: Borislav Petkov <bp@suse.de>

> +			if (dest)
>  				memcpy(dest, src + offset, size);
>  		}
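
For reference, the "valid & -valid" idiom in the loop above isolates the
lowest set bit of the mask, so features are visited lowest number first. A
standalone userspace sketch of that iteration follows; lowest_bit_index()
and collect_features() are hypothetical helpers, with a plain shift loop
standing in for the kernel's fls64().

```c
#include <stdint.h>

/* Index of the lowest set bit; 'mask' must be non-zero. */
static int lowest_bit_index(uint64_t mask)
{
	uint64_t lowest = mask & -mask;	/* isolate the lowest set bit */
	int nr = 0;

	while (lowest >>= 1)
		nr++;

	return nr;
}

/*
 * Store the feature numbers set in 'valid' into 'out', lowest first.
 * Returns the number of entries written, at most 'max'.
 */
static int collect_features(uint64_t valid, int *out, int max)
{
	int n = 0;

	while (valid && n < max) {
		uint64_t bit = valid & -valid;

		out[n++] = lowest_bit_index(bit);
		valid -= bit;	/* clear the bit just handled */
	}

	return n;
}
```

Unary minus on an unsigned 64-bit value is well defined in C, which is what
makes the idiom safe for all mask values including bit 63.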

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 20/52] x86/fpu: Cleanup arch_set_user_pkey_access()
  2021-06-14 15:44 ` [patch V2 20/52] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
@ 2021-06-17 12:22   ` Borislav Petkov
  2021-06-17 12:49     ` Thomas Gleixner
  0 siblings, 1 reply; 87+ messages in thread
From: Borislav Petkov @ 2021-06-17 12:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:28PM +0200, Thomas Gleixner wrote:
> The function is having a sanity check with a WARN_ON_ONCE() but happily

"The function does a sanity check..."

> proceeds when the pkey argument is out of range.
> 
> Clean it up.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/kernel/fpu/xstate.c |   11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -887,11 +887,10 @@ EXPORT_SYMBOL_GPL(get_xsave_addr);
>   * rights for @pkey to @init_val.
>   */
>  int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> -		unsigned long init_val)
> +			      unsigned long init_val)
>  {
> -	u32 old_pkru;
> -	int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
> -	u32 new_pkru_bits = 0;
> +	u32 old_pkru, new_pkru_bits = 0;
> +	int pkey_shift;
>  
>  	/*
>  	 * This check implies XSAVE support.  OSPKE only gets

There's a boot_cpu_has() check

<--- here

Might wanna convert it to cpu_feature_enabled(), while at it.

> @@ -905,7 +904,8 @@ int arch_set_user_pkey_access(struct tas
>  	 * values originating from in-kernel users.  Complain
>  	 * if a bad value is observed.
>  	 */
> -	WARN_ON_ONCE(pkey >= arch_max_pkey());
> +	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
> +		return -EINVAL;
>  
>  	/* Set the bits we need in PKRU:  */
>  	if (init_val & PKEY_DISABLE_ACCESS)
> @@ -914,6 +914,7 @@ int arch_set_user_pkey_access(struct tas
>  		new_pkru_bits |= PKRU_WD_BIT;
>  
>  	/* Shift the bits in to the correct place in PKRU for pkey: */
> +	pkey_shift = pkey * PKRU_BITS_PER_PKEY;
>  	new_pkru_bits <<= pkey_shift;
>  
>  	/* Get old PKRU and mask off any old bits in place: */

With those addressed:

Reviewed-by: Borislav Petkov <bp@suse.de>
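
As a sketch of the arithmetic in this function: each pkey owns two adjacent
PKRU bits (access disable and write disable), so the new bits are built,
shifted to the key's slot and merged into the old value. The userspace
version below returns early on a bad key like the patched code does.
pkru_update() and EXAMPLE_MAX_PKEY are assumptions for this illustration
(the kernel uses arch_max_pkey()), and the negative key check is an addition
of the sketch.

```c
#include <errno.h>
#include <stdint.h>

#define PKRU_AD_BIT		0x1u
#define PKRU_WD_BIT		0x2u
#define PKRU_BITS_PER_PKEY	2
#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2
#define EXAMPLE_MAX_PKEY	16	/* assumed limit for this sketch */

static int pkru_update(uint32_t old_pkru, int pkey, unsigned long init_val,
		       uint32_t *new_pkru)
{
	uint32_t bits = 0;
	int shift;

	/* Bail out before the key is used to compute a shift. */
	if (pkey < 0 || pkey >= EXAMPLE_MAX_PKEY)
		return -EINVAL;

	if (init_val & PKEY_DISABLE_ACCESS)
		bits |= PKRU_AD_BIT;
	if (init_val & PKEY_DISABLE_WRITE)
		bits |= PKRU_WD_BIT;

	/* Move the bits to this key's slot. */
	shift = pkey * PKRU_BITS_PER_PKEY;
	bits <<= shift;

	/* Drop the key's old bits, then merge in the new ones. */
	old_pkru &= ~((PKRU_AD_BIT | PKRU_WD_BIT) << shift);
	*new_pkru = old_pkru | bits;

	return 0;
}
```

Computing the shift only after the range check mirrors the reordering in
the patch: an out-of-range key never reaches the shift.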

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 21/52] x86/fpu: Get rid of copy_supervisor_to_kernel()
  2021-06-14 15:44 ` [patch V2 21/52] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
@ 2021-06-17 12:41   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-17 12:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:29PM +0200, Thomas Gleixner wrote:
> If the fast path of restoring the FPU state on sigreturn fails or is not
> taken and the current task's FPU is active then the FPU has to be
> deactivated for the slow path to allow a safe update of the task's FPU
> memory state.
> 
> With supervisor states enabled, this requires saving the supervisor state
> in the memory state first. Supervisor states require XSAVES so saving only
> the supervisor state requires reshuffling the memory buffer because XSAVES
> uses the compacted format and therefore stores the supervisor states at the
> beginning of the memory state. That's just an overengineered optimization.
> 
> Get rid of it and save the full state for this case.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/include/asm/fpu/xstate.h |    1 
>  arch/x86/kernel/fpu/signal.c      |   13 +++++---
>  arch/x86/kernel/fpu/xstate.c      |   55 --------------------------------------
>  3 files changed, 8 insertions(+), 61 deletions(-)

Simplification? To the FPU stinking pile of turds?

Hell yeah!

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 14/52] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()
  2021-06-16 16:13   ` Borislav Petkov
@ 2021-06-17 12:42     ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-17 12:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Wed, Jun 16 2021 at 18:13, Borislav Petkov wrote:

> On Mon, Jun 14, 2021 at 05:44:22PM +0200, Thomas Gleixner wrote:
>> When xsave with init state optimiziation is used then a component's state
>
> 				optimization
>
>> @@ -1062,14 +1062,20 @@ static void copy_feature(bool from_xstat
>>  	membuf_write(to, from_xstate ? xstate : init_xstate, size);
>>  }
>>  
>> -/*
>> - * Convert from kernel XSAVE or XSAVES compacted format to UABI
>> - * non-compacted format and copy to a kernel-space ptrace buffer.
>> +/**
>> + * copy_uabi_xstate_to_membuf - Copy kernel saved xstate to a UABI buffer
>
> If this is what it does, then the function should be called:
>
> copy_xstate_to_uabi_buf()
>
> or so.
>
> "membuf" is only an implementation detail anyway. IMHO.

Yes. Makes sense. Fixed.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 16/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in fpregs_get()
  2021-06-17 11:50   ` Borislav Petkov
@ 2021-06-17 12:43     ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-17 12:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Thu, Jun 17 2021 at 13:50, Borislav Petkov wrote:
>> -	if (!boot_cpu_has(X86_FEATURE_FXSR)) {
>> +	if (!static_cpu_has(X86_FEATURE_FXSR)) {
>
> both: cpu_feature_enabled

I fixed up the whole series already :)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 22/52] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs()
  2021-06-14 15:44 ` [patch V2 22/52] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
@ 2021-06-17 12:48   ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-17 12:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Mon, Jun 14, 2021 at 05:44:30PM +0200, Thomas Gleixner wrote:
> The functions for xsave[s]/xrstor[s] operations are horribly named and
> a permanent source of confusion.
> 
> Rename:
> 	copy_xregs_to_kernel() to xsave_to_kernel()
> 	copy_kernel_to_xregs() to xrstor_from_kernel()

Yap, better.

I wonder if simply calling them xsave() and xrstor() won't make it
even easier. The to/from kernel thing is kinda weird. If we need
to differentiate where we're saving, we can call the user variants
"to/from_user" instead, like the copy_to_/from_user things...

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 20/52] x86/fpu: Cleanup arch_set_user_pkey_access()
  2021-06-17 12:22   ` Borislav Petkov
@ 2021-06-17 12:49     ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-17 12:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Thu, Jun 17 2021 at 14:22, Borislav Petkov wrote:
> On Mon, Jun 14, 2021 at 05:44:28PM +0200, Thomas Gleixner wrote:
>>  	/*
>>  	 * This check implies XSAVE support.  OSPKE only gets
>
> There's a boot_cpu_has() check
>
> <--- here
>
> Might wanna convert it to cpu_feature_enabled(), while at it.

There's a later patch which cleans up the whole cpu feature mess of
OSPKE. That takes care of it.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 15/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in xfpregs_get()
  2021-06-17  8:59   ` Borislav Petkov
@ 2021-06-18 11:19     ` Borislav Petkov
  2021-06-18 13:25       ` [PATCH] selftests/x86/ptrace_syscall: Add a PTRACE_GETFPREGS test Borislav Petkov
  0 siblings, 1 reply; 87+ messages in thread
From: Borislav Petkov @ 2021-06-18 11:19 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Thu, Jun 17, 2021 at 10:59:55AM +0200, Borislav Petkov wrote:
> manpage says "PTRACE_GETREGS and PTRACE_GETFPREGS are not present on all
> architectures." which could explain why. I wonder if we should add some
> stupid test cases so that we can at least exercise this...

How's this rough thing?

What I'd do in the final version is verify the values we preset in
fpstate_init_fstate() and in fpstate_init_fxstate() against what this test
reads and this way we'll catch any changes in that area.

diff --git a/tools/testing/selftests/x86/ptrace_syscall.c b/tools/testing/selftests/x86/ptrace_syscall.c
index 12aaa063196e..ac73cca7300f 100644
--- a/tools/testing/selftests/x86/ptrace_syscall.c
+++ b/tools/testing/selftests/x86/ptrace_syscall.c
@@ -407,7 +407,62 @@ static void test_restart_under_ptrace(void)
 		err(1, "waitpid");
 }
 
-int main()
+static void test_ptrace_a_bit(void)
+{
+	struct user_fpregs_struct regs;
+	int status;
+	pid_t chld;
+
+	printf("[RUN]\tTest some ptrace(2) requests\n");
+
+	chld = fork();
+	if (chld < 0)
+		err(1, "fork");
+
+	if (!chld) {
+		if (ptrace(PTRACE_TRACEME, 0, 0, 0) != 0)
+			err(1, "PTRACE_TRACEME");
+
+		pid_t pid = getpid(), tid = syscall(SYS_gettid);
+
+		printf("\tChild will take a nap until signaled\n");
+		setsigign(SIGUSR1, SA_RESTART);
+		syscall(SYS_tgkill, pid, tid, SIGSTOP);
+
+		syscall(SYS_pause, 0, 0, 0, 0, 0, 0);
+		_exit(0);
+	}
+
+	/* Wait for SIGSTOP. */
+	if (waitpid(chld, &status, 0) != chld || !WIFSTOPPED(status))
+		err(1, "waitpid");
+
+	printf("[RUN]\tGETFPREGS\n");
+	if (ptrace(PTRACE_GETFPREGS, chld, 0, &regs) != 0)
+		err(1, "PTRACE_GETFPREGS");
+
+#ifdef __i386__
+	printf("__i386__\n");
+	printf("cwd: 0x%lx, swd: 0x%lx\n", regs.cwd, regs.swd);
+	printf("twd: 0x%lx, fip: 0x%lx\n", regs.twd, regs.fip);
+	printf("fcs: 0x%lx, foo: 0x%lx\n", regs.fcs, regs.foo);
+	printf("fos: 0x%lx, st_space[0]: 0x%lx\n", regs.fos, regs.st_space[0]);
+#else
+	printf("__x86_64__\n");
+	printf("cwd: 0x%x, swd: 0x%x\n", regs.cwd, regs.swd);
+	printf("ftw: 0x%x, fop: 0x%x\n", regs.ftw, regs.fop);
+	printf("rip: 0x%llx, rdp: 0x%llx\n", regs.rip, regs.rdp);
+	/* Yeah, it is mxcr_mask - sys/user.h has a typo :-) */
+	printf("mxcsr: 0x%x, mxcsr_mask: 0x%x\n", regs.mxcsr, regs.mxcr_mask);
+#endif
+
+	/* Kill it. */
+	kill(chld, SIGKILL);
+	if (waitpid(chld, &status, 0) != chld)
+		err(1, "waitpid");
+}
+
+int main(void)
 {
 	printf("[RUN]\tCheck int80 return regs\n");
 	test_sys32_regs(do_full_int80);
@@ -426,5 +481,7 @@ int main()
 
 	test_restart_under_ptrace();
 
+	test_ptrace_a_bit();
+
 	return 0;
 }

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: [patch V2 34/52] x86/fpu: Differentiate "copy" versus "move" of fpregs
  2021-06-14 15:44 ` [patch V2 34/52] x86/fpu: Differentiate "copy" versus "move" of fpregs Thomas Gleixner
@ 2021-06-18 12:21   ` Thomas Gleixner
  0 siblings, 0 replies; 87+ messages in thread
From: Thomas Gleixner @ 2021-06-18 12:21 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Dave,

On Mon, Jun 14 2021 at 17:44, Thomas Gleixner wrote:
> It would be simpler to just remove this FNSAVE optimization: Always save
> and restore in the FNSAVE case.  This may incur the cost of the restore
> even in cases where the restored state is never used.  But, it would only
> hurt painfully ancient (>20 years old) processors.

after staring more at that, I think it's the right thing to do.

Thanks,

        tglx
---
Subject: x86/fpu: Get rid of the FNSAVE optimization
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 18 Jun 2021 13:48:18 +0200

The FNSAVE support requires conditionals in quite a few call paths because
FNSAVE reinitializes the FPU hardware. If the save has to preserve the FPU
register state then the caller has to conditionally restore it from memory
when FNSAVE is in use.

This also requires a conditional in context switch because the restore
avoidance optimization cannot work with FNSAVE. As this only affects CPUs
that are 20+ years old there is really no reason to keep this optimization
effective for FNSAVE. It's about time to not optimize for antiques anymore.

Just unconditionally FRSTOR the saved content to the registers and clean up
the conditionals all over the place.

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V3: New patch
---
 arch/x86/include/asm/fpu/internal.h |   17 +++++++----
 arch/x86/kernel/fpu/core.c          |   54 +++++++++++++++---------------------
 2 files changed, 34 insertions(+), 37 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -375,7 +375,7 @@ static inline int os_xrstor_safe(struct
 	return err;
 }
 
-extern int save_fpregs_to_fpstate(struct fpu *fpu);
+extern void save_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
@@ -507,12 +507,17 @@ static inline void __fpregs_load_activat
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!save_fpregs_to_fpstate(old_fpu))
-			old_fpu->last_cpu = -1;
-		else
-			old_fpu->last_cpu = cpu;
+		save_fpregs_to_fpstate(old_fpu);
+		/*
+		 * The save operation preserved register state, so the
+		 * fpu_fpregs_owner_ctx is still @old_fpu. Store the
+		 * current CPU number in @old_fpu, so the next return
+		 * to user space can avoid the FPU register restore
+		 * when it returns on the same CPU and still owns the
+		 * context.
+		 */
+		old_fpu->last_cpu = cpu;
 
-		/* But leave fpu_fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
 	}
 }
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -83,16 +83,20 @@ bool irq_fpu_usable(void)
 EXPORT_SYMBOL(irq_fpu_usable);
 
 /*
- * These must be called with preempt disabled. Returns
- * 'true' if the FPU state is still intact and we can
- * keep registers active.
+ * Save the FPU register state in fpu->state. The register state is
+ * preserved.
  *
- * The legacy FNSAVE instruction cleared all FPU state
- * unconditionally, so registers are essentially destroyed.
- * Modern FPU state can be kept in registers, if there are
- * no pending FP exceptions.
+ * Must be called with fpregs_lock() held.
+ *
+ * The legacy FNSAVE instruction clears all FPU state unconditionally, so
+ * register state has to be reloaded. That might be a pointless exercise
+ * when the FPU is going to be used by another task right after that. But
+ * this only affects 20+ years old 32bit systems and avoids conditionals all
+ * over the place.
+ *
+ * FXSAVE and all XSAVE variants preserve the FPU register state.
  */
-int save_fpregs_to_fpstate(struct fpu *fpu)
+void save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		os_xsave(&fpu->state.xsave);
@@ -103,21 +107,20 @@ int save_fpregs_to_fpstate(struct fpu *f
 		 */
 		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
 			fpu->avx512_timestamp = jiffies;
-		return 1;
+		return;
 	}
 
 	if (likely(use_fxsr())) {
 		fxsave(&fpu->state.fxsave);
-		return 1;
+		return;
 	}
 
 	/*
 	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
-	 * so we have to mark them inactive:
+	 * so we have to reload them from the memory state.
 	 */
 	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
-
-	return 0;
+	frstor(&fpu->state.fsave);
 }
 EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
@@ -133,10 +136,6 @@ void kernel_fpu_begin_mask(unsigned int
 	if (!(current->flags & PF_KTHREAD) &&
 	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		set_thread_flag(TIF_NEED_FPU_LOAD);
-		/*
-		 * Ignore return value -- we don't care if reg state
-		 * is clobbered.
-		 */
 		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
@@ -171,11 +170,8 @@ void fpu__save(struct fpu *fpu)
 	fpregs_lock();
 	trace_x86_fpu_before_save(fpu);
 
-	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!save_fpregs_to_fpstate(fpu)) {
-			copy_kernel_to_fpregs(&fpu->state);
-		}
-	}
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		save_fpregs_to_fpstate(fpu);
 
 	trace_x86_fpu_after_save(fpu);
 	fpregs_unlock();
@@ -244,20 +240,16 @@ int fpu__copy(struct task_struct *dst, s
 	memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_size);
 
 	/*
-	 * If the FPU registers are not current just memcpy() the state.
-	 * Otherwise save current FPU registers directly into the child's FPU
-	 * context, without any memory-to-memory copying.
-	 *
-	 * ( The function 'fails' in the FNSAVE case, which destroys
-	 *   register contents so we have to load them back. )
+	 * If the FPU registers are not owned by current just memcpy() the
+	 * state.  Otherwise save the FPU registers directly into the
+	 * child's FPU context, without any memory-to-memory copying.
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
-	else if (!save_fpregs_to_fpstate(dst_fpu))
-		copy_kernel_to_fpregs(&dst_fpu->state);
-
+	else
+		save_fpregs_to_fpstate(dst_fpu);
 	fpregs_unlock();
 
 	set_tsk_thread_flag(dst, TIF_NEED_FPU_LOAD);


* [PATCH] selftests/x86/ptrace_syscall: Add a PTRACE_GETFPREGS test
  2021-06-18 11:19     ` Borislav Petkov
@ 2021-06-18 13:25       ` Borislav Petkov
  0 siblings, 0 replies; 87+ messages in thread
From: Borislav Petkov @ 2021-06-18 13:25 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

From: Borislav Petkov <bp@suse.de>
Date: Fri, 18 Jun 2021 15:17:28 +0200

... instead of fumbling with gdb trying to cause it to use this old way
to get FPU regs.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 tools/testing/selftests/x86/ptrace_syscall.c | 65 +++++++++++++++++++-
 1 file changed, 64 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/ptrace_syscall.c b/tools/testing/selftests/x86/ptrace_syscall.c
index 12aaa063196e..09abeb445db0 100644
--- a/tools/testing/selftests/x86/ptrace_syscall.c
+++ b/tools/testing/selftests/x86/ptrace_syscall.c
@@ -407,7 +407,68 @@ static void test_restart_under_ptrace(void)
 		err(1, "waitpid");
 }
 
-int main()
+static void test_ptrace_a_bit(void)
+{
+	struct user_fpregs_struct regs;
+	int status;
+	pid_t chld;
+
+	printf("[RUN]\tTest some ptrace(2) requests\n");
+
+	chld = fork();
+	if (chld < 0)
+		err(1, "fork");
+
+	if (!chld) {
+		if (ptrace(PTRACE_TRACEME, 0, 0, 0) != 0)
+			err(1, "PTRACE_TRACEME");
+
+		pid_t pid = getpid(), tid = syscall(SYS_gettid);
+
+		printf("\tChild will take a nap until signaled\n");
+		setsigign(SIGUSR1, SA_RESTART);
+		syscall(SYS_tgkill, pid, tid, SIGSTOP);
+
+		syscall(SYS_pause, 0, 0, 0, 0, 0, 0);
+		_exit(0);
+	}
+
+	/* Wait for SIGSTOP. */
+	if (waitpid(chld, &status, 0) != chld || !WIFSTOPPED(status))
+		err(1, "waitpid");
+
+	printf("[RUN]\tGETFPREGS\n");
+	if (ptrace(PTRACE_GETFPREGS, chld, 0, &regs) != 0)
+		err(1, "PTRACE_GETFPREGS");
+
+#ifdef __i386__
+	if (regs.cwd != 0xffff037fu || regs.swd != 0xffff0000u ||
+	    regs.twd != 0xffffffffu) {
+		printf("[FAIL]\t32-bit args after PTRACE_GETFPREGS are wrong: ");
+		printf("cwd: 0x%lx, swd: 0x%lx ",  regs.cwd, regs.swd);
+		printf("twd: 0x%lx\n", regs.twd);
+		goto out;
+	}
+#else
+	if (regs.cwd != 0x37f || regs.mxcsr != 0x1f80 || regs.mxcr_mask != 0x2ffff) {
+		printf("[FAIL]\t64-bit args after PTRACE_GETFPREGS are wrong: ");
+		printf("cwd: 0x%x, ", regs.cwd);
+		/* Yeah, it is mxcr_mask - sys/user.h has a typo :-) */
+		printf("mxcsr: 0x%x, mxcsr_mask: 0x%x\n", regs.mxcsr, regs.mxcr_mask);
+		goto out;
+	}
+#endif
+
+	printf("[OK]\tptrace(PTRACE_GETFPREGS)\n");
+
+out:
+	/* Kill it. */
+	kill(chld, SIGKILL);
+	if (waitpid(chld, &status, 0) != chld)
+		err(1, "waitpid");
+}
+
+int main(void)
 {
 	printf("[RUN]\tCheck int80 return regs\n");
 	test_sys32_regs(do_full_int80);
@@ -426,5 +487,7 @@ int main()
 
 	test_restart_under_ptrace();
 
+	test_ptrace_a_bit();
+
 	return 0;
 }
-- 
2.29.2




end of thread, other threads:[~2021-06-18 13:25 UTC | newest]

Thread overview: 87+ messages
2021-06-14 15:44 [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
2021-06-14 15:44 ` [patch V2 01/52] x86/fpu: Make init_fpstate correct with optimized XSAVE Thomas Gleixner
2021-06-14 19:15   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 02/52] x86/fpu: Fix copy_xstate_to_kernel() gap handling Thomas Gleixner
2021-06-15 11:07   ` Borislav Petkov
2021-06-15 12:47     ` Thomas Gleixner
2021-06-15 12:59       ` Borislav Petkov
2021-06-16 22:02   ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 03/52] x86/pkeys: Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
2021-06-15 13:15   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 04/52] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
2021-06-14 15:44 ` [patch V2 05/52] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 06/52] x86/fpu: Move inlines where they belong Thomas Gleixner
2021-06-14 15:44 ` [patch V2 07/52] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 08/52] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
2021-06-15 17:40   ` Borislav Petkov
2021-06-15 21:32     ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 09/52] x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate() Thomas Gleixner
2021-06-16 15:02   ` Borislav Petkov
2021-06-16 23:51     ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 10/52] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
2021-06-14 15:44 ` [patch V2 11/52] x86/fpu: Rewrite xfpregs_set() Thomas Gleixner
2021-06-16 15:22   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 12/52] x86/fpu: Fail ptrace() requests that try to set invalid MXCSR values Thomas Gleixner
2021-06-16 15:31   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 13/52] x86/fpu: Clean up fpregs_set() Thomas Gleixner
2021-06-16 15:42   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 14/52] x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get() Thomas Gleixner
2021-06-16 16:13   ` Borislav Petkov
2021-06-17 12:42     ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 15/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in xfpregs_get() Thomas Gleixner
2021-06-17  8:59   ` Borislav Petkov
2021-06-18 11:19     ` Borislav Petkov
2021-06-18 13:25       ` [PATCH] selftests/x86/ptrace_syscall: Add a PTRACE_GETFPREGS test Borislav Petkov
2021-06-14 15:44 ` [patch V2 16/52] x86/fpu: Use copy_uabi_xstate_to_membuf() in fpregs_get() Thomas Gleixner
2021-06-17 11:50   ` Borislav Petkov
2021-06-17 12:43     ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 17/52] x86/fpu: Remove fpstate_sanitize_xstate() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 18/52] x86/fpu: Get rid of using_compacted_format() Thomas Gleixner
2021-06-17 11:59   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 19/52] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
2021-06-17 12:09   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 20/52] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
2021-06-17 12:22   ` Borislav Petkov
2021-06-17 12:49     ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 21/52] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
2021-06-17 12:41   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 22/52] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
2021-06-17 12:48   ` Borislav Petkov
2021-06-14 15:44 ` [patch V2 23/52] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 24/52] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
2021-06-14 15:44 ` [patch V2 25/52] x86/fpu: Rename fregs " Thomas Gleixner
2021-06-14 15:44 ` [patch V2 26/52] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
2021-06-14 15:44 ` [patch V2 27/52] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 28/52] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 29/52] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_kernel() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 30/52] x86/fpu: Rename initstate copy functions Thomas Gleixner
2021-06-14 15:44 ` [patch V2 31/52] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
2021-06-14 15:44 ` [patch V2 32/52] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
2021-06-16 20:04   ` Liang, Kan
2021-06-17  7:15     ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 33/52] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 34/52] x86/fpu: Differentiate "copy" versus "move" of fpregs Thomas Gleixner
2021-06-18 12:21   ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 35/52] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
2021-06-14 15:44 ` [patch V2 36/52] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 37/52] x86/pkru: Provide pkru_write_default() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 38/52] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
2021-06-14 15:44 ` [patch V2 39/52] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 40/52] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 41/52] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
2021-06-14 15:44 ` [patch V2 42/52] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 43/52] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 44/52] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 45/52] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
2021-06-16  0:52   ` Yu, Yu-cheng
2021-06-16  8:56     ` Thomas Gleixner
2021-06-14 15:44 ` [patch V2 46/52] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
2021-06-14 15:44 ` [patch V2 47/52] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
2021-06-14 19:29   ` [patch V2-A " Thomas Gleixner
2021-06-14 15:44 ` [patch V2 48/52] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
2021-06-14 15:44 ` [patch V2 49/52] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 50/52] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
2021-06-14 15:44 ` [patch V2 51/52] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
2021-06-14 15:45 ` [patch V2 52/52] x86/fpu: Mark init_fpstate __ro_after_init Thomas Gleixner
2021-06-14 20:15 ` [patch] x86/fpu: x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate() Thomas Gleixner
2021-06-16  0:50 ` [patch V2 00/52] x86/fpu: Spring cleaning and PKRU sanitizing Yu, Yu-cheng
