* [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing
@ 2021-06-11 16:15 Thomas Gleixner
  2021-06-11 16:15 ` [patch 01/41] Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
                   ` (41 more replies)
  0 siblings, 42 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

This is a follow up to these patch series:

 - [patch V2 00/14] x86/fpu: Mop up XSAVES and related damage
   https://lore.kernel.org/r/20210605234742.712464974@linutronix.de

 - [PATCH 00/18] x86/pkeys: stop managing PKRU with XSAVE
   https://lore.kernel.org/r/20210603230803.31660AFE@viggo.jf.intel.com

 - [PATCH 0/2] x86/fpu: Clean up "dynamic" APIs
   https://lore.kernel.org/r/cover.1623388344.git.luto@kernel.org

The analysis of the subtle bugs in the FPU code triggered a larger
discussion about the general state of this code. The above patch series are
all related to this, and the series below combines them into one because
the already-started consolidation work and the PKRU rework collided all
over the place.

The main parts of this series are:

  - Simplification and removal/replacement of redundant and/or
    overengineered code.

  - Name space cleanup as the existing names were just a permanent source
    of confusion.

  - Clear separation of user ABI and kernel-internal state handling.

  - Removal of PKRU from being XSTATE managed in the kernel because PKRU
    has to be eagerly restored on context switch and keeping it in sync
    in the xstate buffer is just pointless overhead and fragile.

    The kernel still XSAVEs PKRU on context switch but the value in the
    buffer is no longer used and never restored from the buffer.

    This still needs to be cleaned up, but the series is already 40+
    patches large and the cleanup of this is not a functional problem.

    The functional issues of PKRU management are fully addressed with the
    series as is.
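The eager-restore rationale above can be sketched in a tiny user-space model
(purely illustrative, not the kernel implementation: the struct, function
names and the pkru_register variable standing in for WRPKRU are all made up):
PKRU lives as a plain value in the thread struct and is written straight to
the register on every switch, so the copy in the xstate buffer never matters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of eager PKRU restore. pkru_register
 * stands in for the real PKRU register written by WRPKRU. */
static uint32_t pkru_register;

struct task {
	uint32_t pkru;	/* per-task PKRU value, kernel-managed */
	/* xstate buffer omitted: PKRU is never restored from it */
};

/* Models the context-switch path: restore PKRU eagerly, bypassing
 * any XRSTOR of the xstate buffer. */
static void switch_to_task(struct task *next)
{
	pkru_register = next->pkru;	/* models WRPKRU */
}
```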

It applies on top of

  git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master

and is also available via git:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu

Thanks,

	tglx
---
 arch/x86/events/intel/lbr.c          |    6 
 arch/x86/include/asm/fpu/internal.h  |  155 ++++-------
 arch/x86/include/asm/fpu/xstate.h    |   61 +++-
 arch/x86/include/asm/pgtable.h       |   57 ----
 arch/x86/include/asm/pkeys.h         |    9 
 arch/x86/include/asm/processor.h     |    9 
 arch/x86/include/asm/special_insns.h |   14 -
 arch/x86/kernel/cpu/common.c         |   29 --
 arch/x86/kernel/fpu/core.c           |  243 ++++++++++++------
 arch/x86/kernel/fpu/init.c           |    4 
 arch/x86/kernel/fpu/regset.c         |  116 ++++----
 arch/x86/kernel/fpu/signal.c         |   59 ++--
 arch/x86/kernel/fpu/xstate.c         |  456 +++++++++++------------------------
 arch/x86/kernel/process.c            |   19 +
 arch/x86/kernel/process_64.c         |   28 ++
 arch/x86/kvm/svm/sev.c               |    1 
 arch/x86/kvm/x86.c                   |   52 ++-
 arch/x86/mm/extable.c                |    2 
 arch/x86/mm/fault.c                  |    2 
 arch/x86/mm/pkeys.c                  |   22 -
 b/arch/x86/include/asm/pkru.h        |   62 ++++
 include/linux/pkeys.h                |    4 
 22 files changed, 671 insertions(+), 739 deletions(-)




^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 01/41] Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 17:04   ` Borislav Petkov
  2021-06-11 16:15 ` [patch 02/41] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
                   ` (40 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

This cannot work and it's unclear how that ever made a difference.

init_fpstate.xsave.header.xfeatures is always 0 so get_xsave_addr() will
always return a NULL pointer, which will prevent storing the default PKRU
value in init_fpstate.
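The failure mode is easy to see in a condensed sketch (simplified types and
a stand-in lookup function, not the kernel's get_xsave_addr()): when the
feature's bit is clear in header.xfeatures, the lookup returns NULL, and in
init_fpstate that field is always 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: models the "bit clear means NULL" behaviour of
 * get_xsave_addr(); real offset computation is elided. */
#define XFEATURE_PKRU 9

struct xsave_hdr { uint64_t xfeatures; };
struct xsave_buf { struct xsave_hdr header; uint8_t data[512]; };

static void *sketch_get_xsave_addr(struct xsave_buf *xs, int nr)
{
	if (!(xs->header.xfeatures & (1ULL << nr)))
		return NULL;	/* feature in init state: no data to hand out */
	return xs->data;	/* per-feature offset lookup elided */
}
```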

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/common.c |    5 -----
 arch/x86/mm/pkeys.c          |    6 ------
 2 files changed, 11 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -466,8 +466,6 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
-	struct pkru_state *pk;
-
 	/* check the boot processor, plus compile options for PKU: */
 	if (!cpu_feature_enabled(X86_FEATURE_PKU))
 		return;
@@ -478,9 +476,6 @@ static __always_inline void setup_pku(st
 		return;
 
 	cr4_set_bits(X86_CR4_PKE);
-	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
-	if (pk)
-		pk->pkru = init_pkru_value;
 	/*
 	 * Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
 	 * cpuid bit to be set.  We need to ensure that we
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,7 +10,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/fpu/internal.h>		/* init_fpstate			*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -154,7 +153,6 @@ static ssize_t init_pkru_read_file(struc
 static ssize_t init_pkru_write_file(struct file *file,
 		 const char __user *user_buf, size_t count, loff_t *ppos)
 {
-	struct pkru_state *pk;
 	char buf[32];
 	ssize_t len;
 	u32 new_init_pkru;
@@ -177,10 +175,6 @@ static ssize_t init_pkru_write_file(stru
 		return -EINVAL;
 
 	WRITE_ONCE(init_pkru_value, new_init_pkru);
-	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
-	if (!pk)
-		return -EINVAL;
-	pk->pkru = new_init_pkru;
 	return count;
 }
 



* [patch 02/41] x86/fpu: Mark various FPU states __ro_after_init
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  2021-06-11 16:15 ` [patch 01/41] Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 17:21   ` Borislav Petkov
  2021-06-11 18:35   ` Andy Lutomirski
  2021-06-11 16:15 ` [patch 03/41] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
                   ` (39 subsequent siblings)
  41 siblings, 2 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Nothing modifies these after booting.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/init.c   |    4 ++--
 arch/x86/kernel/fpu/xstate.c |   16 ++++++++++------
 2 files changed, 12 insertions(+), 8 deletions(-)

--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -89,7 +89,7 @@ static void fpu__init_system_early_gener
 /*
  * Boot time FPU feature detection code:
  */
-unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
+unsigned int mxcsr_feature_mask __ro_after_init = 0xffffffffu;
 EXPORT_SYMBOL_GPL(mxcsr_feature_mask);
 
 static void __init fpu__init_system_mxcsr(void)
@@ -135,7 +135,7 @@ static void __init fpu__init_system_gene
  * This is inherent to the XSAVE architecture which puts all state
  * components into a single, continuous memory block:
  */
-unsigned int fpu_kernel_xstate_size;
+unsigned int fpu_kernel_xstate_size __ro_after_init;
 EXPORT_SYMBOL_GPL(fpu_kernel_xstate_size);
 
 /* Get alignment of the TYPE. */
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -59,19 +59,23 @@ static short xsave_cpuid_features[] __in
  * This represents the full set of bits that should ever be set in a kernel
  * XSAVE buffer, both supervisor and user xstates.
  */
-u64 xfeatures_mask_all __read_mostly;
+u64 xfeatures_mask_all __ro_after_init;
 
-static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] __ro_after_init =
+	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 
 /*
  * The XSAVE area of kernel can be in standard or compacted format;
  * it is always in standard format for user mode. This is the user
  * mode standard format size used for signal and ptrace frames.
  */
-unsigned int fpu_user_xstate_size;
+unsigned int fpu_user_xstate_size __ro_after_init;
 
 /*
  * Return whether the system supports a given xfeature.



* [patch 03/41] x86/fpu: Remove unused get_xsave_field_ptr()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  2021-06-11 16:15 ` [patch 01/41] Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
  2021-06-11 16:15 ` [patch 02/41] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 18:35   ` Andy Lutomirski
  2021-06-11 16:15 ` [patch 04/41] x86/fpu: Move inlines where they belong Thomas Gleixner
                   ` (38 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/fpu/xstate.h |    1 -
 arch/x86/kernel/fpu/xstate.c      |   30 ------------------------------
 2 files changed, 31 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -101,7 +101,6 @@ extern void __init update_regset_xstate_
 					     u64 xstate_mask);
 
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
-const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 struct membuf;
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -955,36 +955,6 @@ void *get_xsave_addr(struct xregs_state
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
 
-/*
- * This wraps up the common operations that need to occur when retrieving
- * data from xsave state.  It first ensures that the current task was
- * using the FPU and retrieves the data in to a buffer.  It then calculates
- * the offset of the requested field in the buffer.
- *
- * This function is safe to call whether the FPU is in use or not.
- *
- * Note that this only works on the current task.
- *
- * Inputs:
- *	@xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
- *	XFEATURE_SSE, etc...)
- * Output:
- *	address of the state in the xsave area or NULL if the state
- *	is not present or is in its 'init state'.
- */
-const void *get_xsave_field_ptr(int xfeature_nr)
-{
-	struct fpu *fpu = &current->thread.fpu;
-
-	/*
-	 * fpu__save() takes the CPU's xstate registers
-	 * and saves them off to the 'fpu memory buffer.
-	 */
-	fpu__save(fpu);
-
-	return get_xsave_addr(&fpu->state.xsave, xfeature_nr);
-}
-
 #ifdef CONFIG_ARCH_HAS_PKEYS
 
 /*



* [patch 04/41] x86/fpu: Move inlines where they belong
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (2 preceding siblings ...)
  2021-06-11 16:15 ` [patch 03/41] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
                   ` (37 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

They are only used in fpstate_init() and there is no point in having them
in a header just to make reading the code harder.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/include/asm/fpu/internal.h |   14 --------------
 arch/x86/kernel/fpu/core.c          |   15 +++++++++++++++
 2 files changed, 15 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -87,20 +87,6 @@ extern void fpstate_init_soft(struct swr
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-static inline void fpstate_init_xstate(struct xregs_state *xsave)
-{
-	/*
-	 * XRSTORS requires these bits set in xcomp_bv, or it will
-	 * trigger #GP:
-	 */
-	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
-}
-
-static inline void fpstate_init_fxstate(struct fxregs_state *fx)
-{
-	fx->cwd = 0x37f;
-	fx->mxcsr = MXCSR_DEFAULT;
-}
 extern void fpstate_sanitize_xstate(struct fpu *fpu);
 
 #define user_insn(insn, output, input...)				\
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -181,6 +181,21 @@ void fpu__save(struct fpu *fpu)
 	fpregs_unlock();
 }
 
+static inline void fpstate_init_xstate(struct xregs_state *xsave)
+{
+	/*
+	 * XRSTORS requires these bits set in xcomp_bv, or it will
+	 * trigger #GP:
+	 */
+	xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask_all;
+}
+
+static inline void fpstate_init_fxstate(struct fxregs_state *fx)
+{
+	fx->cwd = 0x37f;
+	fx->mxcsr = MXCSR_DEFAULT;
+}
+
 /*
  * Legacy x87 fpstate state init:
  */



* [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (3 preceding siblings ...)
  2021-06-11 16:15 ` [patch 04/41] x86/fpu: Move inlines where they belong Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 18:15   ` Borislav Petkov
  2021-06-11 18:37   ` Andy Lutomirski
  2021-06-11 16:15 ` [patch 06/41] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
                   ` (36 subsequent siblings)
  41 siblings, 2 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

If the count argument is larger than the xstate size, this will happily
copy beyond the end of xstate.
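The off-by-direction nature of the old bounds check can be demonstrated with
a small stand-alone sketch (the size constant and helper names are
illustrative): the old condition only rejected buffers that were too small,
so an oversized count slipped through and the subsequent copy ran past the
end of the xstate buffer; the fix demands an exact match.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative size; the real value depends on enabled xfeatures. */
static const size_t fpu_user_xstate_size = 832;

/* Old check: accepts any count >= the xstate size (the bug). */
static int old_check(size_t pos, size_t count)
{
	return ((pos != 0) || (count < fpu_user_xstate_size)) ? -14 : 0; /* -EFAULT */
}

/* Fixed check: only an exact, whole standard-format buffer is accepted. */
static int new_check(size_t pos, size_t count)
{
	return (pos != 0 || count != fpu_user_xstate_size) ? -14 : 0; /* -EFAULT */
}
```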

Fixes: 91c3dba7dbc1 ("x86/fpu/xstate: Fix PTRACE frames for XSAVES")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/regset.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -117,7 +117,7 @@ int xstateregs_set(struct task_struct *t
 	/*
 	 * A whole standard-format XSAVE buffer is needed:
 	 */
-	if ((pos != 0) || (count < fpu_user_xstate_size))
+	if (pos != 0 || count != fpu_user_xstate_size)
 		return -EFAULT;
 
 	xsave = &fpu->state.xsave;



* [patch 06/41] x86/fpu: Sanitize xstateregs_set()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (4 preceding siblings ...)
  2021-06-11 16:15 ` [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 18:45   ` Andy Lutomirski
  2021-06-11 16:15 ` [patch 07/41] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
                   ` (35 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

xstateregs_set() operates on a stopped task and tries to copy the provided
buffer into the task's fpu.state.xsave buffer.

Any error while copying, or invalid state detected after copying, results
in wiping the target task's FPU state completely, including supervisor
states.

That's just wrong. The caller supplied invalid data or has a problem with
unmapped memory, so there is absolutely no justification to corrupt the
target state.

Fix this with the following modifications:

 1) If data has to be copied from userspace, allocate a buffer and copy from
    user first.

 2) Use copy_kernel_to_xstate() unconditionally so that header checking
    works correctly.

 3) Return on error without corrupting the target state.

This prevents corrupting states and lets the caller deal with the problem
it caused in the first place.
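The "validate before commit" pattern behind these three steps can be
sketched as follows (struct layout, the magic-number validation rule and
error constants are all made up for illustration; the real code validates
an xstate header): the input is staged in a temporary buffer and the target
state is only written once everything checked out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct state { unsigned int magic; };

#define GOOD_MAGIC 0x46505553u	/* arbitrary marker for "valid input" */

static int set_state(struct state *target, const struct state *user_buf)
{
	struct state *tmp = malloc(sizeof(*tmp));

	if (!tmp)
		return -12;			/* -ENOMEM */
	memcpy(tmp, user_buf, sizeof(*tmp));	/* models copy_from_user() */
	if (tmp->magic != GOOD_MAGIC) {		/* models header validation */
		free(tmp);
		return -22;			/* -EINVAL: target untouched */
	}
	*target = *tmp;				/* commit only on success */
	free(tmp);
	return 0;
}
```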

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Use copy_from_user() - Dave, Yu
    Rename xbuf to tmpbuf - Borislav
---
 arch/x86/include/asm/fpu/xstate.h |    4 ---
 arch/x86/kernel/fpu/regset.c      |   41 +++++++++++++++-----------------------
 arch/x86/kernel/fpu/xstate.c      |   14 +++++++-----
 3 files changed, 25 insertions(+), 34 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -111,8 +111,4 @@ void copy_supervisor_to_kernel(struct xr
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
-
-/* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_user_xstate_header(const struct xstate_header *hdr);
-
 #endif
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -6,8 +6,12 @@
 #include <asm/fpu/signal.h>
 #include <asm/fpu/regset.h>
 #include <asm/fpu/xstate.h>
+
+#include <linux/vmalloc.h>
+
 #include <linux/sched/task_stack.h>
 
+
 /*
  * The xstateregs_active() routine is the same as the regset_fpregs_active() routine,
  * as the "regset->n" for the xstate regset will be updated based on the feature
@@ -108,7 +112,7 @@ int xstateregs_set(struct task_struct *t
 		  const void *kbuf, const void __user *ubuf)
 {
 	struct fpu *fpu = &target->thread.fpu;
-	struct xregs_state *xsave;
+	struct xregs_state *tmpbuf = NULL;
 	int ret;
 
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
@@ -120,32 +124,21 @@ int xstateregs_set(struct task_struct *t
 	if (pos != 0 || count != fpu_user_xstate_size)
 		return -EFAULT;
 
-	xsave = &fpu->state.xsave;
-
-	fpu__prepare_write(fpu);
-
-	if (using_compacted_format()) {
-		if (kbuf)
-			ret = copy_kernel_to_xstate(xsave, kbuf);
-		else
-			ret = copy_user_to_xstate(xsave, ubuf);
-	} else {
-		ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
-		if (!ret)
-			ret = validate_user_xstate_header(&xsave->header);
+	if (!kbuf) {
+		tmpbuf = vmalloc(count);
+		if (!tmpbuf)
+			return -ENOMEM;
+		if (copy_from_user(tmpbuf, ubuf, count)) {
+			ret = -EFAULT;
+			goto out;
+		}
 	}
 
-	/*
-	 * mxcsr reserved bits must be masked to zero for security reasons.
-	 */
-	xsave->i387.mxcsr &= mxcsr_feature_mask;
-
-	/*
-	 * In case of failure, mark all states as init:
-	 */
-	if (ret)
-		fpstate_init(&fpu->state);
+	fpu__prepare_write(fpu);
+	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
+out:
+	vfree(tmpbuf);
 	return ret;
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -519,7 +519,7 @@ int using_compacted_format(void)
 }
 
 /* Validate an xstate header supplied by userspace (ptrace or sigreturn) */
-int validate_user_xstate_header(const struct xstate_header *hdr)
+static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or supervisor features may be set */
 	if (hdr->xfeatures & ~xfeatures_mask_user())
@@ -1099,7 +1099,7 @@ void copy_xstate_to_kernel(struct membuf
 }
 
 /*
- * Convert from a ptrace standard-format kernel buffer to kernel XSAVES format
+ * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
  */
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
@@ -1146,14 +1146,16 @@ int copy_kernel_to_xstate(struct xregs_s
 	 */
 	xsave->header.xfeatures |= hdr.xfeatures;
 
+	/* mxcsr reserved bits must be masked to zero for security reasons. */
+	xsave->i387.mxcsr &= mxcsr_feature_mask;
+
 	return 0;
 }
 
 /*
- * Convert from a ptrace or sigreturn standard-format user-space buffer to
- * kernel XSAVES format and copy to the target thread. This is called from
- * xstateregs_set(), as well as potentially from the sigreturn() and
- * rt_sigreturn() system calls.
+ * Convert from a sigreturn standard-format user-space buffer to kernel
+ * XSAVE[S] format and copy to the target thread. This is called from the
+ * sigreturn() and rt_sigreturn() system calls.
  */
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 {



* [patch 07/41] x86/fpu: Simplify PTRACE_GETREGS code
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (5 preceding siblings ...)
  2021-06-11 16:15 ` [patch 06/41] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 18:47   ` Andy Lutomirski
  2021-06-12  9:13   ` Borislav Petkov
  2021-06-11 16:15 ` [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components Thomas Gleixner
                   ` (34 subsequent siblings)
  41 siblings, 2 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

ptrace() has interfaces that let a ptracer inspect a ptracee's register state.
This includes XSAVE state.  The ptrace() ABI includes a hardware-format XSAVE
buffer for both the SETREGS and GETREGS interfaces.

In the old days, the kernel buffer and the ptrace() ABI buffer were the
same boring non-compacted format.  But, since the advent of supervisor
states and the compacted format, the kernel buffer has diverged from the
format presented in the ABI.

This leads to two paths in the kernel:
1. Effectively a verbatim copy_to_user() which just copies the kernel buffer
   out to userspace.  This is used when the kernel buffer is kept in the
   non-compacted form which means that it shares a format with the ptrace
   ABI.
2. A one-state-at-a-time path: copy_xstate_to_kernel().  This is theoretically
   slower since it does a bunch of piecemeal copies.

Remove the verbatim copy case.  Speed probably does not matter in this path,
and the vast majority of new hardware will use the one-state-at-a-time path
anyway.  This ensures greater testing for the "slow" path.

This also makes enabling PKRU in this interface easier since a single path
can be patched instead of two.
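The one-state-at-a-time path can be modelled roughly as below (feature
count, sizes and offsets are invented for illustration; the real code walks
xstate_comp_offsets and the architectural offsets): the kernel's compacted
buffer packs only the present features back to back, while the ptrace ABI
expects every feature at a fixed standard-format offset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NFEATURES 3

/* Fixed standard-format offsets and per-feature sizes (illustrative). */
static const unsigned int std_offset[NFEATURES] = { 0, 16, 32 };
static const unsigned int feat_size[NFEATURES]  = { 16, 16, 8 };

/* Copy each present feature from the packed source to its fixed slot;
 * absent features are left zeroed, i.e. in their init state. */
static void compacted_to_standard(uint8_t *dst, const uint8_t *src,
				  uint64_t xfeatures)
{
	unsigned int src_off = 0;
	int i;

	memset(dst, 0, 40);
	for (i = 0; i < NFEATURES; i++) {
		if (!(xfeatures & (1ULL << i)))
			continue;
		memcpy(dst + std_offset[i], src + src_off, feat_size[i]);
		src_off += feat_size[i];
	}
}
```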

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Picked up from Dave's PKRU series
---
 arch/x86/kernel/fpu/regset.c |   22 ++--------------------
 arch/x86/kernel/fpu/xstate.c |    6 +++---
 2 files changed, 5 insertions(+), 23 deletions(-)

--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -79,32 +79,14 @@ int xstateregs_get(struct task_struct *t
 		struct membuf to)
 {
 	struct fpu *fpu = &target->thread.fpu;
-	struct xregs_state *xsave;
 
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	xsave = &fpu->state.xsave;
-
 	fpu__prepare_read(fpu);
 
-	if (using_compacted_format()) {
-		copy_xstate_to_kernel(to, xsave);
-		return 0;
-	} else {
-		fpstate_sanitize_xstate(fpu);
-		/*
-		 * Copy the 48 bytes defined by the software into the xsave
-		 * area in the thread struct, so that we can copy the whole
-		 * area to user using one user_regset_copyout().
-		 */
-		memcpy(&xsave->i387.sw_reserved, xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes));
-
-		/*
-		 * Copy the xstate memory layout.
-		 */
-		return membuf_write(&to, xsave, fpu_user_xstate_size);
-	}
+	copy_xstate_to_kernel(to, &fpu->state.xsave);
+	return 0;
 }
 
 int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1040,11 +1040,11 @@ static void copy_part(struct membuf *to,
 }
 
 /*
- * Convert from kernel XSAVES compacted format to standard format and copy
- * to a kernel-space ptrace buffer.
+ * Convert from kernel XSAVE or XSAVES compacted format to UABI
+ * non-compacted format and copy to a kernel-space ptrace buffer.
  *
  * It supports partial copy but pos always starts from zero. This is called
- * from xstateregs_get() and there we check the CPU has XSAVES.
+ * from xstateregs_get() and there we check the CPU has XSAVE.
  */
 void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
 {



* [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (6 preceding siblings ...)
  2021-06-11 16:15 ` [patch 07/41] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 19:03   ` Andy Lutomirski
  2021-06-11 16:15 ` [patch 09/41] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
                   ` (33 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

xstateregs_get() no longer uses fpstate_sanitize_xstate() and the only
callers are the regset functions for the legacy FP/SSE components.

Move the function to the callsites and remove the extended features part.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    2 
 arch/x86/kernel/fpu/regset.c        |   41 ++++++++++++++++--
 arch/x86/kernel/fpu/xstate.c        |   79 ------------------------------------
 3 files changed, 37 insertions(+), 85 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -87,8 +87,6 @@ extern void fpstate_init_soft(struct swr
 static inline void fpstate_init_soft(struct swregs_state *soft) {}
 #endif
 
-extern void fpstate_sanitize_xstate(struct fpu *fpu);
-
 #define user_insn(insn, output, input...)				\
 ({									\
 	int err;							\
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -11,6 +11,39 @@
 
 #include <linux/sched/task_stack.h>
 
+/*
+ * When executing XSAVEOPT (or other optimized XSAVE instructions), if
+ * a processor implementation detects that an FPU state component is still
+ * (or is again) in its initialized state, it may clear the corresponding
+ * bit in the header.xfeatures field, and can skip the writeout of registers
+ * to the corresponding memory layout.
+ *
+ * This means that when the bit is zero, the state component might still
+ * contain some previous - non-initialized register state.
+ *
+ * This is required for the legacy regset functions.
+ */
+static void fpstate_sanitize_legacy(struct fpu *fpu)
+{
+	struct fxregs_state *fx = &fpu->state.fxsave;
+	u64 xfeatures;
+
+	if (!use_xsaveopt())
+		return;
+
+	xfeatures = fpu->state.xsave.header.xfeatures;
+
+	/* If FP is in init state, reinitialize it */
+	if (!(xfeatures & XFEATURE_MASK_FP)) {
+		memset(fx, 0, sizeof(*fx));
+		fx->cwd = 0x37f;
+	}
+
+	/* If SSE is in init state, clear the storage */
+	if (!(xfeatures & XFEATURE_MASK_SSE))
+		memset(fx->xmm_space, 0, sizeof(fx->xmm_space));
+}
+
 
 /*
  * The xstateregs_active() routine is the same as the regset_fpregs_active() routine,
@@ -39,7 +72,7 @@ int xfpregs_get(struct task_struct *targ
 		return -ENODEV;
 
 	fpu__prepare_read(fpu);
-	fpstate_sanitize_xstate(fpu);
+	fpstate_sanitize_legacy(fpu);
 
 	return membuf_write(&to, &fpu->state.fxsave, sizeof(struct fxregs_state));
 }
@@ -55,7 +88,7 @@ int xfpregs_set(struct task_struct *targ
 		return -ENODEV;
 
 	fpu__prepare_write(fpu);
-	fpstate_sanitize_xstate(fpu);
+	fpstate_sanitize_legacy(fpu);
 
 	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
 				 &fpu->state.fxsave, 0, -1);
@@ -276,7 +309,7 @@ int fpregs_get(struct task_struct *targe
 				    sizeof(struct fregs_state));
 	}
 
-	fpstate_sanitize_xstate(fpu);
+	fpstate_sanitize_legacy(fpu);
 
 	if (to.left == sizeof(env)) {
 		convert_from_fxsr(to.p, target);
@@ -296,7 +329,7 @@ int fpregs_set(struct task_struct *targe
 	int ret;
 
 	fpu__prepare_write(fpu);
-	fpstate_sanitize_xstate(fpu);
+	fpstate_sanitize_legacy(fpu);
 
 	if (!boot_cpu_has(X86_FEATURE_FPU))
 		return fpregs_soft_set(target, regset, pos, count, kbuf, ubuf);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -129,85 +129,6 @@ static bool xfeature_is_supervisor(int x
 }
 
 /*
- * When executing XSAVEOPT (or other optimized XSAVE instructions), if
- * a processor implementation detects that an FPU state component is still
- * (or is again) in its initialized state, it may clear the corresponding
- * bit in the header.xfeatures field, and can skip the writeout of registers
- * to the corresponding memory layout.
- *
- * This means that when the bit is zero, the state component might still contain
- * some previous - non-initialized register state.
- *
- * Before writing xstate information to user-space we sanitize those components,
- * to always ensure that the memory layout of a feature will be in the init state
- * if the corresponding header bit is zero. This is to ensure that user-space doesn't
- * see some stale state in the memory layout during signal handling, debugging etc.
- */
-void fpstate_sanitize_xstate(struct fpu *fpu)
-{
-	struct fxregs_state *fx = &fpu->state.fxsave;
-	int feature_bit;
-	u64 xfeatures;
-
-	if (!use_xsaveopt())
-		return;
-
-	xfeatures = fpu->state.xsave.header.xfeatures;
-
-	/*
-	 * None of the feature bits are in init state. So nothing else
-	 * to do for us, as the memory layout is up to date.
-	 */
-	if ((xfeatures & xfeatures_mask_all) == xfeatures_mask_all)
-		return;
-
-	/*
-	 * FP is in init state
-	 */
-	if (!(xfeatures & XFEATURE_MASK_FP)) {
-		fx->cwd = 0x37f;
-		fx->swd = 0;
-		fx->twd = 0;
-		fx->fop = 0;
-		fx->rip = 0;
-		fx->rdp = 0;
-		memset(fx->st_space, 0, sizeof(fx->st_space));
-	}
-
-	/*
-	 * SSE is in init state
-	 */
-	if (!(xfeatures & XFEATURE_MASK_SSE))
-		memset(fx->xmm_space, 0, sizeof(fx->xmm_space));
-
-	/*
-	 * First two features are FPU and SSE, which above we handled
-	 * in a special way already:
-	 */
-	feature_bit = 0x2;
-	xfeatures = (xfeatures_mask_user() & ~xfeatures) >> 2;
-
-	/*
-	 * Update all the remaining memory layouts according to their
-	 * standard xstate layout, if their header bit is in the init
-	 * state:
-	 */
-	while (xfeatures) {
-		if (xfeatures & 0x1) {
-			int offset = xstate_comp_offsets[feature_bit];
-			int size = xstate_sizes[feature_bit];
-
-			memcpy((void *)fx + offset,
-			       (void *)&init_fpstate.xsave + offset,
-			       size);
-		}
-
-		xfeatures >>= 1;
-		feature_bit++;
-	}
-}
-
-/*
  * Enable the extended processor state save/restore feature.
  * Called once per CPU onlining.
  */
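
For reference, the init-state sanitizing that the removed comment describes can be modeled outside the kernel roughly like this. This is a toy sketch, not kernel code: the feature offsets, sizes, and helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model: 4 features, each 16 bytes, at fixed offsets. */
#define NFEAT 4
static const int feat_off[NFEAT]  = { 0, 16, 32, 48 };
static const int feat_len[NFEAT] = { 16, 16, 16, 16 };

/*
 * For every enabled feature whose header bit is clear (i.e. XSAVEOPT
 * skipped its writeout), overwrite its slot with the init state so a
 * reader of the buffer never sees stale register contents.
 */
static void sanitize(uint8_t *buf, const uint8_t *init_state,
		     uint64_t xfeatures, uint64_t feature_mask)
{
	for (int i = 0; i < NFEAT; i++) {
		if (!(feature_mask & (1ull << i)))
			continue;		/* feature not enabled */
		if (xfeatures & (1ull << i))
			continue;		/* component was written out */
		memcpy(buf + feat_off[i], init_state + feat_off[i],
		       feat_len[i]);
	}
}
```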


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 09/41] x86/kvm: Avoid looking up PKRU in XSAVE buffer
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (7 preceding siblings ...)
  2021-06-11 16:15 ` [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-14 10:26   ` Borislav Petkov
  2021-06-11 16:15 ` [patch 10/41] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
                   ` (32 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

PKRU is being removed from the kernel XSAVE/FPU buffers.  This removal
will probably include warnings for code that looks up PKRU in those
buffers.

KVM currently looks up the location of PKRU but doesn't even use the
pointer that it gets back.  Rework the code to avoid calling
get_xsave_addr() except in cases where its result is actually used.

This makes the code clearer and also avoids the inevitable PKRU
warnings.

This is probably a good cleanup and could go upstream independently
of any PKRU rework.
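
For context, the loops in fill_xsave()/load_xsave() walk the remaining feature mask with the lowest-set-bit idiom. A minimal standalone model of that idiom follows; the helper names are illustrative stand-ins, not kernel API (the kernel uses fls64()).

```c
#include <assert.h>
#include <stdint.h>

/*
 * valid & -valid isolates the lowest set bit; counting its trailing
 * position yields the feature number, as fls64(mask) - 1 does for a
 * single-bit mask in the kernel loops.
 */
static int lowest_feature_nr(uint64_t valid)
{
	uint64_t bit = valid & -valid;	/* lowest set bit only */
	int nr = 0;

	while (bit >>= 1)
		nr++;
	return nr;
}

/* Walk a feature mask the way the KVM copy loops do. */
static int count_features(uint64_t valid)
{
	int n = 0;

	while (valid) {
		uint64_t mask = valid & -valid;

		n++;
		valid -= mask;		/* clear the bit just handled */
	}
	return n;
}
```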

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---

 arch/x86/kvm/x86.c |   41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4589,20 +4589,21 @@ static void fill_xsave(u8 *dest, struct
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *src = get_xsave_addr(xsave, xfeature_nr);
+		void *src;
 
-		if (src) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(dest + offset, &vcpu->arch.pkru,
-				       sizeof(vcpu->arch.pkru));
-			else
-				memcpy(dest + offset, src, size);
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
 
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(dest + offset, &vcpu->arch.pkru,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			src = get_xsave_addr(xsave, xfeature_nr);
+			if (src)
+				memcpy(dest + offset, src, size);
 		}
 
 		valid -= xfeature_mask;
@@ -4632,18 +4633,20 @@ static void load_xsave(struct kvm_vcpu *
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *dest = get_xsave_addr(xsave, xfeature_nr);
 
-		if (dest) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(&vcpu->arch.pkru, src + offset,
-				       sizeof(vcpu->arch.pkru));
-			else
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
+
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(&vcpu->arch.pkru, src + offset,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			void *dest = get_xsave_addr(xsave, xfeature_nr);
+
+			if (dest)
 				memcpy(dest, src + offset, size);
 		}
 


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 10/41] x86/fpu: Cleanup arch_set_user_pkey_access()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (8 preceding siblings ...)
  2021-06-11 16:15 ` [patch 09/41] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 11/41] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
                   ` (31 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The function has a WARN_ON_ONCE() sanity check, but happily proceeds
when the pkey argument is out of range.

Clean it up.
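
The PKRU update this function performs can be sketched in isolation as follows. This is a toy model under assumed constants mirroring the kernel's two-bits-per-pkey PKRU layout; set_pkey_bits() is a hypothetical stand-in, not the kernel function.

```c
#include <assert.h>

/* Assumed constants modeling the PKRU layout: each pkey owns two
 * bits, AD (access disable) and WD (write disable). */
#define PKRU_BITS_PER_PKEY	2
#define PKRU_AD_BIT		0x1u
#define PKRU_WD_BIT		0x2u
#define PKEY_DISABLE_ACCESS	0x1
#define PKEY_DISABLE_WRITE	0x2
#define ARCH_MAX_PKEY		16

/*
 * Sketch of the logic after the cleanup: reject an out-of-range pkey
 * up front and compute the shift only after the check has passed.
 */
static int set_pkey_bits(unsigned int *pkru, int pkey, unsigned long init_val)
{
	unsigned int new_bits = 0;
	int shift;

	if (pkey < 0 || pkey >= ARCH_MAX_PKEY)
		return -1;		/* kernel returns -EINVAL here */

	if (init_val & PKEY_DISABLE_ACCESS)
		new_bits |= PKRU_AD_BIT;
	if (init_val & PKEY_DISABLE_WRITE)
		new_bits |= PKRU_WD_BIT;

	shift = pkey * PKRU_BITS_PER_PKEY;

	/* Mask off the old bits for this pkey, then set the new ones. */
	*pkru &= ~((PKRU_AD_BIT | PKRU_WD_BIT) << shift);
	*pkru |= new_bits << shift;
	return 0;
}
```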

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/xstate.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -887,11 +887,10 @@ EXPORT_SYMBOL_GPL(get_xsave_addr);
  * rights for @pkey to @init_val.
  */
 int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
-		unsigned long init_val)
+			      unsigned long init_val)
 {
-	u32 old_pkru;
-	int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
-	u32 new_pkru_bits = 0;
+	u32 old_pkru, new_pkru_bits = 0;
+	int pkey_shift;
 
 	/*
 	 * This check implies XSAVE support.  OSPKE only gets
@@ -905,7 +904,8 @@ int arch_set_user_pkey_access(struct tas
 	 * values originating from in-kernel users.  Complain
 	 * if a bad value is observed.
 	 */
-	WARN_ON_ONCE(pkey >= arch_max_pkey());
+	if (WARN_ON_ONCE(pkey >= arch_max_pkey()))
+		return -EINVAL;
 
 	/* Set the bits we need in PKRU:  */
 	if (init_val & PKEY_DISABLE_ACCESS)
@@ -914,6 +914,7 @@ int arch_set_user_pkey_access(struct tas
 		new_pkru_bits |= PKRU_WD_BIT;
 
 	/* Shift the bits in to the correct place in PKRU for pkey: */
+	pkey_shift = pkey * PKRU_BITS_PER_PKEY;
 	new_pkru_bits <<= pkey_shift;
 
 	/* Get old PKRU and mask off any old bits in place: */


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 11/41] x86/fpu: Get rid of copy_supervisor_to_kernel()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (9 preceding siblings ...)
  2021-06-11 16:15 ` [patch 10/41] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 19:42   ` Andy Lutomirski
  2021-06-11 16:15 ` [patch 12/41] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
                   ` (30 subsequent siblings)
  41 siblings, 1 reply; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

If the fast path of restoring the FPU state on sigreturn fails or is not
taken and the current task's FPU is active, then the FPU has to be
deactivated for the slow path to allow a safe update of the task's FPU
memory state.

With supervisor states enabled, this requires saving the supervisor state
to the memory state first. Supervisor states require XSAVES, so saving only
the supervisor state requires reshuffling the memory buffer because XSAVES
uses the compacted format and therefore stores the supervisor states at the
beginning of the buffer. That's just an overengineered optimization.

Get rid of it and save the full state for this case.
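
The compacted-versus-standard layout difference that forced the reshuffling can be illustrated with a toy model. The feature sizes and the 8-feature universe below are made up for illustration; the point is only that compacted offsets depend on which features are present, while standard offsets are fixed.

```c
#include <assert.h>

/* Toy per-feature state sizes (invented for illustration). */
#define NFEAT 8
static const int feat_size[NFEAT] = { 512, 64, 256, 64, 64, 512, 64, 8 };

/* Standard format: each feature lives at a fixed offset. */
static int standard_offset(int nr)
{
	int off = 0;

	for (int i = 0; i < nr; i++)
		off += feat_size[i];
	return off;
}

/* Compacted format (XSAVES): present features are packed
 * back-to-back, so the offset depends on the saved mask. */
static int compacted_offset(unsigned int mask, int nr)
{
	int off = 0;

	for (int i = 0; i < nr; i++)
		if (mask & (1u << i))
			off += feat_size[i];
	return off;
}
```

Saving only supervisor features (say bits 5 and 6) with XSAVES puts them at the front of the buffer, so each component would have to be memmove()d to its standard offset afterwards; saving the full state avoids that shuffle entirely.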

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/fpu/xstate.h |    1 
 arch/x86/kernel/fpu/signal.c      |   13 +++++---
 arch/x86/kernel/fpu/xstate.c      |   55 --------------------------------------
 3 files changed, 8 insertions(+), 61 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -107,7 +107,6 @@ struct membuf;
 void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave);
 int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_supervisor_to_kernel(struct xregs_state *xsave);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -411,15 +411,18 @@ static int __fpu__restore_sig(void __use
 	 * the optimisation).
 	 */
 	fpregs_lock();
-
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-
 		/*
-		 * Supervisor states are not modified by user space input.  Save
-		 * current supervisor states first and invalidate the FPU regs.
+		 * If supervisor states are available then save the
+		 * hardware state in current's fpstate so that the
+		 * supervisor state is preserved. Save the full state for
+		 * simplicity. There is no point in optimizing this by only
+		 * saving the supervisor states and then shuffle them to
+		 * the right place in memory. This is the slow path and the
+		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
-			copy_supervisor_to_kernel(&fpu->state.xsave);
+			copy_xregs_to_kernel(&fpu->state.xsave);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1129,61 +1129,6 @@ int copy_user_to_xstate(struct xregs_sta
 	return 0;
 }
 
-/*
- * Save only supervisor states to the kernel buffer.  This blows away all
- * old states, and is intended to be used only in __fpu__restore_sig(), where
- * user states are restored from the user buffer.
- */
-void copy_supervisor_to_kernel(struct xregs_state *xstate)
-{
-	struct xstate_header *header;
-	u64 max_bit, min_bit;
-	u32 lmask, hmask;
-	int err, i;
-
-	if (WARN_ON(!boot_cpu_has(X86_FEATURE_XSAVES)))
-		return;
-
-	if (!xfeatures_mask_supervisor())
-		return;
-
-	max_bit = __fls(xfeatures_mask_supervisor());
-	min_bit = __ffs(xfeatures_mask_supervisor());
-
-	lmask = xfeatures_mask_supervisor();
-	hmask = xfeatures_mask_supervisor() >> 32;
-	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-
-	/* We should never fault when copying to a kernel buffer: */
-	if (WARN_ON_FPU(err))
-		return;
-
-	/*
-	 * At this point, the buffer has only supervisor states and must be
-	 * converted back to normal kernel format.
-	 */
-	header = &xstate->header;
-	header->xcomp_bv |= xfeatures_mask_all;
-
-	/*
-	 * This only moves states up in the buffer.  Start with
-	 * the last state and move backwards so that states are
-	 * not overwritten until after they are moved.  Note:
-	 * memmove() allows overlapping src/dst buffers.
-	 */
-	for (i = max_bit; i >= min_bit; i--) {
-		u8 *xbuf = (u8 *)xstate;
-
-		if (!((header->xfeatures >> i) & 1))
-			continue;
-
-		/* Move xfeature 'i' into its normal location */
-		memmove(xbuf + xstate_comp_offsets[i],
-			xbuf + xstate_supervisor_only_offsets[i],
-			xstate_sizes[i]);
-	}
-}
-
 /**
  * copy_dynamic_supervisor_to_kernel() - Save dynamic supervisor states to
  *                                       an xsave area


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 12/41] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (10 preceding siblings ...)
  2021-06-11 16:15 ` [patch 11/41] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 13/41] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
                   ` (29 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The functions for xsave[s]/xrstor[s] operations are horribly named and
a permanent source of confusion.

Rename:
	copy_xregs_to_kernel() to xsave_to_kernel()
	copy_kernel_to_xregs() to xrstor_from_kernel()

so it's entirely clear what this is about.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   12 ++++++------
 arch/x86/kernel/fpu/core.c          |    6 +++---
 arch/x86/kernel/fpu/signal.c        |   14 +++++++-------
 arch/x86/kernel/fpu/xstate.c        |    4 ++--
 4 files changed, 18 insertions(+), 18 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -256,7 +256,7 @@ static inline void copy_fxregs_to_kernel
  * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
  */
-static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate)
+static inline void xsave_to_kernel_booting(struct xregs_state *xstate)
 {
 	u64 mask = xfeatures_mask_all;
 	u32 lmask = mask;
@@ -278,7 +278,7 @@ static inline void copy_xregs_to_kernel_
  * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
  */
-static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
+static inline void xrstor_from_kernel_booting(struct xregs_state *xstate)
 {
 	u64 mask = -1;
 	u32 lmask = mask;
@@ -302,7 +302,7 @@ static inline void copy_kernel_to_xregs_
 /*
  * Save processor xstate to xsave area.
  */
-static inline void copy_xregs_to_kernel(struct xregs_state *xstate)
+static inline void xsave_to_kernel(struct xregs_state *xstate)
 {
 	u64 mask = xfeatures_mask_all;
 	u32 lmask = mask;
@@ -320,7 +320,7 @@ static inline void copy_xregs_to_kernel(
 /*
  * Restore processor xstate from xsave area.
  */
-static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask)
+static inline void xrstor_from_kernel(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
@@ -381,7 +381,7 @@ static inline int copy_user_to_xregs(str
  * Restore xstate from kernel space xsave area, return an error code instead of
  * an exception.
  */
-static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
+static inline int xrstor_from_kernel_err(struct xregs_state *xstate, u64 mask)
 {
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
@@ -400,7 +400,7 @@ extern int copy_fpregs_to_fpstate(struct
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
-		copy_kernel_to_xregs(&fpstate->xsave, mask);
+		xrstor_from_kernel(&fpstate->xsave, mask);
 	} else {
 		if (use_fxsr())
 			copy_kernel_to_fxregs(&fpstate->fxsave);
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -95,7 +95,7 @@ EXPORT_SYMBOL(irq_fpu_usable);
 int copy_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
-		copy_xregs_to_kernel(&fpu->state.xsave);
+		xsave_to_kernel(&fpu->state.xsave);
 
 		/*
 		 * AVX512 state is tracked here because its use is
@@ -358,7 +358,7 @@ void fpu__drop(struct fpu *fpu)
 static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
 {
 	if (use_xsave())
-		copy_kernel_to_xregs(&init_fpstate.xsave, features_mask);
+		xrstor_from_kernel(&init_fpstate.xsave, features_mask);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
 		copy_kernel_to_fxregs(&init_fpstate.fxsave);
 	else
@@ -389,7 +389,7 @@ static void fpu__clear(struct fpu *fpu,
 	if (user_only) {
 		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
 		    xfeatures_mask_supervisor())
-			copy_kernel_to_xregs(&fpu->state.xsave,
+			xrstor_from_kernel(&fpu->state.xsave,
 					     xfeatures_mask_supervisor());
 		copy_init_fpstate_to_fpregs(xfeatures_mask_user());
 	} else {
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -271,14 +271,14 @@ static int copy_user_to_fpregs_zeroing(v
 
 			r = copy_user_to_fxregs(buf);
 			if (!r)
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
 			r = copy_user_to_xregs(buf, xbv);
 			if (!r && unlikely(init_bv))
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;
 		}
 	} else if (use_fxsr()) {
@@ -367,7 +367,7 @@ static int __fpu__restore_sig(void __use
 			 */
 			if (test_thread_flag(TIF_NEED_FPU_LOAD) &&
 			    xfeatures_mask_supervisor())
-				copy_kernel_to_xregs(&fpu->state.xsave,
+				xrstor_from_kernel(&fpu->state.xsave,
 						     xfeatures_mask_supervisor());
 			fpregs_mark_activate();
 			fpregs_unlock();
@@ -422,7 +422,7 @@ static int __fpu__restore_sig(void __use
 		 * above XRSTOR failed or ia32_fxstate is true. Shrug.
 		 */
 		if (xfeatures_mask_supervisor())
-			copy_xregs_to_kernel(&fpu->state.xsave);
+			xsave_to_kernel(&fpu->state.xsave);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	__fpu_invalidate_fpregs_state(fpu);
@@ -440,13 +440,13 @@ static int __fpu__restore_sig(void __use
 
 		fpregs_lock();
 		if (unlikely(init_bv))
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 
 		/*
 		 * Restore previously saved supervisor xstates along with
 		 * copied-in user xstates.
 		 */
-		ret = copy_kernel_to_xregs_err(&fpu->state.xsave,
+		ret = xrstor_from_kernel_err(&fpu->state.xsave,
 					       user_xfeatures | xfeatures_mask_supervisor());
 
 	} else if (use_fxsr()) {
@@ -464,7 +464,7 @@ static int __fpu__restore_sig(void __use
 			u64 init_bv;
 
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
-			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 		}
 
 		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -388,13 +388,13 @@ static void __init setup_init_fpu_buf(vo
 	/*
 	 * Init all the features state with header.xfeatures being 0x0
 	 */
-	copy_kernel_to_xregs_booting(&init_fpstate.xsave);
+	xrstor_from_kernel_booting(&init_fpstate.xsave);
 
 	/*
 	 * Dump the init state again. This is to identify the init state
 	 * of any feature which is not represented by all zero's.
 	 */
-	copy_xregs_to_kernel_booting(&init_fpstate.xsave);
+	xsave_to_kernel_booting(&init_fpstate.xsave);
 }
 
 static int xfeature_uncompacted_offset(int xfeature_nr)


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 13/41] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (11 preceding siblings ...)
  2021-06-11 16:15 ` [patch 12/41] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 14/41] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
                   ` (28 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The functions for xsave[s]/xrstor[s] operations are horribly named and
a permanent source of confusion.

Rename:
	copy_xregs_to_user() to xsave_to_user_sigframe()
	copy_user_to_xregs() to xrstor_from_user_sigframe()

so it's entirely clear what this is about. This also clearly indicates
the potentially different storage format, because the sigframe is user ABI
and cannot use the compacted format.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/kernel/fpu/signal.c        |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -338,7 +338,7 @@ static inline void xrstor_from_kernel(st
  * backward compatibility for old applications which don't understand
  * compacted format of xsave area.
  */
-static inline int copy_xregs_to_user(struct xregs_state __user *buf)
+static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
 	u64 mask = xfeatures_mask_user();
 	u32 lmask = mask;
@@ -363,7 +363,7 @@ static inline int copy_xregs_to_user(str
 /*
  * Restore xstate from user space xsave area.
  */
-static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
+static inline int xrstor_from_user_sigframe(struct xregs_state __user *buf, u64 mask)
 {
 	struct xregs_state *xstate = ((__force struct xregs_state *)buf);
 	u32 lmask = mask;
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -129,7 +129,7 @@ static inline int copy_fpregs_to_sigfram
 	int err;
 
 	if (use_xsave())
-		err = copy_xregs_to_user(buf);
+		err = xsave_to_user_sigframe(buf);
 	else if (use_fxsr())
 		err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
 	else
@@ -276,7 +276,7 @@ static int copy_user_to_fpregs_zeroing(v
 		} else {
 			init_bv = xfeatures_mask_user() & ~xbv;
 
-			r = copy_user_to_xregs(buf, xbv);
+			r = xrstor_from_user_sigframe(buf, xbv);
 			if (!r && unlikely(init_bv))
 				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 14/41] x86/fpu: Rename fxregs related copy functions
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (12 preceding siblings ...)
  2021-06-11 16:15 ` [patch 13/41] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 15/41] x86/fpu: Rename fregs " Thomas Gleixner
                   ` (27 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The functions for fxsave/fxrstor operations are horribly named and
a permanent source of confusion.

Rename:
	copy_fxregs_to_kernel() to fxsave_to_kernel()
	copy_kernel_to_fxregs() to fxrstor_from_kernel()
	copy_fxregs_to_user() to fxsave_to_user_sigframe()
	copy_user_to_fxregs() to fxrstor_from_user_sigframe()

so it's entirely clear what this is about.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   12 ++++++------
 arch/x86/kernel/fpu/core.c          |    4 ++--
 arch/x86/kernel/fpu/signal.c        |   10 +++++-----
 3 files changed, 13 insertions(+), 13 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -132,7 +132,7 @@ static inline int copy_fregs_to_user(str
 	return user_insn(fnsave %[fx]; fwait,  [fx] "=m" (*fx), "m" (*fx));
 }
 
-static inline int copy_fxregs_to_user(struct fxregs_state __user *fx)
+static inline int fxsave_to_user_sigframe(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return user_insn(fxsave %[fx], [fx] "=m" (*fx), "m" (*fx));
@@ -141,7 +141,7 @@ static inline int copy_fxregs_to_user(st
 
 }
 
-static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
+static inline void fxrstor_from_kernel(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		kernel_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -149,7 +149,7 @@ static inline void copy_kernel_to_fxregs
 		kernel_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_kernel_to_fxregs_err(struct fxregs_state *fx)
+static inline int fxrstor_from_kernel_err(struct fxregs_state *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return kernel_insn_err(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -157,7 +157,7 @@ static inline int copy_kernel_to_fxregs_
 		return kernel_insn_err(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)
+static inline int fxrstor_from_user_sigframe(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		return user_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -180,7 +180,7 @@ static inline int copy_user_to_fregs(str
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline void copy_fxregs_to_kernel(struct fpu *fpu)
+static inline void fxsave_to_kernel(struct fpu *fpu)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
 		asm volatile( "fxsave %[fx]" : [fx] "=m" (fpu->state.fxsave));
@@ -403,7 +403,7 @@ static inline void __copy_kernel_to_fpre
 		xrstor_from_kernel(&fpstate->xsave, mask);
 	} else {
 		if (use_fxsr())
-			copy_kernel_to_fxregs(&fpstate->fxsave);
+			fxrstor_from_kernel(&fpstate->fxsave);
 		else
 			copy_kernel_to_fregs(&fpstate->fsave);
 	}
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -107,7 +107,7 @@ int copy_fpregs_to_fpstate(struct fpu *f
 	}
 
 	if (likely(use_fxsr())) {
-		copy_fxregs_to_kernel(fpu);
+		fxsave_to_kernel(fpu);
 		return 1;
 	}
 
@@ -360,7 +360,7 @@ static inline void copy_init_fpstate_to_
 	if (use_xsave())
 		xrstor_from_kernel(&init_fpstate.xsave, features_mask);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
-		copy_kernel_to_fxregs(&init_fpstate.fxsave);
+		fxrstor_from_kernel(&init_fpstate.fxsave);
 	else
 		copy_kernel_to_fregs(&init_fpstate.fsave);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -64,7 +64,7 @@ static inline int save_fsave_header(stru
 
 		fpregs_lock();
 		if (!test_thread_flag(TIF_NEED_FPU_LOAD))
-			copy_fxregs_to_kernel(&tsk->thread.fpu);
+			fxsave_to_kernel(&tsk->thread.fpu);
 		fpregs_unlock();
 
 		convert_from_fxsr(&env, tsk);
@@ -131,7 +131,7 @@ static inline int copy_fpregs_to_sigfram
 	if (use_xsave())
 		err = xsave_to_user_sigframe(buf);
 	else if (use_fxsr())
-		err = copy_fxregs_to_user((struct fxregs_state __user *) buf);
+		err = fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
 	else
 		err = copy_fregs_to_user((struct fregs_state __user *) buf);
 
@@ -269,7 +269,7 @@ static int copy_user_to_fpregs_zeroing(v
 		if (fx_only) {
 			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
 
-			r = copy_user_to_fxregs(buf);
+			r = fxrstor_from_user_sigframe(buf);
 			if (!r)
 				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;
@@ -282,7 +282,7 @@ static int copy_user_to_fpregs_zeroing(v
 			return r;
 		}
 	} else if (use_fxsr()) {
-		return copy_user_to_fxregs(buf);
+		return fxrstor_from_user_sigframe(buf);
 	} else
 		return copy_user_to_fregs(buf);
 }
@@ -467,7 +467,7 @@ static int __fpu__restore_sig(void __use
 			xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 		}
 
-		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
+		ret = fxrstor_from_kernel_err(&fpu->state.fxsave);
 	} else {
 		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 15/41] x86/fpu: Rename fregs related copy functions
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (13 preceding siblings ...)
  2021-06-11 16:15 ` [patch 14/41] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 16/41] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
                   ` (26 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The functions for fnsave/frstor operations are horribly named and
a permanent source of confusion.

Rename:
	copy_fregs_to_kernel() to fnsave_to_kernel()
	copy_kernel_to_fregs() to fnrstor_from_kernel()
	copy_fregs_to_user()   to fnsave_to_user_sigframe()
	copy_user_to_fregs()   to fnrstor_from_user_sigframe()

so it's entirely clear what this is about.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   10 +++++-----
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |    6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -127,7 +127,7 @@ static inline void fpstate_init_soft(str
 		     _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_fprestore)	\
 		     : output : input)
 
-static inline int copy_fregs_to_user(struct fregs_state __user *fx)
+static inline int fnsave_to_user_sigframe(struct fregs_state __user *fx)
 {
 	return user_insn(fnsave %[fx]; fwait,  [fx] "=m" (*fx), "m" (*fx));
 }
@@ -165,17 +165,17 @@ static inline int fxrstor_from_user_sigf
 		return user_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline void copy_kernel_to_fregs(struct fregs_state *fx)
+static inline void frstor_from_kernel(struct fregs_state *fx)
 {
 	kernel_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_kernel_to_fregs_err(struct fregs_state *fx)
+static inline int frstor_from_kernel_err(struct fregs_state *fx)
 {
 	return kernel_insn_err(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
-static inline int copy_user_to_fregs(struct fregs_state __user *fx)
+static inline int frstor_from_user_sigframe(struct fregs_state __user *fx)
 {
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
@@ -405,7 +405,7 @@ static inline void __copy_kernel_to_fpre
 		if (use_fxsr())
 			fxrstor_from_kernel(&fpstate->fxsave);
 		else
-			copy_kernel_to_fregs(&fpstate->fsave);
+			frstor_from_kernel(&fpstate->fsave);
 	}
 }
 
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -362,7 +362,7 @@ static inline void copy_init_fpstate_to_
 	else if (static_cpu_has(X86_FEATURE_FXSR))
 		fxrstor_from_kernel(&init_fpstate.fxsave);
 	else
-		copy_kernel_to_fregs(&init_fpstate.fsave);
+		frstor_from_kernel(&init_fpstate.fsave);
 
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -133,7 +133,7 @@ static inline int copy_fpregs_to_sigfram
 	else if (use_fxsr())
 		err = fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
 	else
-		err = copy_fregs_to_user((struct fregs_state __user *) buf);
+		err = fnsave_to_user_sigframe((struct fregs_state __user *) buf);
 
 	if (unlikely(err) && __clear_user(buf, fpu_user_xstate_size))
 		err = -EFAULT;
@@ -284,7 +284,7 @@ static int copy_user_to_fpregs_zeroing(v
 	} else if (use_fxsr()) {
 		return fxrstor_from_user_sigframe(buf);
 	} else
-		return copy_user_to_fregs(buf);
+		return frstor_from_user_sigframe(buf);
 }
 
 static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
@@ -474,7 +474,7 @@ static int __fpu__restore_sig(void __use
 			goto out;
 
 		fpregs_lock();
-		ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
+		ret = frstor_from_kernel_err(&fpu->state.fsave);
 	}
 	if (!ret)
 		fpregs_mark_activate();


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 16/41] x86/fpu: Rename xstate copy functions which are related to UABI
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (14 preceding siblings ...)
  2021-06-11 16:15 ` [patch 15/41] x86/fpu: Rename fregs " Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 17/41] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
                   ` (25 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Rename them to reflect that these functions deal with user space format
XSAVE buffers.

      copy_xstate_to_kernel() -> copy_uabi_xstate_to_membuf()
      copy_kernel_to_xstate() -> copy_uabi_from_kernel_to_xstate()
      copy_user_to_xstate()   -> copy_sigframe_from_user_to_xstate()

Again a clear statement that these functions deal with user space ABI.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Better names, at least more consistent ones
V2: New patch
---
 arch/x86/include/asm/fpu/xstate.h |    6 +++---
 arch/x86/kernel/fpu/regset.c      |    4 ++--
 arch/x86/kernel/fpu/signal.c      |    2 +-
 arch/x86/kernel/fpu/xstate.c      |    7 ++++---
 4 files changed, 10 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -104,9 +104,9 @@ void *get_xsave_addr(struct xregs_state
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 struct membuf;
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave);
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
+void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave);
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
+int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
 void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
 void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
 
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -118,7 +118,7 @@ int xstateregs_get(struct task_struct *t
 
 	fpu__prepare_read(fpu);
 
-	copy_xstate_to_kernel(to, &fpu->state.xsave);
+	copy_uabi_xstate_to_membuf(to, &fpu->state.xsave);
 	return 0;
 }
 
@@ -150,7 +150,7 @@ int xstateregs_set(struct task_struct *t
 	}
 
 	fpu__prepare_write(fpu);
-	ret = copy_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
+	ret = copy_uabi_from_kernel_to_xstate(&fpu->state.xsave, kbuf ?: tmpbuf);
 
 out:
 	vfree(tmpbuf);
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -431,7 +431,7 @@ static int __fpu__restore_sig(void __use
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
 
-		ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
+		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
 			goto out;
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -968,7 +968,7 @@ static void copy_part(struct membuf *to,
  * It supports partial copy but pos always starts from zero. This is called
  * from xstateregs_get() and there we check the CPU has XSAVE.
  */
-void copy_xstate_to_kernel(struct membuf to, struct xregs_state *xsave)
+void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave)
 {
 	struct xstate_header header;
 	const unsigned off_mxcsr = offsetof(struct fxregs_state, mxcsr);
@@ -1024,7 +1024,7 @@ void copy_xstate_to_kernel(struct membuf
  * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
  * and copy to the target thread. This is called from xstateregs_set().
  */
-int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 {
 	unsigned int offset, size;
 	int i;
@@ -1079,7 +1079,8 @@ int copy_kernel_to_xstate(struct xregs_s
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
  */
-int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
+int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
+				      const void __user *ubuf)
 {
 	unsigned int offset, size;
 	int i;



* [patch 17/41] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (15 preceding siblings ...)
  2021-06-11 16:15 ` [patch 16/41] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 18/41] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
                   ` (24 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

copy_sigframe_from_user_to_xstate() and copy_uabi_from_kernel_to_xstate() are
almost identical except for the copy function.

Unify them.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
---
V4: Adjusted to the rename.
---
 arch/x86/kernel/fpu/xstate.c |   84 +++++++++++++++----------------------------
 1 file changed, 30 insertions(+), 54 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1020,20 +1020,30 @@ void copy_uabi_xstate_to_membuf(struct m
 	fill_gap(&to, &last, size);
 }
 
-/*
- * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S] format
- * and copy to the target thread. This is called from xstateregs_set().
- */
-int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+static int copy_from_buffer(void *dst, unsigned int offset, unsigned int size,
+			    const void *kbuf, const void __user *ubuf)
+{
+	if (kbuf) {
+		memcpy(dst, kbuf + offset, size);
+	} else {
+		if (copy_from_user(dst, ubuf + offset, size))
+			return -EFAULT;
+	}
+	return 0;
+}
+
+static int copy_uabi_to_xstate(struct xregs_state *xsave, const void *kbuf,
+			       const void __user *ubuf)
 {
 	unsigned int offset, size;
-	int i;
 	struct xstate_header hdr;
+	int i;
 
 	offset = offsetof(struct xregs_state, header);
 	size = sizeof(hdr);
 
-	memcpy(&hdr, kbuf + offset, size);
+	if (copy_from_buffer(&hdr, offset, size, kbuf, ubuf))
+		return -EFAULT;
 
 	if (validate_user_xstate_header(&hdr))
 		return -EINVAL;
@@ -1047,7 +1057,8 @@ int copy_uabi_from_kernel_to_xstate(stru
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
 
-			memcpy(dst, kbuf + offset, size);
+			if (copy_from_buffer(dst, offset, size, kbuf, ubuf))
+				return -EFAULT;
 		}
 	}
 
@@ -1075,6 +1086,16 @@ int copy_uabi_from_kernel_to_xstate(stru
 }
 
 /*
+ * Convert from a ptrace standard-format kernel buffer to kernel XSAVE[S]
+ * format and copy to the target thread. This is called from
+ * xstateregs_set().
+ */
+int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
+{
+	return copy_uabi_to_xstate(xsave, kbuf, NULL);
+}
+
+/*
  * Convert from a sigreturn standard-format user-space buffer to kernel
  * XSAVE[S] format and copy to the target thread. This is called from the
  * sigreturn() and rt_sigreturn() system calls.
@@ -1082,52 +1103,7 @@ int copy_uabi_from_kernel_to_xstate(stru
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave,
 				      const void __user *ubuf)
 {
-	unsigned int offset, size;
-	int i;
-	struct xstate_header hdr;
-
-	offset = offsetof(struct xregs_state, header);
-	size = sizeof(hdr);
-
-	if (copy_from_user(&hdr, ubuf + offset, size))
-		return -EFAULT;
-
-	if (validate_user_xstate_header(&hdr))
-		return -EINVAL;
-
-	for (i = 0; i < XFEATURE_MAX; i++) {
-		u64 mask = ((u64)1 << i);
-
-		if (hdr.xfeatures & mask) {
-			void *dst = __raw_xsave_addr(xsave, i);
-
-			offset = xstate_offsets[i];
-			size = xstate_sizes[i];
-
-			if (copy_from_user(dst, ubuf + offset, size))
-				return -EFAULT;
-		}
-	}
-
-	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
-		offset = offsetof(struct fxregs_state, mxcsr);
-		size = MXCSR_AND_FLAGS_SIZE;
-		if (copy_from_user(&xsave->i387.mxcsr, ubuf + offset, size))
-			return -EFAULT;
-	}
-
-	/*
-	 * The state that came in from userspace was user-state only.
-	 * Mask all the user states out of 'xfeatures':
-	 */
-	xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR_ALL;
-
-	/*
-	 * Add back in the features that came in from userspace:
-	 */
-	xsave->header.xfeatures |= hdr.xfeatures;
-
-	return 0;
+	return copy_uabi_to_xstate(xsave, NULL, ubuf);
 }
 
 /**



* [patch 18/41] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (16 preceding siblings ...)
  2021-06-11 16:15 ` [patch 17/41] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 19/41] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_kernel() Thomas Gleixner
                   ` (23 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

A copy is guaranteed to leave the source intact, which is not the case when
FNSAVE is used as that reinitializes the registers.

Rename it to save_fpregs_to_fpstate() which does not make such guarantees.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/kernel/fpu/core.c          |   10 +++++-----
 arch/x86/kvm/x86.c                  |    2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -395,7 +395,7 @@ static inline int xrstor_from_kernel_err
 	return err;
 }
 
-extern int copy_fpregs_to_fpstate(struct fpu *fpu);
+extern int save_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
@@ -527,7 +527,7 @@ static inline void __fpregs_load_activat
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!copy_fpregs_to_fpstate(old_fpu))
+		if (!save_fpregs_to_fpstate(old_fpu))
 			old_fpu->last_cpu = -1;
 		else
 			old_fpu->last_cpu = cpu;
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -92,7 +92,7 @@ EXPORT_SYMBOL(irq_fpu_usable);
  * Modern FPU state can be kept in registers, if there are
  * no pending FP exceptions.
  */
-int copy_fpregs_to_fpstate(struct fpu *fpu)
+int save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		xsave_to_kernel(&fpu->state.xsave);
@@ -119,7 +119,7 @@ int copy_fpregs_to_fpstate(struct fpu *f
 
 	return 0;
 }
-EXPORT_SYMBOL(copy_fpregs_to_fpstate);
+EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
@@ -137,7 +137,7 @@ void kernel_fpu_begin_mask(unsigned int
 		 * Ignore return value -- we don't care if reg state
 		 * is clobbered.
 		 */
-		copy_fpregs_to_fpstate(&current->thread.fpu);
+		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
 
@@ -172,7 +172,7 @@ void fpu__save(struct fpu *fpu)
 	trace_x86_fpu_before_save(fpu);
 
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!copy_fpregs_to_fpstate(fpu)) {
+		if (!save_fpregs_to_fpstate(fpu)) {
 			copy_kernel_to_fpregs(&fpu->state);
 		}
 	}
@@ -255,7 +255,7 @@ int fpu__copy(struct task_struct *dst, s
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
-	else if (!copy_fpregs_to_fpstate(dst_fpu))
+	else if (!save_fpregs_to_fpstate(dst_fpu))
 		copy_kernel_to_fpregs(&dst_fpu->state);
 
 	fpregs_unlock();
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9618,7 +9618,7 @@ static void kvm_save_current_fpu(struct
 		memcpy(&fpu->state, &current->thread.fpu.state,
 		       fpu_kernel_xstate_size);
 	else
-		copy_fpregs_to_fpstate(fpu);
+		save_fpregs_to_fpstate(fpu);
 }
 
 /* Swap (qemu) user FPU context for the guest FPU context. */



* [patch 19/41] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_kernel()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (17 preceding siblings ...)
  2021-06-11 16:15 ` [patch 18/41] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 20/41] x86/fpu: Rename initstate copy functions Thomas Gleixner
                   ` (22 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

This is not copy functionality. It restores the register state from the
supplied kernel buffer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |    8 ++++----
 arch/x86/kernel/fpu/core.c          |    4 ++--
 arch/x86/kvm/x86.c                  |    4 ++--
 arch/x86/mm/extable.c               |    2 +-
 4 files changed, 9 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -397,7 +397,7 @@ static inline int xrstor_from_kernel_err
 
 extern int save_fpregs_to_fpstate(struct fpu *fpu);
 
-static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
+static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
 {
 	if (use_xsave()) {
 		xrstor_from_kernel(&fpstate->xsave, mask);
@@ -409,7 +409,7 @@ static inline void __copy_kernel_to_fpre
 	}
 }
 
-static inline void copy_kernel_to_fpregs(union fpregs_state *fpstate)
+static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
 	/*
 	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
@@ -424,7 +424,7 @@ static inline void copy_kernel_to_fpregs
 			: : [addr] "m" (fpstate));
 	}
 
-	__copy_kernel_to_fpregs(fpstate, -1);
+	__restore_fpregs_from_fpstate(fpstate, -1);
 }
 
 extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
@@ -495,7 +495,7 @@ static inline void __fpregs_load_activat
 		return;
 
 	if (!fpregs_state_valid(fpu, cpu)) {
-		copy_kernel_to_fpregs(&fpu->state);
+		restore_fpregs_from_fpstate(&fpu->state);
 		fpregs_activate(fpu);
 		fpu->last_cpu = cpu;
 	}
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -173,7 +173,7 @@ void fpu__save(struct fpu *fpu)
 
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		if (!save_fpregs_to_fpstate(fpu)) {
-			copy_kernel_to_fpregs(&fpu->state);
+			restore_fpregs_from_fpstate(&fpu->state);
 		}
 	}
 
@@ -256,7 +256,7 @@ int fpu__copy(struct task_struct *dst, s
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
 
 	else if (!save_fpregs_to_fpstate(dst_fpu))
-		copy_kernel_to_fpregs(&dst_fpu->state);
+		restore_fpregs_from_fpstate(&dst_fpu->state);
 
 	fpregs_unlock();
 
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9634,7 +9634,7 @@ static void kvm_load_guest_fpu(struct kv
 	 */
 	if (vcpu->arch.guest_fpu)
 		/* PKRU is separately restored in kvm_x86_ops.run. */
-		__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu->state,
+		__restore_fpregs_from_fpstate(&vcpu->arch.guest_fpu->state,
 					~XFEATURE_MASK_PKRU);
 
 	fpregs_mark_activate();
@@ -9655,7 +9655,7 @@ static void kvm_put_guest_fpu(struct kvm
 	if (vcpu->arch.guest_fpu)
 		kvm_save_current_fpu(vcpu->arch.guest_fpu);
 
-	copy_kernel_to_fpregs(&vcpu->arch.user_fpu->state);
+	restore_fpregs_from_fpstate(&vcpu->arch.user_fpu->state);
 
 	fpregs_mark_activate();
 	fpregs_unlock();
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -65,7 +65,7 @@ EXPORT_SYMBOL_GPL(ex_handler_fault);
 	WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.",
 		  (void *)instruction_pointer(regs));
 
-	__copy_kernel_to_fpregs(&init_fpstate, -1);
+	__restore_fpregs_from_fpstate(&init_fpstate, -1);
 	return true;
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);



* [patch 20/41] x86/fpu: Rename initstate copy functions
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (18 preceding siblings ...)
  2021-06-11 16:15 ` [patch 19/41] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_kernel() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 21/41] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
                   ` (21 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Again, this is not a copy. It's loading register state.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/core.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -355,7 +355,7 @@ void fpu__drop(struct fpu *fpu)
  * Clear FPU registers by setting them up from the init fpstate.
  * Caller must do fpregs_[un]lock() around it.
  */
-static inline void copy_init_fpstate_to_fpregs(u64 features_mask)
+static inline void load_fpregs_from_init_fpstate(u64 features_mask)
 {
 	if (use_xsave())
 		xrstor_from_kernel(&init_fpstate.xsave, features_mask);
@@ -391,9 +391,9 @@ static void fpu__clear(struct fpu *fpu,
 		    xfeatures_mask_supervisor())
 			xrstor_from_kernel(&fpu->state.xsave,
 					     xfeatures_mask_supervisor());
-		copy_init_fpstate_to_fpregs(xfeatures_mask_user());
+		load_fpregs_from_init_fpstate(xfeatures_mask_user());
 	} else {
-		copy_init_fpstate_to_fpregs(xfeatures_mask_all);
+		load_fpregs_from_init_fpstate(xfeatures_mask_all);
 	}
 
 	fpregs_mark_activate();



* [patch 21/41] x86/fpu: Rename "dynamic" XSTATEs to "independent"
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (19 preceding siblings ...)
  2021-06-11 16:15 ` [patch 20/41] x86/fpu: Rename initstate copy functions Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 22/41] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
                   ` (20 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Andy Lutomirski <luto@kernel.org>

The salient feature of "dynamic" XSTATEs is that they are not part of the
main task XSTATE buffer.  The fact that they are dynamically allocated is
irrelevant and will become quite confusing when user math XSTATEs start
being dynamically allocated.  Rename them to "independent" because they
are independent of the main XSTATE code.

This is just a search-and-replace with some whitespace updates to keep
things aligned.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1eecb0e4f3e07828ebe5d737ec77dc3b708fad2d.1623388344.git.luto@kernel.org

---
 arch/x86/events/intel/lbr.c       |    6 +--
 arch/x86/include/asm/fpu/xstate.h |   14 ++++----
 arch/x86/kernel/fpu/xstate.c      |   62 +++++++++++++++++++-------------------
 3 files changed, 41 insertions(+), 41 deletions(-)

--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -491,7 +491,7 @@ static void intel_pmu_arch_lbr_xrstors(v
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_kernel_to_dynamic_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	copy_kernel_to_independent_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static __always_inline bool lbr_is_reset_in_cstate(void *ctx)
@@ -576,7 +576,7 @@ static void intel_pmu_arch_lbr_xsaves(vo
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_dynamic_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	copy_independent_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static void __intel_pmu_lbr_save(void *ctx)
@@ -992,7 +992,7 @@ static void intel_pmu_arch_lbr_read_xsav
 		intel_pmu_store_lbr(cpuc, NULL);
 		return;
 	}
-	copy_dynamic_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
+	copy_independent_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
 
 	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
 }
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -56,7 +56,7 @@
  * - Don't set the bit corresponding to the dynamic supervisor feature in
  *   IA32_XSS at run time, since it has been set at boot time.
  */
-#define XFEATURE_MASK_DYNAMIC (XFEATURE_MASK_LBR)
+#define XFEATURE_MASK_INDEPENDENT (XFEATURE_MASK_LBR)
 
 /*
  * Unsupported supervisor features. When a supervisor feature in this mask is
@@ -66,7 +66,7 @@
 
 /* All supervisor states including supported and unsupported states. */
 #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
-				      XFEATURE_MASK_DYNAMIC | \
+				      XFEATURE_MASK_INDEPENDENT | \
 				      XFEATURE_MASK_SUPERVISOR_UNSUPPORTED)
 
 #ifdef CONFIG_X86_64
@@ -87,12 +87,12 @@ static inline u64 xfeatures_mask_user(vo
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
 
-static inline u64 xfeatures_mask_dynamic(void)
+static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
-		return XFEATURE_MASK_DYNAMIC & ~XFEATURE_MASK_LBR;
+		return XFEATURE_MASK_INDEPENDENT & ~XFEATURE_MASK_LBR;
 
-	return XFEATURE_MASK_DYNAMIC;
+	return XFEATURE_MASK_INDEPENDENT;
 }
 
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
@@ -107,7 +107,7 @@ struct membuf;
 void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
-void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask);
+void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
+void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask);
 
 #endif
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -162,7 +162,7 @@ void fpu__init_cpu_xstate(void)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
 		wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor() |
-				     xfeatures_mask_dynamic());
+				     xfeatures_mask_independent());
 	}
 }
 
@@ -541,7 +541,7 @@ static void check_xstate_against_struct(
  * how large the XSAVE buffer needs to be.  We are recalculating
  * it to be safe.
  *
- * Dynamic XSAVE features allocate their own buffers and are not
+ * Independent XSAVE features allocate their own buffers and are not
  * covered by these checks. Only the size of the buffer for task->fpu
  * is checked here.
  */
@@ -607,18 +607,18 @@ static unsigned int __init get_xsaves_si
 }
 
 /*
- * Get the total size of the enabled xstates without the dynamic supervisor
+ * Get the total size of the enabled xstates without the independent supervisor
  * features.
  */
-static unsigned int __init get_xsaves_size_no_dynamic(void)
+static unsigned int __init get_xsaves_size_no_independent(void)
 {
-	u64 mask = xfeatures_mask_dynamic();
+	u64 mask = xfeatures_mask_independent();
 	unsigned int size;
 
 	if (!mask)
 		return get_xsaves_size();
 
-	/* Disable dynamic features. */
+	/* Disable independent features. */
 	wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor());
 
 	/*
@@ -627,7 +627,7 @@ static unsigned int __init get_xsaves_si
 	 */
 	size = get_xsaves_size();
 
-	/* Re-enable dynamic features so XSAVES will work on them again. */
+	/* Re-enable independent features so XSAVES will work on them again. */
 	wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor() | mask);
 
 	return size;
@@ -670,7 +670,7 @@ static int __init init_xstate_size(void)
 	xsave_size = get_xsave_size();
 
 	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		possible_xstate_size = get_xsaves_size_no_dynamic();
+		possible_xstate_size = get_xsaves_size_no_independent();
 	else
 		possible_xstate_size = xsave_size;
 
@@ -812,7 +812,7 @@ void fpu__resume_cpu(void)
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
 		wrmsrl(MSR_IA32_XSS, xfeatures_mask_supervisor()  |
-				     xfeatures_mask_dynamic());
+				     xfeatures_mask_independent());
 	}
 }
 
@@ -1107,34 +1107,34 @@ int copy_sigframe_from_user_to_xstate(st
 }
 
 /**
- * copy_dynamic_supervisor_to_kernel() - Save dynamic supervisor states to
- *                                       an xsave area
+ * copy_independent_supervisor_to_kernel() - Save independent supervisor states to
+ *                                           an xsave area
  * @xstate: A pointer to an xsave area
- * @mask: Represent the dynamic supervisor features saved into the xsave area
+ * @mask: Represent the independent supervisor features saved into the xsave area
  *
- * Only the dynamic supervisor states sets in the mask are saved into the xsave
- * area (See the comment in XFEATURE_MASK_DYNAMIC for the details of dynamic
- * supervisor feature). Besides the dynamic supervisor states, the legacy
+ * Only the independent supervisor states sets in the mask are saved into the xsave
+ * area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of independent
+ * supervisor feature). Besides the independent supervisor states, the legacy
  * region and XSAVE header are also saved into the xsave area. The supervisor
  * features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
  * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not saved.
  *
  * The xsave area must be 64-bytes aligned.
  */
-void copy_dynamic_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
+void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
 {
-	u64 dynamic_mask = xfeatures_mask_dynamic() & mask;
+	u64 independent_mask = xfeatures_mask_independent() & mask;
 	u32 lmask, hmask;
 	int err;
 
 	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
 		return;
 
-	if (WARN_ON_FPU(!dynamic_mask))
+	if (WARN_ON_FPU(!independent_mask))
 		return;
 
-	lmask = dynamic_mask;
-	hmask = dynamic_mask >> 32;
+	lmask = independent_mask;
+	hmask = independent_mask >> 32;
 
 	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
 
@@ -1143,34 +1143,34 @@ void copy_dynamic_supervisor_to_kernel(s
 }
 
 /**
- * copy_kernel_to_dynamic_supervisor() - Restore dynamic supervisor states from
- *                                       an xsave area
+ * copy_kernel_to_independent_supervisor() - Restore independent supervisor states from
+ *                                           an xsave area
  * @xstate: A pointer to an xsave area
- * @mask: Represent the dynamic supervisor features restored from the xsave area
+ * @mask: Represent the independent supervisor features restored from the xsave area
  *
- * Only the dynamic supervisor states sets in the mask are restored from the
- * xsave area (See the comment in XFEATURE_MASK_DYNAMIC for the details of
- * dynamic supervisor feature). Besides the dynamic supervisor states, the
+ * Only the independent supervisor states sets in the mask are restored from the
+ * xsave area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of
+ * independent supervisor feature). Besides the independent supervisor states, the
  * legacy region and XSAVE header are also restored from the xsave area. The
  * supervisor features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
  * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not restored.
  *
  * The xsave area must be 64-bytes aligned.
  */
-void copy_kernel_to_dynamic_supervisor(struct xregs_state *xstate, u64 mask)
+void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask)
 {
-	u64 dynamic_mask = xfeatures_mask_dynamic() & mask;
+	u64 independent_mask = xfeatures_mask_independent() & mask;
 	u32 lmask, hmask;
 	int err;
 
 	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
 		return;
 
-	if (WARN_ON_FPU(!dynamic_mask))
+	if (WARN_ON_FPU(!independent_mask))
 		return;
 
-	lmask = dynamic_mask;
-	hmask = dynamic_mask >> 32;
+	lmask = independent_mask;
+	hmask = independent_mask >> 32;
 
 	XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
 



* [patch 22/41] x86/fpu/xstate: Sanitize handling of independent features
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (20 preceding siblings ...)
  2021-06-11 16:15 ` [patch 21/41] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 23/41] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
                   ` (19 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The copy functions for the independent features are horribly named and the
supervisor and independent part is just overengineered.

The point is that the supplied mask has either to be a subset of the
independent feature or a subset of the task->fpu.xstate managed features.

Rewrite it so it checks for invalid overlaps of these areas in the
caller-supplied feature mask. Rename it so it follows the new naming
convention for these operations. Mop up the function documentation.

This allows that function to be used for other purposes as well.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/lbr.c       |    6 +-
 arch/x86/include/asm/fpu/xstate.h |    5 +-
 arch/x86/kernel/fpu/xstate.c      |   93 +++++++++++++++++++-------------------
 3 files changed, 53 insertions(+), 51 deletions(-)

--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -491,7 +491,7 @@ static void intel_pmu_arch_lbr_xrstors(v
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_kernel_to_independent_supervisor(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	xrstors_from_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static __always_inline bool lbr_is_reset_in_cstate(void *ctx)
@@ -576,7 +576,7 @@ static void intel_pmu_arch_lbr_xsaves(vo
 {
 	struct x86_perf_task_context_arch_lbr_xsave *task_ctx = ctx;
 
-	copy_independent_supervisor_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
+	xsaves_to_kernel(&task_ctx->xsave, XFEATURE_MASK_LBR);
 }
 
 static void __intel_pmu_lbr_save(void *ctx)
@@ -992,7 +992,7 @@ static void intel_pmu_arch_lbr_read_xsav
 		intel_pmu_store_lbr(cpuc, NULL);
 		return;
 	}
-	copy_independent_supervisor_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
+	xsaves_to_kernel(&xsave->xsave, XFEATURE_MASK_LBR);
 
 	intel_pmu_store_lbr(cpuc, xsave->lbr.entries);
 }
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -107,7 +107,8 @@ struct membuf;
 void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
-void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask);
-void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask);
+
+void xsaves_to_kernel(struct xregs_state *xsave, u64 mask);
+void xrstors_from_kernel(struct xregs_state *xsave, u64 mask);
 
 #endif
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1107,75 +1107,76 @@ int copy_sigframe_from_user_to_xstate(st
 }
 
 /**
- * copy_independent_supervisor_to_kernel() - Save independent supervisor states to
- *                                           an xsave area
- * @xstate: A pointer to an xsave area
- * @mask: Represent the independent supervisor features saved into the xsave area
+ * xsaves_to_kernel - Save selected components to a kernel xstate buffer
+ * @xstate:	Pointer to the buffer
+ * @mask:	Feature mask to select the components to save
  *
- * Only the independent supervisor states sets in the mask are saved into the xsave
- * area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of independent
- * supervisor feature). Besides the independent supervisor states, the legacy
- * region and XSAVE header are also saved into the xsave area. The supervisor
- * features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
- * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not saved.
+ * The @xstate buffer must be 64 byte aligned and correctly initialized as
+ * XSAVES does not write the full xstate header. Before first use the
+ * buffer should be zeroed, otherwise a consecutive XRSTORS from that buffer
+ * can #GP.
  *
- * The xsave area must be 64-bytes aligned.
+ * The feature mask must either be a subset of the independent features or
+ * a subset of the task->fpstate related features.
  */
-void copy_independent_supervisor_to_kernel(struct xregs_state *xstate, u64 mask)
+void xsaves_to_kernel(struct xregs_state *xstate, u64 mask)
 {
-	u64 independent_mask = xfeatures_mask_independent() & mask;
-	u32 lmask, hmask;
+	u64 xchk;
 	int err;
 
-	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
+	if (WARN_ON_FPU(!cpu_feature_enabled(X86_FEATURE_XSAVES)))
 		return;
+	/*
+	 * Validate that this is either a task->fpstate related component
+	 * subset or an independent one.
+	 */
+	if (mask & xfeatures_mask_independent())
+		xchk = ~xfeatures_mask_independent();
+	else
+		xchk = ~xfeatures_mask_all;
 
-	if (WARN_ON_FPU(!independent_mask))
+	if (WARN_ON_ONCE(!mask || mask & xchk))
 		return;
 
-	lmask = independent_mask;
-	hmask = independent_mask >> 32;
-
-	XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
-
-	/* Should never fault when copying to a kernel buffer */
-	WARN_ON_FPU(err);
+	XSTATE_OP(XSAVES, xstate, (u32)mask, (u32)(mask >> 32), err);
+	WARN_ON_ONCE(err);
 }
 
 /**
- * copy_kernel_to_independent_supervisor() - Restore independent supervisor states from
- *                                           an xsave area
- * @xstate: A pointer to an xsave area
- * @mask: Represent the independent supervisor features restored from the xsave area
+ * xrstors_from_kernel - Restore selected components from a kernel xstate buffer
+ * @xstate:	Pointer to the buffer
+ * @mask:	Feature mask to select the components to restore
+ *
+ * The @xstate buffer must be 64 byte aligned and correctly initialized
+ * otherwise XRSTORS from that buffer can #GP.
  *
- * Only the independent supervisor states sets in the mask are restored from the
- * xsave area (See the comment in XFEATURE_MASK_INDEPENDENT for the details of
- * independent supervisor feature). Besides the independent supervisor states, the
- * legacy region and XSAVE header are also restored from the xsave area. The
- * supervisor features in the XFEATURE_MASK_SUPERVISOR_SUPPORTED and
- * XFEATURE_MASK_SUPERVISOR_UNSUPPORTED are not restored.
+ * Proper usage is to restore the state which was saved with
+ * xsaves_to_kernel() into @xstate.
  *
- * The xsave area must be 64-bytes aligned.
+ * The feature mask must either be a subset of the independent features or
+ * a subset of the task->fpstate related features.
  */
-void copy_kernel_to_independent_supervisor(struct xregs_state *xstate, u64 mask)
+void xrstors_from_kernel(struct xregs_state *xstate, u64 mask)
 {
-	u64 independent_mask = xfeatures_mask_independent() & mask;
-	u32 lmask, hmask;
+	u64 xchk;
 	int err;
 
-	if (WARN_ON_FPU(!boot_cpu_has(X86_FEATURE_XSAVES)))
+	if (WARN_ON_FPU(!cpu_feature_enabled(X86_FEATURE_XSAVES)))
 		return;
+	/*
+	 * Validate that this is either a task->fpstate related component
+	 * subset or an independent one.
+	 */
+	if (mask & xfeatures_mask_independent())
+		xchk = ~xfeatures_mask_independent();
+	else
+		xchk = ~xfeatures_mask_all;
 
-	if (WARN_ON_FPU(!independent_mask))
+	if (WARN_ON_ONCE(!mask || mask & xchk))
 		return;
 
-	lmask = independent_mask;
-	hmask = independent_mask >> 32;
-
-	XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
-
-	/* Should never fault when copying from a kernel buffer */
-	WARN_ON_FPU(err);
+	XSTATE_OP(XRSTORS, xstate, (u32)mask, (u32)(mask >> 32), err);
+	WARN_ON_ONCE(err);
 }
 
 #ifdef CONFIG_PROC_PID_ARCH_STATUS


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 23/41] x86/pkeys: Move read_pkru() and write_pkru()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (21 preceding siblings ...)
  2021-06-11 16:15 ` [patch 22/41] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 24/41] x86/fpu: Differentiate "copy" versus "move" of fpregs Thomas Gleixner
                   ` (18 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

write_pkru() was originally used just to write to the PKRU register.  It
was mercifully short and sweet and was not out of place in pgtable.h with
some other pkey-related code.

But, later work included a requirement to also modify the task XSAVE
buffer when updating the register.  This really is more related to the
XSAVE architecture than to paging.

Move the read/write_pkru() to asm/pkru.h.  pgtable.h won't miss them.
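For reference, the bit semantics of the helpers being moved can be sketched stand-alone (simplified stand-ins for the kernel's `__pkru_allows_read()`/`__pkru_allows_write()` inlines, same bit layout):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PKRU holds two bits per protection key: access-disable (AD) and
 * write-disable (WD). AD implies no writes either. */
#define PKRU_AD_BIT		0x1
#define PKRU_WD_BIT		0x2
#define PKRU_BITS_PER_PKEY	2

static bool pkru_allows_read(uint32_t pkru, uint16_t pkey)
{
	return !(pkru & (PKRU_AD_BIT << (pkey * PKRU_BITS_PER_PKEY)));
}

static bool pkru_allows_write(uint32_t pkru, uint16_t pkey)
{
	/* Access-disable disables writes too, so check both bits */
	return !(pkru & ((PKRU_AD_BIT | PKRU_WD_BIT) <<
			 (pkey * PKRU_BITS_PER_PKEY)));
}
```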

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Adapted it to the latest FPU changes, moved them to a separate file
    along with the constants to avoid header hell later on.
---

 arch/x86/include/asm/fpu/xstate.h |    1 
 arch/x86/include/asm/pgtable.h    |   57 -----------------------------------
 arch/x86/include/asm/pkru.h       |   61 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/process_64.c      |    1 
 arch/x86/kvm/svm/sev.c            |    1 
 arch/x86/kvm/x86.c                |    1 
 arch/x86/mm/pkeys.c               |    1 
 7 files changed, 67 insertions(+), 56 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -6,6 +6,7 @@
 #include <linux/types.h>
 
 #include <asm/processor.h>
+#include <asm/fpu/api.h>
 #include <asm/user.h>
 
 /* Bit 63 of XCR0 is reserved for future expansion */
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,7 +23,7 @@
 
 #ifndef __ASSEMBLY__
 #include <asm/x86_init.h>
-#include <asm/fpu/xstate.h>
+#include <asm/pkru.h>
 #include <asm/fpu/api.h>
 #include <asm-generic/pgtable_uffd.h>
 
@@ -126,35 +126,6 @@ static inline int pte_dirty(pte_t pte)
 	return pte_flags(pte) & _PAGE_DIRTY;
 }
 
-
-static inline u32 read_pkru(void)
-{
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		return rdpkru();
-	return 0;
-}
-
-static inline void write_pkru(u32 pkru)
-{
-	struct pkru_state *pk;
-
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
-		return;
-
-	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
-
-	/*
-	 * The PKRU value in xstate needs to be in sync with the value that is
-	 * written to the CPU. The FPU restore on return to userland would
-	 * otherwise load the previous value again.
-	 */
-	fpregs_lock();
-	if (pk)
-		pk->pkru = pkru;
-	__write_pkru(pkru);
-	fpregs_unlock();
-}
-
 static inline int pte_young(pte_t pte)
 {
 	return pte_flags(pte) & _PAGE_ACCESSED;
@@ -1360,32 +1331,6 @@ static inline pmd_t pmd_swp_clear_uffd_w
 }
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
-#define PKRU_AD_BIT 0x1
-#define PKRU_WD_BIT 0x2
-#define PKRU_BITS_PER_PKEY 2
-
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-extern u32 init_pkru_value;
-#else
-#define init_pkru_value	0
-#endif
-
-static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
-{
-	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
-}
-
-static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
-{
-	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
-	/*
-	 * Access-disable disables writes too so we need to check
-	 * both bits here.
-	 */
-	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
-}
-
 static inline u16 pte_flags_pkey(unsigned long pte_flags)
 {
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
--- /dev/null
+++ b/arch/x86/include/asm/pkru.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_PKRU_H
+#define _ASM_X86_PKRU_H
+
+#include <asm/fpu/xstate.h>
+
+#define PKRU_AD_BIT 0x1
+#define PKRU_WD_BIT 0x2
+#define PKRU_BITS_PER_PKEY 2
+
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+extern u32 init_pkru_value;
+#else
+#define init_pkru_value	0
+#endif
+
+static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+	return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
+}
+
+static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
+{
+	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
+	/*
+	 * Access-disable disables writes too so we need to check
+	 * both bits here.
+	 */
+	return !(pkru & ((PKRU_AD_BIT|PKRU_WD_BIT) << pkru_pkey_bits));
+}
+
+static inline u32 read_pkru(void)
+{
+	if (boot_cpu_has(X86_FEATURE_OSPKE))
+		return rdpkru();
+	return 0;
+}
+
+static inline void write_pkru(u32 pkru)
+{
+	struct pkru_state *pk;
+
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return;
+
+	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
+
+	/*
+	 * The PKRU value in xstate needs to be in sync with the value that is
+	 * written to the CPU. The FPU restore on return to userland would
+	 * otherwise load the previous value again.
+	 */
+	fpregs_lock();
+	if (pk)
+		pk->pkru = pkru;
+	__write_pkru(pkru);
+	fpregs_unlock();
+}
+
+#endif
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -41,6 +41,7 @@
 #include <linux/syscalls.h>
 
 #include <asm/processor.h>
+#include <asm/pkru.h>
 #include <asm/fpu/internal.h>
 #include <asm/mmu_context.h>
 #include <asm/prctl.h>
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -19,6 +19,7 @@
 #include <linux/trace_events.h>
 #include <asm/fpu/internal.h>
 
+#include <asm/pkru.h>
 #include <asm/trapnr.h>
 
 #include "x86.h"
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -65,6 +65,7 @@
 #include <asm/msr.h>
 #include <asm/desc.h>
 #include <asm/mce.h>
+#include <asm/pkru.h>
 #include <linux/kernel_stat.h>
 #include <asm/fpu/internal.h> /* Ugh! */
 #include <asm/pvclock.h>
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,6 +10,7 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
+#include <asm/pkru.h>			/* read/write_pkru()		*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {



* [patch 24/41] x86/fpu: Differentiate "copy" versus "move" of fpregs
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (22 preceding siblings ...)
  2021-06-11 16:15 ` [patch 23/41] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 25/41] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
                   ` (17 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

There are three ways in the ISA to bulk save FPU state:
 1. XSAVE, every CPU newer than 2008 does this
 2. FXSAVE, from ~2000->2007
 3. FNSAVE, pre-2000

XSAVE and FXSAVE are nice.  They just copy FPU state to memory.  FNSAVE is
nasty; it destroys the FPU state when it writes it to memory.  It is more
of a "move".

Currently, copy_fpregs_to_fpstate() returns a number to its caller to say
whether it used the nice, non-destructive XSAVE/FXSAVE or used the mean,
clobbering FNSAVE.  Some sites need special handling for the FNSAVE case to
restore any FNSAVE-clobbered state.  Others don't care, like when they are
about to load new state anyway.

The nasty part about the copy_fpregs_to_fpstate() interface is that it's
hard to tell if callers expect the "move" or the "copy" behavior.

Create a new, explicit "move" interface for callers that can handle
clobbering register state.  Make "copy" only do copies and never clobber
register state.

== switch_fpu_prepare() optimization ==

switch_fpu_prepare() had a nice optimization for the FNSAVE case.  It can
handle either clobbering or preserving register state.  For the
XSAVE/FXSAVE case, it records that the fpregs state is still loaded, just
in case a later "restore" operation can be elided.  For the FNSAVE case, it
marks the fpregs as not loaded on the CPU, since they were clobbered.

Instead of having switch_fpu_prepare() modify its behavior based on whether
registers were clobbered or not, simply switch its behavior based on
whether FNSAVE is in use.  This makes it much more clear what is going on
and what the common path is.

It would be simpler to just remove this FNSAVE optimization: Always save
and restore in the FNSAVE case.  This may incur the cost of the restore
even in cases where the restored state is never used.  But, it would only
hurt painfully ancient (>20 years old) processors.
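The copy-versus-move distinction can be sketched with a toy user-space model (all names and values here are invented; the real code operates on live FPU register state):

```c
#include <assert.h>
#include <stdbool.h>

static int fpregs = 42;	/* stands in for live FPU register state */
static int fpstate;	/* stands in for the in-memory save area */
static bool use_fnsave;	/* true models the ancient FNSAVE path */

/* Returns true when the register state was clobbered by the save */
static bool clobber_save(void)
{
	fpstate = fpregs;
	if (use_fnsave) {
		fpregs = 0;	/* FNSAVE destroys the registers */
		return true;
	}
	return false;		/* XSAVE/FXSAVE leave them intact */
}

/* "move": the caller tolerates clobbered registers */
static void save_move(void)
{
	clobber_save();
}

/* "copy": the caller needs the registers preserved */
static void save_copy(void)
{
	if (clobber_save())
		fpregs = fpstate;	/* undo the FNSAVE clobbering */
}
```

On XSAVE/FXSAVE hardware the restore in save_copy() never runs, so the common path stays cheap; only the FNSAVE path pays for the extra restore.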

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Picked it up from Dave's series and adapted it to the other changes.
---

 arch/x86/include/asm/fpu/internal.h |   14 ++++--
 arch/x86/kernel/fpu/core.c          |   84 +++++++++++++++++++++---------------
 2 files changed, 59 insertions(+), 39 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -395,7 +395,8 @@ static inline int xrstor_from_kernel_err
 	return err;
 }
 
-extern int save_fpregs_to_fpstate(struct fpu *fpu);
+extern void save_fpregs_to_fpstate(struct fpu *fpu);
+extern void copy_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
 {
@@ -527,10 +528,15 @@ static inline void __fpregs_load_activat
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!save_fpregs_to_fpstate(old_fpu))
-			old_fpu->last_cpu = -1;
-		else
+		/*
+		 * Avoid the expense of restoring fpregs with FNSAVE when it
+		 * might be unnecessary. XSAVE and FXSAVE preserve the FPU state.
+		 */
+		save_fpregs_to_fpstate(old_fpu);
+		if (likely(use_xsave() || use_fxsr()))
 			old_fpu->last_cpu = cpu;
+		else
+			old_fpu->last_cpu = -1;
 
 		/* But leave fpu_fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -83,16 +83,17 @@ bool irq_fpu_usable(void)
 EXPORT_SYMBOL(irq_fpu_usable);
 
 /*
- * These must be called with preempt disabled. Returns
- * 'true' if the FPU state is still intact and we can
- * keep registers active.
- *
- * The legacy FNSAVE instruction cleared all FPU state
- * unconditionally, so registers are essentially destroyed.
- * Modern FPU state can be kept in registers, if there are
- * no pending FP exceptions.
+ * Must be called with fpregs locked.
+ *
+ * Returns 'true' if the FPU state has been clobbered and the register
+ * contents are lost.
+ *
+ * The legacy FNSAVE instruction clobbers all FPU state unconditionally, so
+ * registers are essentially destroyed.
+ *
+ * XSAVE and FXSAVE preserve register contents.
  */
-int save_fpregs_to_fpstate(struct fpu *fpu)
+static bool __clobber_save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		xsave_to_kernel(&fpu->state.xsave);
@@ -103,23 +104,46 @@ int save_fpregs_to_fpstate(struct fpu *f
 		 */
 		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
 			fpu->avx512_timestamp = jiffies;
-		return 1;
+		return false;
 	}
 
 	if (likely(use_fxsr())) {
 		fxsave_to_kernel(fpu);
-		return 1;
+		return false;
 	}
 
-	/*
-	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
-	 * so we have to mark them inactive:
-	 */
+	/* Legacy FPU register saving, FNSAVE always clears FPU registers */
 	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
 
-	return 0;
+	return true;
+}
+
+/**
+ * save_fpregs_to_fpstate - Save fpregs in fpstate
+ * @fpu:	Pointer to FPU context
+ *
+ * Hardware register state might be clobbered when the
+ * function returns.
+ */
+void save_fpregs_to_fpstate(struct fpu *fpu)
+{
+	__clobber_save_fpregs_to_fpstate(fpu);
+}
+EXPORT_SYMBOL_GPL(save_fpregs_to_fpstate);
+
+/**
+ * copy_fpregs_to_fpstate - Copy fpregs to fpstate
+ * @fpu:	Pointer to FPU context
+ *
+ * Guarantees that the hardware register state is preserved.
+ */
+void copy_fpregs_to_fpstate(struct fpu *fpu)
+{
+	bool clobbered = __clobber_save_fpregs_to_fpstate(fpu);
+
+	if (clobbered)
+		restore_fpregs_from_fpstate(&fpu->state);
 }
-EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
@@ -133,10 +157,6 @@ void kernel_fpu_begin_mask(unsigned int
 	if (!(current->flags & PF_KTHREAD) &&
 	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		set_thread_flag(TIF_NEED_FPU_LOAD);
-		/*
-		 * Ignore return value -- we don't care if reg state
-		 * is clobbered.
-		 */
 		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
@@ -160,7 +180,8 @@ void kernel_fpu_end(void)
 EXPORT_SYMBOL_GPL(kernel_fpu_end);
 
 /*
- * Save the FPU state (mark it for reload if necessary):
+ * Save the FPU register state. If the registers are active then they are
+ * preserved.
  *
  * This only ever gets called for the current task.
  */
@@ -171,11 +192,8 @@ void fpu__save(struct fpu *fpu)
 	fpregs_lock();
 	trace_x86_fpu_before_save(fpu);
 
-	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!save_fpregs_to_fpstate(fpu)) {
-			restore_fpregs_from_fpstate(&fpu->state);
-		}
-	}
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		copy_fpregs_to_fpstate(fpu);
 
 	trace_x86_fpu_after_save(fpu);
 	fpregs_unlock();
@@ -245,18 +263,14 @@ int fpu__copy(struct task_struct *dst, s
 
 	/*
 	 * If the FPU registers are not current just memcpy() the state.
-	 * Otherwise save current FPU registers directly into the child's FPU
-	 * context, without any memory-to-memory copying.
-	 *
-	 * ( The function 'fails' in the FNSAVE case, which destroys
-	 *   register contents so we have to load them back. )
+	 * Otherwise copy current FPU registers directly into the child's
+	 * FPU context.
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
-
-	else if (!save_fpregs_to_fpstate(dst_fpu))
-		restore_fpregs_from_fpstate(&dst_fpu->state);
+	else
+		copy_fpregs_to_fpstate(dst_fpu);
 
 	fpregs_unlock();
 



* [patch 25/41] x86/cpu: Sanitize X86_FEATURE_OSPKE
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (23 preceding siblings ...)
  2021-06-11 16:15 ` [patch 24/41] x86/fpu: Differentiate "copy" versus "move" of fpregs Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 26/41] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
                   ` (16 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

X86_FEATURE_OSPKE is enabled first on the boot CPU and the feature flag is
set. Secondary CPUs have to enable CR4.PKE as well and set their per CPU
feature flag. That's ineffective because all call sites have checks for
boot_cpu_data.

Make it smarter and force the feature flag when PKU is enabled on the boot
CPU, which then allows the use of cpu_feature_enabled(X86_FEATURE_OSPKE)
all over the place. That either compiles the code out when PKEY support is
disabled in Kconfig or uses a static_cpu_has() for the feature check, which
makes a significant difference in hot paths, e.g. context switch.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/include/asm/pkeys.h |    8 ++++----
 arch/x86/include/asm/pkru.h  |    4 ++--
 arch/x86/kernel/cpu/common.c |   24 +++++++++++-------------
 arch/x86/kernel/fpu/core.c   |    2 +-
 arch/x86/kernel/fpu/xstate.c |    2 +-
 arch/x86/kernel/process_64.c |    2 +-
 arch/x86/mm/fault.c          |    2 +-
 7 files changed, 21 insertions(+), 23 deletions(-)

--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -9,14 +9,14 @@
  * will be necessary to ensure that the types that store key
  * numbers and masks have sufficient capacity.
  */
-#define arch_max_pkey() (boot_cpu_has(X86_FEATURE_OSPKE) ? 16 : 1)
+#define arch_max_pkey() (cpu_feature_enabled(X86_FEATURE_OSPKE) ? 16 : 1)
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
 
 static inline bool arch_pkeys_enabled(void)
 {
-	return boot_cpu_has(X86_FEATURE_OSPKE);
+	return cpu_feature_enabled(X86_FEATURE_OSPKE);
 }
 
 /*
@@ -26,7 +26,7 @@ static inline bool arch_pkeys_enabled(vo
 extern int __execute_only_pkey(struct mm_struct *mm);
 static inline int execute_only_pkey(struct mm_struct *mm)
 {
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return ARCH_DEFAULT_PKEY;
 
 	return __execute_only_pkey(mm);
@@ -37,7 +37,7 @@ extern int __arch_override_mprotect_pkey
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 		int prot, int pkey)
 {
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return 0;
 
 	return __arch_override_mprotect_pkey(vma, prot, pkey);
--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -32,7 +32,7 @@ static inline bool __pkru_allows_write(u
 
 static inline u32 read_pkru(void)
 {
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return rdpkru();
 	return 0;
 }
@@ -41,7 +41,7 @@ static inline void write_pkru(u32 pkru)
 {
 	struct pkru_state *pk;
 
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
 
 	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -465,22 +465,20 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
-	/* check the boot processor, plus compile options for PKU: */
-	if (!cpu_feature_enabled(X86_FEATURE_PKU))
-		return;
-	/* checks the actual processor's cpuid bits: */
-	if (!cpu_has(c, X86_FEATURE_PKU))
-		return;
-	if (pku_disabled)
+	if (c == &boot_cpu_data) {
+		if (pku_disabled || !cpu_feature_enabled(X86_FEATURE_PKU))
+			return;
+		/*
+		 * Setting CR4.PKE will cause the X86_FEATURE_OSPKE cpuid
+		 * bit to be set.  Enforce it.
+		 */
+		setup_force_cpu_cap(X86_FEATURE_OSPKE);
+
+	} else if (!cpu_feature_enabled(X86_FEATURE_OSPKE)) {
 		return;
+	}
 
 	cr4_set_bits(X86_CR4_PKE);
-	/*
-	 * Setting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
-	 * cpuid bit to be set.  We need to ensure that we
-	 * update that bit in this CPU's "cpu_info".
-	 */
-	set_cpu_cap(c, X86_FEATURE_OSPKE);
 }
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -378,7 +378,7 @@ static inline void copy_init_fpstate_to_
 	else
 		frstor_from_kernel(&init_fpstate.fsave);
 
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
 }
 
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -896,7 +896,7 @@ int arch_set_user_pkey_access(struct tas
 	 * This check implies XSAVE support.  OSPKE only gets
 	 * set if we enable XSAVE and we enable PKU in XCR0.
 	 */
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return -EINVAL;
 
 	/*
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -137,7 +137,7 @@ void __show_regs(struct pt_regs *regs, e
 		       log_lvl, d3, d6, d7);
 	}
 
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
 		printk("%sPKRU: %08x\n", log_lvl, read_pkru());
 }
 
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -875,7 +875,7 @@ static inline bool bad_area_access_from_
 	/* This code is always called on the current mm */
 	bool foreign = false;
 
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return false;
 	if (error_code & X86_PF_PK)
 		return true;



* [patch 26/41] x86/pkru: Provide pkru_get_init_value()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (24 preceding siblings ...)
  2021-06-11 16:15 ` [patch 25/41] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 27/41] x86/pkru: Provide pkru_write_default() Thomas Gleixner
                   ` (15 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

When CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS is disabled then the following
code fails to compile:

     if (cpu_feature_enabled(X86_FEATURE_OSPKE)) {
     	u32 pkru = READ_ONCE(init_pkru_value);
	..
     }

because init_pkru_value is defined as '0', which makes READ_ONCE() upset.

Provide an accessor macro to avoid #ifdeffery all over the place.
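The pattern can be sketched in user space (the config symbol and the value are illustrative, and READ_ONCE() is modeled with a plain volatile access; only the lvalue-vs-literal distinction matters here):

```c
#include <assert.h>
#include <stdint.h>

/* READ_ONCE() needs an lvalue, so a bare '#define init_pkru_value 0'
 * cannot be passed to it. The config-dependent accessor macro hides
 * that difference from callers. */
#ifdef CONFIG_PKEYS_SKETCH	/* illustrative config switch */
uint32_t init_pkru_value = 0x55555554;
#define pkru_get_init_value()	(*(volatile uint32_t *)&init_pkru_value)
#else
#define init_pkru_value		0
#define pkru_get_init_value()	0	/* no lvalue needed here */
#endif
```

Callers just use pkru_get_init_value() and never see an #ifdef.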

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/include/asm/pkru.h |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -10,8 +10,10 @@
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
 extern u32 init_pkru_value;
+#define pkru_get_init_value()	READ_ONCE(init_pkru_value)
 #else
 #define init_pkru_value	0
+#define pkru_get_init_value()	0
 #endif
 
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)



* [patch 27/41] x86/pkru: Provide pkru_write_default()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (25 preceding siblings ...)
  2021-06-11 16:15 ` [patch 26/41] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 28/41] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
                   ` (14 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Provide a simple and trivial helper which just writes the PKRU default
value without trying to fiddle with the task's xsave buffer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/pkru.h |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -60,4 +60,12 @@ static inline void write_pkru(u32 pkru)
 	fpregs_unlock();
 }
 
+static inline void pkru_write_default(void)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	wrpkru(pkru_get_init_value());
+}
+
 #endif



* [patch 28/41] x86/cpu: Write the default PKRU value when enabling PKE
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (26 preceding siblings ...)
  2021-06-11 16:15 ` [patch 27/41] x86/pkru: Provide pkru_write_default() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 29/41] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
                   ` (13 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

In preparation for making the PKRU management more independent from XSTATES,
write the default PKRU value into the hardware right after enabling PKE in
CR4. This ensures that switch_to() and copy_thread() have the correct
setting for the init task and the per-CPU idle threads right away.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/cpu/common.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -479,6 +479,8 @@ static __always_inline void setup_pku(st
 	}
 
 	cr4_set_bits(X86_CR4_PKE);
+	/* Load the default PKRU value */
+	pkru_write_default();
 }
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS



* [patch 29/41] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (27 preceding siblings ...)
  2021-06-11 16:15 ` [patch 28/41] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 30/41] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
                   ` (12 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

There is no point in using copy_init_pkru_to_fpregs(), which in turn calls
write_pkru(). write_pkru() fiddles with the task's xstate buffer to no
avail because the preceding XRSTOR[S](init_fpstate) just cleared the
xfeature flag in the xstate header, which makes get_xsave_addr() fail.

It's a pointless exercise anyway because the reinitialization activates the
FPU, so before the task's xstate buffer can be used again an XRSTOR[S] must
happen, which in turn clobbers the PKRU value.

Get rid of the now unused copy_init_pkru_to_fpregs().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/pkeys.h |    1 -
 arch/x86/kernel/fpu/core.c   |    3 +--
 arch/x86/mm/pkeys.c          |   17 -----------------
 include/linux/pkeys.h        |    4 ----
 4 files changed, 1 insertion(+), 24 deletions(-)

--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -124,7 +124,6 @@ extern int arch_set_user_pkey_access(str
 		unsigned long init_val);
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
-extern void copy_init_pkru_to_fpregs(void);
 
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -378,8 +378,7 @@ static inline void copy_init_fpstate_to_
 	else
 		frstor_from_kernel(&init_fpstate.fsave);
 
-	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
-		copy_init_pkru_to_fpregs();
+	pkru_write_default();
 }
 
 /*
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -10,7 +10,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/pkru.h>			/* read/write_pkru()		*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -125,22 +124,6 @@ u32 init_pkru_value = PKRU_AD_KEY( 1) |
 		      PKRU_AD_KEY(10) | PKRU_AD_KEY(11) | PKRU_AD_KEY(12) |
 		      PKRU_AD_KEY(13) | PKRU_AD_KEY(14) | PKRU_AD_KEY(15);
 
-/*
- * Called from the FPU code when creating a fresh set of FPU
- * registers.  This is called from a very specific context where
- * we know the FPU registers are safe for use and we can use PKRU
- * directly.
- */
-void copy_init_pkru_to_fpregs(void)
-{
-	u32 init_pkru_value_snapshot = READ_ONCE(init_pkru_value);
-	/*
-	 * Override the PKRU state that came from 'init_fpstate'
-	 * with the baseline from the process.
-	 */
-	write_pkru(init_pkru_value_snapshot);
-}
-
 static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf,
 			     size_t count, loff_t *ppos)
 {
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -44,10 +44,6 @@ static inline bool arch_pkeys_enabled(vo
 	return false;
 }
 
-static inline void copy_init_pkru_to_fpregs(void)
-{
-}
-
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 #endif /* _LINUX_PKEYS_H */



* [patch 30/41] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (28 preceding siblings ...)
  2021-06-11 16:15 ` [patch 29/41] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 31/41] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
                   ` (11 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Make it clear what the function is about.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: New patch
---
 arch/x86/include/asm/fpu/internal.h |    3 ++-
 arch/x86/kernel/fpu/core.c          |    4 ++--
 arch/x86/kernel/process.c           |    2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -33,9 +33,10 @@ extern int  fpu__restore_sig(void __user
 extern void fpu__drop(struct fpu *fpu);
 extern int  fpu__copy(struct task_struct *dst, struct task_struct *src);
 extern void fpu__clear_user_states(struct fpu *fpu);
-extern void fpu__clear_all(struct fpu *fpu);
 extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);
 
+extern void fpu_flush_thread(void);
+
 /*
  * Boot time FPU initialization functions:
  */
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -407,9 +407,9 @@ void fpu__clear_user_states(struct fpu *
 	fpu__clear(fpu, true);
 }
 
-void fpu__clear_all(struct fpu *fpu)
+void fpu_flush_thread(void)
 {
-	fpu__clear(fpu, false);
+	fpu__clear(&current->thread.fpu, false);
 }
 
 /*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -206,7 +206,7 @@ void flush_thread(void)
 	flush_ptrace_hw_breakpoint(tsk);
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
-	fpu__clear_all(&tsk->thread.fpu);
+	fpu_flush_thread();
 }
 
 void disable_TSC(void)



* [patch 31/41] x86/fpu: Clean up the fpu__clear() variants
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (29 preceding siblings ...)
  2021-06-11 16:15 ` [patch 30/41] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 32/41] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
                   ` (10 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Andy Lutomirski <luto@kernel.org>

fpu__clear() currently resets both register state and kernel XSAVE buffer
state.  It has two modes: one for all state (supervisor and user) and
another for user state only.  fpu__clear_all() uses the "all state"
(user_only=0) mode, while a number of signal paths use the user_only=1
mode.

Make fpu__clear() work only for user state (user_only=1) and remove the
"all state" (user_only=0) code.  Rename it to match so it can be used by
the signal paths.

Replace the "all state" (user_only=0) fpu__clear() functionality.  Use the
TIF_NEED_FPU_LOAD mechanism instead of making any actual hardware register
changes in this path.

Instead of invoking fpu__initialize(), just memcpy() init_fpstate into the
task's FPU state because it already has the correct format and, in the case
of PKRU, also contains the default PKRU value. Move the actual PKRU write
out into flush_thread() where it belongs and where it will end up anyway
once PKRU and XSTATE have been disentangled.

For bisectability a workaround is required which stores the PKRU value in
the xstate memory until PKRU is disentangled from XSTATE for context
switching and return to user.

[ Dave Hansen: Polished changelog ]
[ tglx: Fixed the PKRU fallout ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Fixed PKRU fallout reported by Yu-cheng
V4: Minor adjustment to PKRU handling
---
 arch/x86/kernel/fpu/core.c |  113 ++++++++++++++++++++++++++++++---------------
 arch/x86/kernel/process.c  |   10 +++
 2 files changed, 87 insertions(+), 36 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -283,19 +283,6 @@ int fpu__copy(struct task_struct *dst, s
 }
 
 /*
- * Activate the current task's in-memory FPU context,
- * if it has not been used before:
- */
-static void fpu__initialize(struct fpu *fpu)
-{
-	WARN_ON_FPU(fpu != &current->thread.fpu);
-
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-	fpstate_init(&fpu->state);
-	trace_x86_fpu_init_state(fpu);
-}
-
-/*
  * This function must be called before we read a task's fpstate.
  *
  * There's two cases where this gets called:
@@ -381,46 +368,100 @@ static inline void load_fpregs_from_init
 	pkru_write_default();
 }
 
+static inline unsigned int init_fpstate_copy_size(void)
+{
+	if (!use_xsave())
+		return fpu_kernel_xstate_size;
+
+	/* XSAVE(S) just needs the legacy and the xstate header part */
+	return sizeof(init_fpstate.xsave);
+}
+
+/* Temporary workaround. Will be removed once PKRU and XSTATE are disentangled. */
+static inline void pkru_set_default_in_xstate(struct xregs_state *xsave)
+{
+	struct pkru_state *pk;
+
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+	/*
+	 * Force XFEATURE_PKRU to be set in the header otherwise
+	 * get_xsave_addr() does not work and it also needs to be set to
+	 * make XRSTOR(S) load it.
+	 */
+	xsave->header.xfeatures |= XFEATURE_MASK_PKRU;
+	pk = get_xsave_addr(xsave, XFEATURE_PKRU);
+	pk->pkru = pkru_get_init_value();
+}
+
 /*
- * Clear the FPU state back to init state.
- *
- * Called by sys_execve(), by the signal handler code and by various
- * error paths.
+ * Reset current->fpu memory state to the init values.
  */
-static void fpu__clear(struct fpu *fpu, bool user_only)
+static void fpu_reset_fpstate(void)
+{
+	struct fpu *fpu = &current->thread.fpu;
+
+	fpregs_lock();
+	fpu__drop(fpu);
+	/*
+	 * This does not change the actual hardware registers. It just
+	 * resets the memory image and sets TIF_NEED_FPU_LOAD so a
+	 * subsequent return to usermode will reload the registers from the
+	 * task's memory image.
+	 *
+	 * Do not use fpstate_init() here. Just copy init_fpstate which has
+	 * the correct content already except for PKRU.
+	 */
+	memcpy(&fpu->state, &init_fpstate, init_fpstate_copy_size());
+	pkru_set_default_in_xstate(&fpu->state.xsave);
+	set_thread_flag(TIF_NEED_FPU_LOAD);
+	fpregs_unlock();
+}
+
+/*
+ * Reset current's user FPU states to the init states.  current's
+ * supervisor states, if any, are not modified by this function.  The
+ * caller guarantees that the XSTATE header in memory is intact.
+ */
+void fpu__clear_user_states(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
+	fpregs_lock();
 	if (!static_cpu_has(X86_FEATURE_FPU)) {
-		fpu__drop(fpu);
-		fpu__initialize(fpu);
+		fpu_reset_fpstate();
+		fpregs_unlock();
 		return;
 	}
 
-	fpregs_lock();
-
-	if (user_only) {
-		if (!fpregs_state_valid(fpu, smp_processor_id()) &&
-		    xfeatures_mask_supervisor())
-			xrstor_from_kernel(&fpu->state.xsave,
-					     xfeatures_mask_supervisor());
-		load_fpregs_from_init_fpstate(xfeatures_mask_user());
-	} else {
-		load_fpregs_from_init_fpstate(xfeatures_mask_all);
+	/*
+	 * Ensure that current's supervisor states are loaded into their
+	 * corresponding registers.
+	 */
+	if (xfeatures_mask_supervisor() &&
+	    !fpregs_state_valid(fpu, smp_processor_id())) {
+		xrstor_from_kernel(&fpu->state.xsave,
+				   xfeatures_mask_supervisor());
 	}
 
+	/* Reset user states in registers. */
+	load_fpregs_from_init_fpstate(xfeatures_mask_user());
+
+	/*
+	 * Now all FPU registers have their desired values.  Inform the FPU
+	 * state machine that current's FPU registers are in the hardware
+	 * registers. The memory image does not need to be updated because
+	 * any operation relying on it has to save the registers first when
+	 * current's FPU is marked active.
+	 */
 	fpregs_mark_activate();
-	fpregs_unlock();
-}
 
-void fpu__clear_user_states(struct fpu *fpu)
-{
-	fpu__clear(fpu, true);
+	fpregs_unlock();
 }
 
 void fpu_flush_thread(void)
 {
-	fpu__clear(&current->thread.fpu, false);
+	fpu_reset_fpstate();
 }
 
 /*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -199,6 +199,15 @@ int copy_thread(unsigned long clone_flag
 	return ret;
 }
 
+static void pkru_flush_thread(void)
+{
+	/*
+	 * If PKRU is enabled the default PKRU value has to be loaded into
+	 * the hardware right here (similar to context switch).
+	 */
+	pkru_write_default();
+}
+
 void flush_thread(void)
 {
 	struct task_struct *tsk = current;
@@ -207,6 +216,7 @@ void flush_thread(void)
 	memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
 
 	fpu_flush_thread();
+	pkru_flush_thread();
 }
 
 void disable_TSC(void)



* [patch 32/41] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (30 preceding siblings ...)
  2021-06-11 16:15 ` [patch 31/41] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 33/41] x86/fpu: Move FXSAVE_LEAK quirk into __copy_kernel_to_fpregs() Thomas Gleixner
                   ` (9 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Rename it so that it becomes entirely clear what this function is
about. Its purpose is to restore the FPU registers to the state which was
saved in the task's FPU memory state either at context switch or by an
in-kernel FPU user.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/fpu/internal.h |    6 ++----
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |    2 +-
 3 files changed, 4 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -485,10 +485,8 @@ static inline void fpregs_activate(struc
 	trace_x86_fpu_regs_activated(fpu);
 }
 
-/*
- * Internal helper, do not use directly. Use switch_fpu_return() instead.
- */
-static inline void __fpregs_load_activate(void)
+/* Internal helper for switch_fpu_return() and signal frame setup */
+static inline void fpregs_restore_userregs(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 	int cpu = smp_processor_id();
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -472,7 +472,7 @@ void switch_fpu_return(void)
 	if (!static_cpu_has(X86_FEATURE_FPU))
 		return;
 
-	__fpregs_load_activate();
+	fpregs_restore_userregs();
 }
 EXPORT_SYMBOL_GPL(switch_fpu_return);
 
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -188,7 +188,7 @@ int copy_fpstate_to_sigframe(void __user
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
-		__fpregs_load_activate();
+		fpregs_restore_userregs();
 
 	pagefault_disable();
 	ret = copy_fpregs_to_sigframe(buf_fx);



* [patch 33/41] x86/fpu: Move FXSAVE_LEAK quirk into __copy_kernel_to_fpregs()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (31 preceding siblings ...)
  2021-06-11 16:15 ` [patch 32/41] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 34/41] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
                   ` (8 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

copy_kernel_to_fpregs() restores all xfeatures but it is also the place
where the AMD FXSAVE_LEAK bug is handled.

That prevents fpregs_restore_userregs() from limiting the restored
features, which is required to disentangle PKRU and XSTATE handling and
also for the upcoming supervisor state management.

Move the FXSAVE_LEAK quirk into __copy_kernel_to_fpregs() and uninline that
function, which has become rather fat.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/fpu/internal.h |   25 +------------------------
 arch/x86/kernel/fpu/core.c          |   26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 24 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -399,33 +399,10 @@ static inline int xrstor_from_kernel_err
 extern void save_fpregs_to_fpstate(struct fpu *fpu);
 extern void copy_fpregs_to_fpstate(struct fpu *fpu);
 
-static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
-{
-	if (use_xsave()) {
-		xrstor_from_kernel(&fpstate->xsave, mask);
-	} else {
-		if (use_fxsr())
-			fxrstor_from_kernel(&fpstate->fxsave);
-		else
-			frstor_from_kernel(&fpstate->fsave);
-	}
-}
+extern void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask);
 
 static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
-	/*
-	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
-	 * pending. Clear the x87 state here by setting it to fixed values.
-	 * "m" is a random variable that should be in L1.
-	 */
-	if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) {
-		asm volatile(
-			"fnclex\n\t"
-			"emms\n\t"
-			"fildl %P[addr]"	/* set F?P to defined value */
-			: : [addr] "m" (fpstate));
-	}
-
 	__restore_fpregs_from_fpstate(fpstate, -1);
 }
 
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -145,6 +145,32 @@ void copy_fpregs_to_fpstate(struct fpu *
 		restore_fpregs_from_fpstate(&fpu->state);
 }
 
+void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
+{
+	/*
+	 * AMD K7/K8 CPUs don't save/restore FDP/FIP/FOP unless an exception is
+	 * pending. Clear the x87 state here by setting it to fixed values.
+	 * "m" is a random variable that should be in L1.
+	 */
+	if (unlikely(static_cpu_has_bug(X86_BUG_FXSAVE_LEAK))) {
+		asm volatile(
+			"fnclex\n\t"
+			"emms\n\t"
+			"fildl %P[addr]"	/* set F?P to defined value */
+			: : [addr] "m" (fpstate));
+	}
+
+	if (use_xsave()) {
+		xrstor_from_kernel(&fpstate->xsave, mask);
+	} else {
+		if (use_fxsr())
+			fxrstor_from_kernel(&fpstate->fxsave);
+		else
+			frstor_from_kernel(&fpstate->fsave);
+	}
+}
+EXPORT_SYMBOL_GPL(__restore_fpregs_from_fpstate);
+
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
 	preempt_disable();



* [patch 34/41] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (32 preceding siblings ...)
  2021-06-11 16:15 ` [patch 33/41] x86/fpu: Move FXSAVE_LEAK quirk into __copy_kernel_to_fpregs() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 35/41] x86/fpu: Don't restore PKRU in fpregs_restore_userspace() Thomas Gleixner
                   ` (7 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

Rename it so it's clear that this is about the user ABI features, which can
differ from the feature set which the kernel saves and restores internally
because the kernel handles e.g. PKRU differently. The user ABI (ptrace,
signal frame) still expects these features to be there.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/fpu/internal.h |    7 ++++++-
 arch/x86/include/asm/fpu/xstate.h   |    6 +++++-
 arch/x86/kernel/fpu/core.c          |    2 +-
 arch/x86/kernel/fpu/signal.c        |   10 +++++-----
 arch/x86/kernel/fpu/xstate.c        |   12 ++++++------
 5 files changed, 23 insertions(+), 14 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -341,7 +341,12 @@ static inline void xrstor_from_kernel(st
  */
 static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
-	u64 mask = xfeatures_mask_user();
+	/*
+	 * Include the features which are not xsaved/rstored by the kernel
+	 * internally, e.g. PKRU. That's user space ABI and also required
+	 * to allow the signal handler to modify PKRU.
+	 */
+	u64 mask = xfeatures_mask_uabi();
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -83,7 +83,11 @@ static inline u64 xfeatures_mask_supervi
 	return xfeatures_mask_all & XFEATURE_MASK_SUPERVISOR_SUPPORTED;
 }
 
-static inline u64 xfeatures_mask_user(void)
+/*
+ * The xfeatures which are enabled in XCR0 and expected to be in ptrace
+ * buffers and signal frames.
+ */
+static inline u64 xfeatures_mask_uabi(void)
 {
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -471,7 +471,7 @@ void fpu__clear_user_states(struct fpu *
 	}
 
 	/* Reset user states in registers. */
-	load_fpregs_from_init_fpstate(xfeatures_mask_user());
+	load_fpregs_from_init_fpstate(xfeatures_mask_uabi());
 
 	/*
 	 * Now all FPU registers have their desired values.  Inform the FPU
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -267,14 +267,14 @@ static int copy_user_to_fpregs_zeroing(v
 
 	if (use_xsave()) {
 		if (fx_only) {
-			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
 
 			r = fxrstor_from_user_sigframe(buf);
 			if (!r)
 				xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 			return r;
 		} else {
-			init_bv = xfeatures_mask_user() & ~xbv;
+			init_bv = xfeatures_mask_uabi() & ~xbv;
 
 			r = xrstor_from_user_sigframe(buf, xbv);
 			if (!r && unlikely(init_bv))
@@ -429,7 +429,7 @@ static int __fpu__restore_sig(void __use
 	fpregs_unlock();
 
 	if (use_xsave() && !fx_only) {
-		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
+		u64 init_bv = xfeatures_mask_uabi() & ~user_xfeatures;
 
 		ret = copy_sigframe_from_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
@@ -463,7 +463,7 @@ static int __fpu__restore_sig(void __use
 		if (use_xsave()) {
 			u64 init_bv;
 
-			init_bv = xfeatures_mask_user() & ~XFEATURE_MASK_FPSSE;
+			init_bv = xfeatures_mask_uabi() & ~XFEATURE_MASK_FPSSE;
 			xrstor_from_kernel(&init_fpstate.xsave, init_bv);
 		}
 
@@ -558,7 +558,7 @@ void fpu__init_prepare_fx_sw_frame(void)
 
 	fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
 	fx_sw_reserved.extended_size = size;
-	fx_sw_reserved.xfeatures = xfeatures_mask_user();
+	fx_sw_reserved.xfeatures = xfeatures_mask_uabi();
 	fx_sw_reserved.xstate_size = fpu_user_xstate_size;
 
 	if (IS_ENABLED(CONFIG_IA32_EMULATION) ||
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -155,7 +155,7 @@ void fpu__init_cpu_xstate(void)
 	 * managed by XSAVE{C, OPT, S} and XRSTOR{S}.  Only XSAVE user
 	 * states can be set here.
 	 */
-	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_uabi());
 
 	/*
 	 * MSR_IA32_XSS sets supervisor states managed by XSAVES.
@@ -443,7 +443,7 @@ int using_compacted_format(void)
 static int validate_user_xstate_header(const struct xstate_header *hdr)
 {
 	/* No unknown or supervisor features may be set */
-	if (hdr->xfeatures & ~xfeatures_mask_user())
+	if (hdr->xfeatures & ~xfeatures_mask_uabi())
 		return -EINVAL;
 
 	/* Userspace must use the uncompacted format */
@@ -745,7 +745,7 @@ void __init fpu__init_system_xstate(void
 	cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
 	xfeatures_mask_all |= ecx + ((u64)edx << 32);
 
-	if ((xfeatures_mask_user() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
+	if ((xfeatures_mask_uabi() & XFEATURE_MASK_FPSSE) != XFEATURE_MASK_FPSSE) {
 		/*
 		 * This indicates that something really unexpected happened
 		 * with the enumeration.  Disable XSAVE and try to continue
@@ -776,7 +776,7 @@ void __init fpu__init_system_xstate(void
 	 * Update info used for ptrace frames; use standard-format size and no
 	 * supervisor xstates:
 	 */
-	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_user());
+	update_regset_xstate_info(fpu_user_xstate_size, xfeatures_mask_uabi());
 
 	fpu__init_prepare_fx_sw_frame();
 	setup_init_fpu_buf();
@@ -804,7 +804,7 @@ void fpu__resume_cpu(void)
 	 * Restore XCR0 on xsave capable CPUs:
 	 */
 	if (boot_cpu_has(X86_FEATURE_XSAVE))
-		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_user());
+		xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask_uabi());
 
 	/*
 	 * Restore IA32_XSS. The same CPUID bit enumerates support
@@ -981,7 +981,7 @@ void copy_uabi_xstate_to_membuf(struct m
 	 */
 	memset(&header, 0, sizeof(header));
 	header.xfeatures = xsave->header.xfeatures;
-	header.xfeatures &= xfeatures_mask_user();
+	header.xfeatures &= xfeatures_mask_uabi();
 
 	if (header.xfeatures & XFEATURE_MASK_FP)
 		copy_part(&to, &last, 0, off_mxcsr, &xsave->i387);



* [patch 35/41] x86/fpu: Don't restore PKRU in fpregs_restore_userspace()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (33 preceding siblings ...)
  2021-06-11 16:15 ` [patch 34/41] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:15 ` [patch 36/41] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
                   ` (6 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

switch_to() and flush_thread() write the task's PKRU value eagerly, so the
PKRU value of current is always valid in the hardware.

That means there is no point in restoring PKRU on exit to user or when
reactivating the task's FPU registers in the signal frame setup path.

This allows removing all the xstate buffer updates with PKRU values once
the PKRU state is stored in the thread struct while a task is scheduled
out.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   12 +++++++++++-
 arch/x86/include/asm/fpu/xstate.h   |   19 +++++++++++++++++++
 arch/x86/kernel/fpu/core.c          |    2 +-
 3 files changed, 31 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -477,7 +477,17 @@ static inline void fpregs_restore_userre
 		return;
 
 	if (!fpregs_state_valid(fpu, cpu)) {
-		restore_fpregs_from_fpstate(&fpu->state);
+		/*
+		 * This restores _all_ xstate which has not been
+		 * established yet.
+		 *
+		 * If PKRU is enabled, then the PKRU value is already
+		 * correct because it was either set in switch_to() or in
+		 * flush_thread(). So it is excluded because it might not
+		 * be up to date in current->thread.fpu.xsave state.
+		 */
+		__restore_fpregs_from_fpstate(&fpu->state,
+					      xfeatures_mask_restore_user());
 		fpregs_activate(fpu);
 		fpu->last_cpu = cpu;
 	}
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -35,6 +35,14 @@
 				      XFEATURE_MASK_BNDREGS | \
 				      XFEATURE_MASK_BNDCSR)
 
+/*
+ * Features which are restored when returning to user space.
+ * PKRU is not restored on return to user space because PKRU
+ * is switched eagerly in switch_to() and flush_thread()
+ */
+#define XFEATURE_MASK_USER_RESTORE	\
+	(XFEATURE_MASK_USER_SUPPORTED & ~XFEATURE_MASK_PKRU)
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID)
 
@@ -92,6 +100,17 @@ static inline u64 xfeatures_mask_uabi(vo
 	return xfeatures_mask_all & XFEATURE_MASK_USER_SUPPORTED;
 }
 
+/*
+ * The xfeatures which are restored by the kernel when returning to user
+ * mode. This is not necessarily the same as xfeatures_mask_uabi() as the
+ * kernel does not manage all XCR0 enabled features via xsave/xrstor as
+ * some of them have to be switched eagerly on context switch and exec().
+ */
+static inline u64 xfeatures_mask_restore_user(void)
+{
+	return xfeatures_mask_all & XFEATURE_MASK_USER_RESTORE;
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -471,7 +471,7 @@ void fpu__clear_user_states(struct fpu *
 	}
 
 	/* Reset user states in registers. */
-	load_fpregs_from_init_fpstate(xfeatures_mask_uabi());
+	load_fpregs_from_init_fpstate(xfeatures_mask_restore_user());
 
 	/*
 	 * Now all FPU registers have their desired values.  Inform the FPU



* [patch 36/41] x86/fpu: Add PKRU storage outside of task XSAVE buffer
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (34 preceding siblings ...)
  2021-06-11 16:15 ` [patch 35/41] x86/fpu: Don't restore PKRU in fpregs_restore_userspace() Thomas Gleixner
@ 2021-06-11 16:15 ` Thomas Gleixner
  2021-06-11 16:16 ` [patch 37/41] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
                   ` (5 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:15 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

PKRU is currently partly XSAVE-managed and partly not.  It has space in the
task XSAVE buffer and is context-switched by XSAVE/XRSTOR.  However, it is
switched more eagerly than the rest of the FPU state because there may be a
need for PKRU to be up-to-date for things like copy_to/from_user() since
PKRU affects user-permission memory accesses, not just accesses from
userspace itself.

This leaves PKRU in a very odd position.  XSAVE brings very little value to
the table for how Linux uses PKRU, except for signal-related XSTATE
handling.

Prepare to move PKRU away from being XSAVE-managed.  Allocate space in the
thread_struct for it and save/restore it in the context-switch path
separately from the XSAVE-managed features. task->thread.pkru is only
valid when the task is scheduled out. For the current task the
authoritative source is the hardware, i.e. it has to be retrieved via
rdpkru().

Leave the XSAVE code in place for now to ensure bisectability.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Picked up from Dave's PKRU series and adapted it to the other changes.
---

 arch/x86/include/asm/processor.h |    9 +++++++++
 arch/x86/kernel/process.c        |    7 +++++++
 arch/x86/kernel/process_64.c     |   25 +++++++++++++++++++++++++
 3 files changed, 41 insertions(+)

--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -518,6 +518,15 @@ struct thread_struct {
 
 	unsigned int		sig_on_uaccess_err:1;
 
+	/*
+	 * Protection Keys Register for Userspace.  Loaded immediately on
+	 * context switch. Store it in thread_struct to avoid a lookup in
+	 * the task's FPU xstate buffer. This value is only valid when a
+	 * task is scheduled out. For 'current' the authoritative source of
+	 * PKRU is the hardware itself.
+	 */
+	u32			pkru;
+
 	/* Floating point and extended processor state */
 	struct fpu		fpu;
 	/*
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -157,11 +157,18 @@ int copy_thread(unsigned long clone_flag
 
 	/* Kernel thread ? */
 	if (unlikely(p->flags & PF_KTHREAD)) {
+		p->thread.pkru = pkru_get_init_value();
 		memset(childregs, 0, sizeof(struct pt_regs));
 		kthread_frame_init(frame, sp, arg);
 		return 0;
 	}
 
+	/*
+	 * Clone current's PKRU value from hardware. tsk->thread.pkru
+	 * is only valid when scheduled out.
+	 */
+	p->thread.pkru = rdpkru();
+
 	frame->bx = 0;
 	*childregs = *current_pt_regs();
 	childregs->ax = 0;
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -340,6 +340,29 @@ static __always_inline void load_seg_leg
 	}
 }
 
+/*
+ * Store prev's PKRU value and load next's PKRU value if they differ. PKRU
+ * is not XSTATE managed on context switch because that would require a
+ * lookup in the task's FPU xsave buffer and would require keeping that
+ * buffer up to date in various places.
+ */
+static __always_inline void x86_pkru_load(struct thread_struct *prev,
+					  struct thread_struct *next)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	/* Stash the prev task's value: */
+	prev->pkru = rdpkru();
+
+	/*
+	 * PKRU writes are slightly expensive.  Avoid them when not
+	 * strictly necessary:
+	 */
+	if (prev->pkru != next->pkru)
+		wrpkru(next->pkru);
+}
+
 static __always_inline void x86_fsgsbase_load(struct thread_struct *prev,
 					      struct thread_struct *next)
 {
@@ -589,6 +612,8 @@ void compat_start_thread(struct pt_regs
 
 	x86_fsgsbase_load(prev, next);
 
+	x86_pkru_load(prev, next);
+
 	/*
 	 * Switch the PDA and FPU contexts.
 	 */


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 37/41] x86/fpu: Hook up PKRU into ptrace()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (35 preceding siblings ...)
  2021-06-11 16:15 ` [patch 36/41] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
@ 2021-06-11 16:16 ` Thomas Gleixner
  2021-06-11 16:16 ` [patch 38/41] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
                   ` (4 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:16 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

From: Dave Hansen <dave.hansen@linux.intel.com>

One nice thing about having PKRU be XSAVE-managed is that it gets naturally
exposed into the XSAVE-using ABIs.  Now that XSAVE will not be used to
manage PKRU, these ABIs need to be manually enabled to deal with PKRU.

ptrace() uses copy_uabi_xstate_to_membuf() to collect the tracee's
XSTATE. As PKRU is not in the task's XSTATE buffer, use task->thread.pkru
to fill in the PKRU portion of the ptrace buffer.
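A rough sketch of that substitution (the struct and the component number
macro are made up for illustration; the architectural PKRU component number
is assumed to be 9):

```c
#include <assert.h>

/*
 * Hypothetical sketch, not the kernel implementation: when filling the
 * uabi buffer, the PKRU component is sourced from the per-thread copy
 * instead of the stale slot in the xsave buffer.
 */

#define SIM_XFEATURE_PKRU 9	/* architectural component number, assumed */

struct task_sim {
	unsigned int thread_pkru;	/* thread_struct.pkru */
	unsigned int xsave_pkru;	/* stale copy in the xsave buffer */
};

static unsigned int sim_component_value(const struct task_sim *t, int nr)
{
	if (nr == SIM_XFEATURE_PKRU)
		return t->thread_pkru;	/* valid while scheduled out */
	return t->xsave_pkru;		/* other components come from xsave */
}
```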

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---

V4: Picked up from Dave's PKRU series, adapted to the rename, and dropped
    the copy_user_to_fpregs_zeroing() part because task->thread.pkru is only
    valid when the task is scheduled out. For current the authoritative
    source is the hardware.

---
 arch/x86/include/asm/fpu/xstate.h |    2 -
 arch/x86/kernel/fpu/regset.c      |   12 ++++-------
 arch/x86/kernel/fpu/xstate.c      |   40 ++++++++++++++++++++++++++------------
 3 files changed, 34 insertions(+), 20 deletions(-)

--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -128,7 +128,7 @@ void *get_xsave_addr(struct xregs_state
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
 struct membuf;
-void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave);
+void copy_uabi_xstate_to_membuf(struct membuf to, struct task_struct *target);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
 int copy_sigframe_from_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf);
 
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -109,22 +109,20 @@ int xfpregs_set(struct task_struct *targ
 }
 
 int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
-		struct membuf to)
+		   struct membuf to)
 {
-	struct fpu *fpu = &target->thread.fpu;
-
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
 		return -ENODEV;
 
-	fpu__prepare_read(fpu);
+	fpu__prepare_read(&target->thread.fpu);
 
-	copy_uabi_xstate_to_membuf(to, &fpu->state.xsave);
+	copy_uabi_xstate_to_membuf(to, target);
 	return 0;
 }
 
 int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
-		  unsigned int pos, unsigned int count,
-		  const void *kbuf, const void __user *ubuf)
+		   unsigned int pos, unsigned int count,
+		   const void *kbuf, const void __user *ubuf)
 {
 	struct fpu *fpu = &target->thread.fpu;
 	struct xregs_state *tmpbuf = NULL;
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -968,12 +968,13 @@ static void copy_part(struct membuf *to,
  * It supports partial copy but pos always starts from zero. This is called
  * from xstateregs_get() and there we check the CPU has XSAVE.
  */
-void copy_uabi_xstate_to_membuf(struct membuf to, struct xregs_state *xsave)
+void copy_uabi_xstate_to_membuf(struct membuf to, struct task_struct *target)
 {
-	struct xstate_header header;
 	const unsigned off_mxcsr = offsetof(struct fxregs_state, mxcsr);
-	unsigned size = to.left;
-	unsigned last = 0;
+	struct xregs_state *xsave = &target->thread.fpu.state.xsave;
+	struct xstate_header header;
+	unsigned int size = to.left;
+	unsigned int last = 0;
 	int i;
 
 	/*
@@ -983,6 +984,13 @@ void copy_uabi_xstate_to_membuf(struct m
 	header.xfeatures = xsave->header.xfeatures;
 	header.xfeatures &= xfeatures_mask_uabi();
 
+	/*
+	 * PKRU is not XSTATE managed. If enabled, set the xfeature bit as
+	 * user space expects it to be part of the XSTATE buffer.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_OSPKE))
+		header.xfeatures |= XFEATURE_MASK_PKRU;
+
 	if (header.xfeatures & XFEATURE_MASK_FP)
 		copy_part(&to, &last, 0, off_mxcsr, &xsave->i387);
 	if (header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM))
@@ -1006,16 +1014,24 @@ void copy_uabi_xstate_to_membuf(struct m
 		  sizeof(header), &header);
 
 	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
-		/*
-		 * Copy only in-use xstates:
-		 */
-		if ((header.xfeatures >> i) & 1) {
-			void *src = __raw_xsave_addr(xsave, i);
+		struct pkru_state pkru = {0};
+		void *src;
 
-			copy_part(&to, &last, xstate_offsets[i],
-				  xstate_sizes[i], src);
-		}
+		/* Copy only in-use xstates: */
+		if (!((header.xfeatures >> i) & 1))
+			continue;
 
+		if (i == XFEATURE_PKRU) {
+			/*
+			 * PKRU is not kept in the thread's XSAVE buffer.
+			 * Fill this part from the per-thread storage.
+			 */
+			pkru.pkru = target->thread.pkru;
+			src = &pkru;
+		} else {
+			src = __raw_xsave_addr(xsave, i);
+		}
+		copy_part(&to, &last, xstate_offsets[i], xstate_sizes[i], src);
 	}
 	fill_gap(&to, &last, size);
 }


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 38/41] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (36 preceding siblings ...)
  2021-06-11 16:16 ` [patch 37/41] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
@ 2021-06-11 16:16 ` Thomas Gleixner
  2021-06-11 16:16 ` [patch 39/41] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
                   ` (3 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:16 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

As the PKRU state is managed separately, restoring it from the xstate
buffer would be counterproductive: it might either restore a stale value
or reinitialize the PKRU state to 0.
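As a rough illustration of the masking (all mask values below except the
PKRU bit position are made up for the sketch): XRSTOR only touches
components whose bits are set in the requested-feature bitmap, so building
the restore mask without the PKRU bit keeps the instruction away from the
eagerly managed register.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated feature masks; PKRU is architecturally component 9. */
#define SIM_MASK_PKRU		(1ULL << 9)
#define SIM_MASK_USER_RESTORE	0x7ULL		/* e.g. FP | SSE | YMM, assumed */
#define SIM_MASK_SUPERVISOR	(1ULL << 10)	/* made-up supervisor bit */

/* Analogous to xfeatures_mask_fpstate(): user + supervisor, no PKRU. */
static uint64_t sim_xfeatures_mask_fpstate(uint64_t mask_all)
{
	return mask_all & (SIM_MASK_USER_RESTORE | SIM_MASK_SUPERVISOR);
}
```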

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/fpu/internal.h |    4 ++--
 arch/x86/include/asm/fpu/xstate.h   |   10 ++++++++++
 arch/x86/kernel/fpu/xstate.c        |    1 +
 arch/x86/mm/extable.c               |    2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -281,7 +281,7 @@ static inline void xsave_to_kernel_booti
  */
 static inline void xrstor_from_kernel_booting(struct xregs_state *xstate)
 {
-	u64 mask = -1;
+	u64 mask = xfeatures_mask_fpstate();
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 	int err;
@@ -408,7 +408,7 @@ extern void __restore_fpregs_from_fpstat
 
 static inline void restore_fpregs_from_fpstate(union fpregs_state *fpstate)
 {
-	__restore_fpregs_from_fpstate(fpstate, -1);
+	__restore_fpregs_from_fpstate(fpstate, xfeatures_mask_fpstate());
 }
 
 extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size);
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -111,6 +111,16 @@ static inline u64 xfeatures_mask_restore
 	return xfeatures_mask_all & XFEATURE_MASK_USER_RESTORE;
 }
 
+/*
+ * Like xfeatures_mask_restore_user() but additionally restores the
+ * supported supervisor states.
+ */
+static inline u64 xfeatures_mask_fpstate(void)
+{
+	return xfeatures_mask_all & \
+		(XFEATURE_MASK_USER_RESTORE | XFEATURE_MASK_SUPERVISOR_SUPPORTED);
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!boot_cpu_has(X86_FEATURE_ARCH_LBR))
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -60,6 +60,7 @@ static short xsave_cpuid_features[] __in
  * XSAVE buffer, both supervisor and user xstates.
  */
 u64 xfeatures_mask_all __ro_after_init;
+EXPORT_SYMBOL_GPL(xfeatures_mask_all);
 
 static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -65,7 +65,7 @@ EXPORT_SYMBOL_GPL(ex_handler_fault);
 	WARN_ONCE(1, "Bad FPU state detected at %pB, reinitializing FPU registers.",
 		  (void *)instruction_pointer(regs));
 
-	__restore_fpregs_from_fpstate(&init_fpstate, -1);
+	__restore_fpregs_from_fpstate(&init_fpstate, xfeatures_mask_fpstate());
 	return true;
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 39/41] x86/fpu: Remove PKRU handling from switch_fpu_finish()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (37 preceding siblings ...)
  2021-06-11 16:16 ` [patch 38/41] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
@ 2021-06-11 16:16 ` Thomas Gleixner
  2021-06-11 16:16 ` [patch 40/41] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
                   ` (2 subsequent siblings)
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:16 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

PKRU is already updated on context switch and the xstate is no longer the
proper source of information.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch.
---
 arch/x86/include/asm/fpu/internal.h |   34 ++++------------------------------
 1 file changed, 4 insertions(+), 30 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -539,39 +539,13 @@ static inline void switch_fpu_prepare(st
  */
 
 /*
- * Load PKRU from the FPU context if available. Delay loading of the
- * complete FPU state until the return to userland.
+ * Delay loading of the complete FPU state until the return to userland.
+ * PKRU is handled separately.
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu)
 {
-	u32 pkru_val = init_pkru_value;
-	struct pkru_state *pk;
-
-	if (!static_cpu_has(X86_FEATURE_FPU))
-		return;
-
-	set_thread_flag(TIF_NEED_FPU_LOAD);
-
-	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
-		return;
-
-	/*
-	 * PKRU state is switched eagerly because it needs to be valid before we
-	 * return to userland e.g. for a copy_to_user() operation.
-	 */
-	if (!(current->flags & PF_KTHREAD)) {
-		/*
-		 * If the PKRU bit in xsave.header.xfeatures is not set,
-		 * then the PKRU component was in init state, which means
-		 * XRSTOR will set PKRU to 0. If the bit is not set then
-		 * get_xsave_addr() will return NULL because the PKRU value
-		 * in memory is not valid. This means pkru_val has to be
-		 * set to 0 and not to init_pkru_value.
-		 */
-		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
-		pkru_val = pk ? pk->pkru : 0;
-	}
-	__write_pkru(pkru_val);
+	if (static_cpu_has(X86_FEATURE_FPU))
+		set_thread_flag(TIF_NEED_FPU_LOAD);
 }
 
 #endif /* _ASM_X86_FPU_INTERNAL_H */


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 40/41] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (38 preceding siblings ...)
  2021-06-11 16:16 ` [patch 39/41] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
@ 2021-06-11 16:16 ` Thomas Gleixner
  2021-06-11 16:16 ` [patch 41/41] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
  2021-06-12  0:24 ` [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:16 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

PKRU for a task is stored in task->thread.pkru when the task is scheduled
out. For 'current' the authoritative source of PKRU is the hardware.

fpu_reset_fpstate() has two callers:

  1) fpu__clear_user_states() for !FPU systems. For those, PKRU is irrelevant

  2) fpu_flush_thread() which is invoked from flush_thread(). flush_thread()
     resets the hardware to the restrictive kernel default value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/kernel/fpu/core.c |   22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -403,23 +403,6 @@ static inline unsigned int init_fpstate_
 	return sizeof(init_fpstate.xsave);
 }
 
-/* Temporary workaround. Will be removed once PKRU and XSTATE are distangled. */
-static inline void pkru_set_default_in_xstate(struct xregs_state *xsave)
-{
-	struct pkru_state *pk;
-
-	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
-		return;
-	/*
-	 * Force XFEATURE_PKRU to be set in the header otherwise
-	 * get_xsave_addr() does not work and it also needs to be set to
-	 * make XRSTOR(S) load it.
-	 */
-	xsave->header.xfeatures |= XFEATURE_MASK_PKRU;
-	pk = get_xsave_addr(xsave, XFEATURE_PKRU);
-	pk->pkru = pkru_get_init_value();
-}
-
 /*
  * Reset current->fpu memory state to the init values.
  */
@@ -437,9 +420,12 @@ static void fpu_reset_fpstate(void)
 	 *
 	 * Do not use fpstate_init() here. Just copy init_fpstate which has
 	 * the correct content already except for PKRU.
+	 *
+	 * PKRU handling does not rely on the xstate when restoring for
+	 * user space as PKRU is eagerly written in switch_to() and
+	 * flush_thread().
 	 */
 	memcpy(&fpu->state, &init_fpstate, init_fpstate_copy_size());
-	pkru_set_default_in_xstate(&fpu->state.xsave);
 	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpregs_unlock();
 }


^ permalink raw reply	[flat|nested] 69+ messages in thread

* [patch 41/41] x86/pkru: Remove xstate fiddling from write_pkru()
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (39 preceding siblings ...)
  2021-06-11 16:16 ` [patch 40/41] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
@ 2021-06-11 16:16 ` Thomas Gleixner
  2021-06-12  0:24 ` [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  41 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 16:16 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

The PKRU value of a task is stored in task->thread.pkru when the task is
scheduled out, and PKRU is restored from there on schedule-in. Keeping the
XSAVE buffer up to date as well is therefore a pointless exercise.
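A simulation of the write-avoidance in the reworked write_pkru() (the
counter and helper names are made up; a counter stands in for the actual
instruction cost): WRPKRU is noticeably more expensive than RDPKRU, so the
write is skipped when the register already holds the requested value.

```c
#include <assert.h>

static unsigned int sim_hw_pkru, sim_wr_count;

static unsigned int sim_rdpkru(void) { return sim_hw_pkru; }
static void sim_wrpkru(unsigned int v) { sim_hw_pkru = v; sim_wr_count++; }

/* Analogous to the reworked write_pkru(). */
static void sim_write_pkru(unsigned int pkru)
{
	if (pkru != sim_rdpkru())	/* avoid a redundant WRPKRU */
		sim_wrpkru(pkru);
}
```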

Remove the xstate fiddling and cleanup all related functions.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: New patch
---
 arch/x86/include/asm/pkru.h          |   17 ++++-------------
 arch/x86/include/asm/special_insns.h |   14 +-------------
 arch/x86/kvm/x86.c                   |    4 ++--
 3 files changed, 7 insertions(+), 28 deletions(-)

--- a/arch/x86/include/asm/pkru.h
+++ b/arch/x86/include/asm/pkru.h
@@ -41,23 +41,14 @@ static inline u32 read_pkru(void)
 
 static inline void write_pkru(u32 pkru)
 {
-	struct pkru_state *pk;
-
 	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
-
-	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
-
 	/*
-	 * The PKRU value in xstate needs to be in sync with the value that is
-	 * written to the CPU. The FPU restore on return to userland would
-	 * otherwise load the previous value again.
+	 * WRPKRU is relatively expensive compared to RDPKRU.
+	 * Avoid WRPKRU when it would not change the value.
 	 */
-	fpregs_lock();
-	if (pk)
-		pk->pkru = pkru;
-	__write_pkru(pkru);
-	fpregs_unlock();
+	if (pkru != rdpkru())
+		wrpkru(pkru);
 }
 
 static inline void pkru_write_default(void)
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -104,25 +104,13 @@ static inline void wrpkru(u32 pkru)
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
 
-static inline void __write_pkru(u32 pkru)
-{
-	/*
-	 * WRPKRU is relatively expensive compared to RDPKRU.
-	 * Avoid WRPKRU when it would not change the value.
-	 */
-	if (pkru == rdpkru())
-		return;
-
-	wrpkru(pkru);
-}
-
 #else
 static inline u32 rdpkru(void)
 {
 	return 0;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void wrpkru(u32 pkru)
 {
 }
 #endif
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -943,7 +943,7 @@ void kvm_load_guest_xsave_state(struct k
 	    (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
 	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
 	    vcpu->arch.pkru != vcpu->arch.host_pkru)
-		__write_pkru(vcpu->arch.pkru);
+		write_pkru(vcpu->arch.pkru);
 }
 EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
 
@@ -957,7 +957,7 @@ void kvm_load_host_xsave_state(struct kv
 	     (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
 		vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vcpu->arch.host_pkru)
-			__write_pkru(vcpu->arch.host_pkru);
+			write_pkru(vcpu->arch.host_pkru);
 	}
 
 	if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 01/41] Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")
  2021-06-11 16:15 ` [patch 01/41] Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
@ 2021-06-11 17:04   ` Borislav Petkov
  0 siblings, 0 replies; 69+ messages in thread
From: Borislav Petkov @ 2021-06-11 17:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11, 2021 at 06:15:24PM +0200, Thomas Gleixner wrote:
> This cannot work and it's unclear how that ever made a difference.


Subject: Re: [patch 01/41] Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate")

Add subject prefix pls:

x86/pkeys: Revert...

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 02/41] x86/fpu: Mark various FPU states __ro_after_init
  2021-06-11 16:15 ` [patch 02/41] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
@ 2021-06-11 17:21   ` Borislav Petkov
  2021-06-11 18:35   ` Andy Lutomirski
  1 sibling, 0 replies; 69+ messages in thread
From: Borislav Petkov @ 2021-06-11 17:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11, 2021 at 06:15:25PM +0200, Thomas Gleixner wrote:
> Nothing modifies these after booting.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/kernel/fpu/init.c   |    4 ++--
>  arch/x86/kernel/fpu/xstate.c |   16 ++++++++++------
>  2 files changed, 12 insertions(+), 8 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set()
  2021-06-11 16:15 ` [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
@ 2021-06-11 18:15   ` Borislav Petkov
  2021-06-11 18:37   ` Andy Lutomirski
  1 sibling, 0 replies; 69+ messages in thread
From: Borislav Petkov @ 2021-06-11 18:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11, 2021 at 06:15:28PM +0200, Thomas Gleixner wrote:
> If the count argument is larger than the xstate size, this will happily
> copy beyond the end of xstate.
> 
> Fixes: 91c3dba7dbc1 ("x86/fpu/xstate: Fix PTRACE frames for XSAVES")
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/kernel/fpu/regset.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/arch/x86/kernel/fpu/regset.c
> +++ b/arch/x86/kernel/fpu/regset.c
> @@ -117,7 +117,7 @@ int xstateregs_set(struct task_struct *t
>  	/*
>  	 * A whole standard-format XSAVE buffer is needed:
>  	 */
> -	if ((pos != 0) || (count < fpu_user_xstate_size))
> +	if (pos != 0 || count != fpu_user_xstate_size)
>  		return -EFAULT;
>  
>  	xsave = &fpu->state.xsave;

Patches 3-5:

Reviewed-by: Borislav Petkov <bp@suse.de>


^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 02/41] x86/fpu: Mark various FPU states __ro_after_init
  2021-06-11 16:15 ` [patch 02/41] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
  2021-06-11 17:21   ` Borislav Petkov
@ 2021-06-11 18:35   ` Andy Lutomirski
  1 sibling, 0 replies; 69+ messages in thread
From: Andy Lutomirski @ 2021-06-11 18:35 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 9:15 AM, Thomas Gleixner wrote:
> Nothing modifies these after booting.
> 

Reviewed-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 03/41] x86/fpu: Remove unused get_xsave_field_ptr()
  2021-06-11 16:15 ` [patch 03/41] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
@ 2021-06-11 18:35   ` Andy Lutomirski
  0 siblings, 0 replies; 69+ messages in thread
From: Andy Lutomirski @ 2021-06-11 18:35 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 9:15 AM, Thomas Gleixner wrote:

Reviewed-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set()
  2021-06-11 16:15 ` [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
  2021-06-11 18:15   ` Borislav Petkov
@ 2021-06-11 18:37   ` Andy Lutomirski
  2021-06-11 19:37     ` Thomas Gleixner
  1 sibling, 1 reply; 69+ messages in thread
From: Andy Lutomirski @ 2021-06-11 18:37 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 9:15 AM, Thomas Gleixner wrote:
> If the count argument is larger than the xstate size, this will happily
> copy beyond the end of xstate.

Reviewed-by: Andy Lutomirski <luto@kernel.org>

This interface is horrible.  Oh well.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 06/41] x86/fpu: Sanitize xstateregs_set()
  2021-06-11 16:15 ` [patch 06/41] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
@ 2021-06-11 18:45   ` Andy Lutomirski
  2021-06-11 20:23     ` Thomas Gleixner
  0 siblings, 1 reply; 69+ messages in thread
From: Andy Lutomirski @ 2021-06-11 18:45 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 9:15 AM, Thomas Gleixner wrote:
> xstateregs_set() operates on a stopped task and tries to copy the provided
> buffer into the tasks fpu.state.xsave buffer.
> 
> Any error while copying or invalid state detected after copying results in
> wiping the target tasks FPU state completely including supervisor states.
> 
> That's just wrong. The caller supplied invalid data or has a problem with
> unmapped memory, so there is absolutely no justification to corrupt the
> target state.
> 
> @@ -1146,14 +1146,16 @@ int copy_kernel_to_xstate(struct xregs_s
>  	 */
>  	xsave->header.xfeatures |= hdr.xfeatures;
>  
> +	/* mxcsr reserved bits must be masked to zero for security reasons. */
> +	xsave->i387.mxcsr &= mxcsr_feature_mask;

This comment is vague.  At least it should say:

A subsequent XRSTOR(S) will fail if MXCSR has bits set that are not
accepted by the current CPU.  Mask out unsupported bits.

But a much nicer fix IMO would be to just return an error.

--Andy

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 07/41] x86/fpu: Simplify PTRACE_GETREGS code
  2021-06-11 16:15 ` [patch 07/41] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
@ 2021-06-11 18:47   ` Andy Lutomirski
  2021-06-12  9:13   ` Borislav Petkov
  1 sibling, 0 replies; 69+ messages in thread
From: Andy Lutomirski @ 2021-06-11 18:47 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 9:15 AM, Thomas Gleixner wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> ptrace() has interfaces that let a ptracer inspect a ptracee's register state.
> This includes XSAVE state.  The ptrace() ABI includes a hardware-format XSAVE
> buffer for both the SETREGS and GETREGS interfaces.

Reviewed-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 16:15 ` [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components Thomas Gleixner
@ 2021-06-11 19:03   ` Andy Lutomirski
  2021-06-11 19:18     ` Andy Lutomirski
                       ` (2 more replies)
  0 siblings, 3 replies; 69+ messages in thread
From: Andy Lutomirski @ 2021-06-11 19:03 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 9:15 AM, Thomas Gleixner wrote:
> xstateregs_get() does not longer use fpstate_sanitize_xstate() and the only

s/does not longer use/no longer uses/

> --- a/arch/x86/kernel/fpu/regset.c
> +++ b/arch/x86/kernel/fpu/regset.c
> @@ -11,6 +11,39 @@
>  
>  #include <linux/sched/task_stack.h>
>  
> +/*
> + * When executing XSAVEOPT (or other optimized XSAVE instructions), if

The kernel doesn't use XSAVEOPT any more.  How about:

When executing XSAVES (or other optimized XSAVE instructions)

> + * a processor implementation detects that an FPU state component is still
> + * (or is again) in its initialized state, it may clear the corresponding
> + * bit in the header.xfeatures field, and can skip the writeout of registers
> + * to the corresponding memory layout.

Additionally, copy_xxx_to_xstate() may result in an xsave buffer with a
bit clear in xfeatures but the corresponding state region not containing
the state's init value.

> + *
> + * This means that when the bit is zero, the state component might still
> + * contain some previous - non-initialized register state.

Maybe say what the function does, e.g.:

This function fills in the init values for the X87 and SSE states if the
corresponding xfeatures bits are clear.

> + *
> + * This is required for the legacy regset functions.
> + */
> +static void fpstate_sanitize_legacy(struct fpu *fpu)
> +{
> +	struct fxregs_state *fx = &fpu->state.fxsave;
> +	u64 xfeatures;
> +
> +	if (!use_xsaveopt())
> +		return;

This is confusing, since we never use xsaveopt.  It's also wrong -- see
above.  How about just removing it?

> +
> +	xfeatures = fpu->state.xsave.header.xfeatures;
> +
> +	/* If FP is in init state, reinitialize it */
> +	if (!(xfeatures & XFEATURE_MASK_FP)) {
> +		memset(fx, 0, sizeof(*fx));
> +		fx->cwd = 0x37f;
> +	}
> +
> +	/* If SSE is in init state, clear the storage */
> +	if (!(xfeatures & XFEATURE_MASK_SSE))
> +		memset(fx->xmm_space, 0, sizeof(fx->xmm_space));
> +}
> +
>  

Does this result in the mxcsr_mask and mxcsr fields being correct?
There is a silly number of special cases there.

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 19:03   ` Andy Lutomirski
@ 2021-06-11 19:18     ` Andy Lutomirski
  2021-06-11 20:33       ` Thomas Gleixner
  2021-06-11 20:27     ` Thomas Gleixner
  2021-06-11 22:12     ` Thomas Gleixner
  2 siblings, 1 reply; 69+ messages in thread
From: Andy Lutomirski @ 2021-06-11 19:18 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 12:03 PM, Andy Lutomirski wrote:
> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>> xstateregs_get() does not longer use fpstate_sanitize_xstate() and the only
> 
> s/does not longer use/no longer uses/
> 
>> --- a/arch/x86/kernel/fpu/regset.c
>> +++ b/arch/x86/kernel/fpu/regset.c
>> @@ -11,6 +11,39 @@
>>  
>>  #include <linux/sched/task_stack.h>
>>  
>> +/*
>> + * When executing XSAVEOPT (or other optimized XSAVE instructions), if
> 
> The kernel doesn't use XSAVEOPT any more.  How about:
> 
> When executing XSAVES (or other optimized XSAVE instructions)
> 
>> + * a processor implementation detects that an FPU state component is still
>> + * (or is again) in its initialized state, it may clear the corresponding
>> + * bit in the header.xfeatures field, and can skip the writeout of registers
>> + * to the corresponding memory layout.
> 
> Additionally, copy_xxx_to_xstate() may result in an xsave buffer with a
> bit clear in xfeatures but the corresponding state region not containing
> the state's init value.
> 
>> + *
>> + * This means that when the bit is zero, the state component might still
>> + * contain some previous - non-initialized register state.
> 
> Maybe say what the function does, e.g.:
> 
> This function fills in the init values for the X87 and SSE states if the
> corresponding xfeatures bits are clear.
> 
>> + *
>> + * This is required for the legacy regset functions.
>> + */
>> +static void fpstate_sanitize_legacy(struct fpu *fpu)
>> +{
>> +	struct fxregs_state *fx = &fpu->state.fxsave;
>> +	u64 xfeatures;
>> +
>> +	if (!use_xsaveopt())
>> +		return;
> 
> This is confusing, since we never use xsaveopt.  It's also wrong -- see
> above.  How about just removing it?
> 
>> +
>> +	xfeatures = fpu->state.xsave.header.xfeatures;
>> +
>> +	/* If FP is in init state, reinitialize it */
>> +	if (!(xfeatures & XFEATURE_MASK_FP)) {
>> +		memset(fx, 0, sizeof(*fx));
>> +		fx->cwd = 0x37f;
>> +	}
>> +
>> +	/* If SSE is in init state, clear the storage */
>> +	if (!(xfeatures & XFEATURE_MASK_SSE))
>> +		memset(fx->xmm_space, 0, sizeof(fx->xmm_space));
>> +}
>> +
>>  
> 
> Does this result in the mxcsr_mask and mxcsr fields being correct?
> There is a silly number of special cases there.
> 

The SDM says, in a footnote in XRSTOR:

There is an exception if RFBM[1] = 0 and RFBM[2] = 1. In this case, the
standard form of XRSTOR will load MXCSR from memory, even though MXCSR
is part of state component 1 — SSE. The compacted form of XRSTOR does
not make this exception.

which makes me think that this code has a bug.  Also, the code is
manifest nonsense for a different reason:

if (!FP)
  memset(the whole damn thing, 0);

if (!SSE)
  memset(just part of it, 0);

which means that, if the FP bit is clear but the SSE bit is set, this
thing will clobber the SSE state.

This code is garbage.  So is the architecture that gave rise to this code.
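For illustration, a sketch of a sanitize ordering that avoids the FP-clear clobbering described above. The struct layout and field names here are simplified stand-ins for the kernel's fxregs_state, and this deliberately ignores the MXCSR special cases:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XFEATURE_MASK_FP  (1ULL << 0)
#define XFEATURE_MASK_SSE (1ULL << 1)

/* Simplified stand-in for the kernel's fxregs_state (illustrative only) */
struct fxregs_state {
	uint16_t cwd;            /* x87 control word */
	uint8_t  st_space[128];  /* x87 register area */
	uint8_t  xmm_space[256]; /* SSE register area */
};

/*
 * Clear only the x87 fields when FP is in init state, so a set SSE
 * bit never gets its state wiped by the FP branch.
 */
static void sanitize_legacy(struct fxregs_state *fx, uint64_t xfeatures)
{
	if (!(xfeatures & XFEATURE_MASK_FP)) {
		memset(fx->st_space, 0, sizeof(fx->st_space));
		fx->cwd = 0x37f;	/* x87 init control word */
	}
	if (!(xfeatures & XFEATURE_MASK_SSE))
		memset(fx->xmm_space, 0, sizeof(fx->xmm_space));
}
```

With FP clear and SSE set, this leaves the SSE area untouched, unlike a memset() over the whole struct.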

--Andy


* Re: [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set()
  2021-06-11 18:37   ` Andy Lutomirski
@ 2021-06-11 19:37     ` Thomas Gleixner
  0 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 19:37 UTC (permalink / raw)
  To: Andy Lutomirski, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11 2021 at 11:37, Andy Lutomirski wrote:
> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>> If the count argument is larger than the xstate size, this will happily
>> copy beyond the end of xstate.
>
> Reviewed-by: Andy Lutomirski <luto@kernel.org>
>
> This interface is horrible.  Oh well.

It's ptrace, it must be horrible :)
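The overrun described in the changelog boils down to clamping the user-supplied size before copying. A minimal sketch with a hypothetical helper, not the kernel's actual code:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Clamp the user-supplied count so the copy can never run past the
 * end of the kernel's xstate buffer.
 */
static size_t clamp_xstate_copy(size_t count, size_t xstate_size)
{
	return count < xstate_size ? count : xstate_size;
}
```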


* Re: [patch 11/41] x86/fpu: Get rid of copy_supervisor_to_kernel()
  2021-06-11 16:15 ` [patch 11/41] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
@ 2021-06-11 19:42   ` Andy Lutomirski
  0 siblings, 0 replies; 69+ messages in thread
From: Andy Lutomirski @ 2021-06-11 19:42 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 9:15 AM, Thomas Gleixner wrote:
> If the fast path of restoring the FPU state on sigreturn fails or is not
> taken and the current task's FPU is active then the FPU has to be
> deactivated for the slow path to allow a safe update of the task's FPU
> memory state.
> 
> With supervisor states enabled, this requires saving the supervisor state
> in the memory state first. Supervisor states require XSAVES, so saving only
> the supervisor state requires reshuffling the memory buffer because XSAVES
> uses the compacted format and therefore stores the supervisor states at the
> beginning of the memory state. That's just an overengineered optimization.
> 
> Get rid of it and save the full state for this case.

Hallelujah.

Reviewed-by: Andy Lutomirski <luto@kernel.org>


* Re: [patch 06/41] x86/fpu: Sanitize xstateregs_set()
  2021-06-11 18:45   ` Andy Lutomirski
@ 2021-06-11 20:23     ` Thomas Gleixner
  0 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 20:23 UTC (permalink / raw)
  To: Andy Lutomirski, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11 2021 at 11:45, Andy Lutomirski wrote:
> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>> xstateregs_set() operates on a stopped task and tries to copy the provided
>> buffer into the task's fpu.state.xsave buffer.
>> 
>> Any error while copying or invalid state detected after copying results in
>> wiping the target task's FPU state completely, including supervisor states.
>> 
>> That's just wrong. The caller supplied invalid data or has a problem with
>> unmapped memory, so there is absolutely no justification to corrupt the
>> target state.
>> 
>> @@ -1146,14 +1146,16 @@ int copy_kernel_to_xstate(struct xregs_s
>>  	 */
>>  	xsave->header.xfeatures |= hdr.xfeatures;
>>  
>> +	/* mxcsr reserved bits must be masked to zero for security reasons. */
>> +	xsave->i387.mxcsr &= mxcsr_feature_mask;
>
> This comment is vague.  At least it should say:
>
> A subsequent XRSTOR(S) will fail if MXCSR has bits set that are not
> accepted by the current CPU.  Mask out unsupported bits.
>
> But a much nicer fix IMO would be to just return an error.

That was the "fix" for CVE-2003-0248, but I forgot the details of this
issue and why the masking was done instead of returning an error. The
problem was not that it caused a #GP; there was some other implication
which I can't remember.

The masking was done because the SDM was pretty vague about the reserved
bits: only newer processor generations had an issue, while older ones
just ignored them. And of course some existing user space was sloppy and
got away with random values in reserved bits, up to the point where
newer CPUs got more sensitive about this.

Yes, the comment could get some love and we probably can just error out
today. Those ancient binaries should be in the bit bucket by now.
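Erroring out, as suggested, could look like this sketch. This is a user-space stand-in: mxcsr_feature_mask is the mask the kernel reads back via FXSAVE, and the error constant is only illustrative:

```c
#include <assert.h>
#include <stdint.h>

#define MXCSR_RSVD_EINVAL (-22)		/* stand-in for -EINVAL */

/*
 * Reject reserved MXCSR bits instead of silently masking them.
 * A subsequent XRSTOR(S) would fail on bits outside mxcsr_feature_mask.
 */
static int validate_mxcsr(uint32_t mxcsr, uint32_t mxcsr_feature_mask)
{
	if (mxcsr & ~mxcsr_feature_mask)
		return MXCSR_RSVD_EINVAL;
	return 0;
}
```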

Thanks,

        tglx




* Re: [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 19:03   ` Andy Lutomirski
  2021-06-11 19:18     ` Andy Lutomirski
@ 2021-06-11 20:27     ` Thomas Gleixner
  2021-06-11 22:12     ` Thomas Gleixner
  2 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 20:27 UTC (permalink / raw)
  To: Andy Lutomirski, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11 2021 at 12:03, Andy Lutomirski wrote:
> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>> xstateregs_get() does not longer use fpstate_sanitize_xstate() and the only
>> +/*
>> + * When executing XSAVEOPT (or other optimized XSAVE instructions), if
>
> The kernel doesn't use XSAVEOPT any more.  How about:

You sure? 

#define XSTATE_XSAVE(st, lmask, hmask, err)				\
	asm volatile(ALTERNATIVE_2(XSAVE,				\
				   XSAVEOPT, X86_FEATURE_XSAVEOPT,	\
				   XSAVES,   X86_FEATURE_XSAVES)	\
Thanks,

        tglx




* Re: [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 19:18     ` Andy Lutomirski
@ 2021-06-11 20:33       ` Thomas Gleixner
  2021-06-11 20:34         ` Thomas Gleixner
  0 siblings, 1 reply; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 20:33 UTC (permalink / raw)
  To: Andy Lutomirski, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11 2021 at 12:18, Andy Lutomirski wrote:
> On 6/11/21 12:03 PM, Andy Lutomirski wrote:
>> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>> Does this result in the mxcsr_mask and mxcsr fields being correct?
>> There is a silly number of special cases there.

Let me stare at that.

> The SDM says, in a footnote in XRSTOR:
>
> There is an exception if RFBM[1] = 0 and RFBM[2] = 1. In this case, the
> standard form of XRSTOR will load MXCSR from memory, even though MXCSR
> is part of state component 1 — SSE. The compacted form of XRSTOR does
> not make this exception.
>
> which makes me think that this code has a bug.  Also, the code is
> manifest nonsense for a different reason:
>
> if (!FP)
>   memset(the whole damn thing, 0);
>
> if (!SSE)
>   memset(just part of it, 0);
>
> which means that, if the FP bit is clear but the SSE bit is set, this
> thing will clobber the SSE state.
>
> This code is garbage.  So is the architecture that gave rise to this code.

No argument about that :)


* Re: [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 20:33       ` Thomas Gleixner
@ 2021-06-11 20:34         ` Thomas Gleixner
  0 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 20:34 UTC (permalink / raw)
  To: Andy Lutomirski, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11 2021 at 22:33, Thomas Gleixner wrote:

> On Fri, Jun 11 2021 at 12:18, Andy Lutomirski wrote:
>> On 6/11/21 12:03 PM, Andy Lutomirski wrote:
>>> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>>> Does this result in the mxcsr_mask and mxcsr fields being correct?
>>> There is a silly number of special cases there.
>
> Let me stare at that.
>
>> The SDM says, in a footnote in XRSTOR:
>>
>> There is an exception if RFBM[1] = 0 and RFBM[2] = 1. In this case, the
>> standard form of XRSTOR will load MXCSR from memory, even though MXCSR
>> is part of state component 1 — SSE. The compacted form of XRSTOR does
>> not make this exception.
>>
>> which makes me think that this code has a bug.  Also, the code is
>> manifest nonsense for a different reason:
>>
>> if (!FP)
>>   memset(the whole damn thing, 0);

That's actually my bug. The code I replaced was different. Lemme stare
at it and fix it.

>> if (!SSE)
>>   memset(just part of it, 0);
>>
>> which means that, if the FP bit is clear but the SSE bit is set, this
>> thing will clobber the SSE state.
>>
>> This code is garbage.  So is the architecture that gave rise to this code.
>
> No argument about that :)


* Re: [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 19:03   ` Andy Lutomirski
  2021-06-11 19:18     ` Andy Lutomirski
  2021-06-11 20:27     ` Thomas Gleixner
@ 2021-06-11 22:12     ` Thomas Gleixner
  2021-06-12 13:15       ` Thomas Gleixner
  2021-06-12 22:05       ` Thomas Gleixner
  2 siblings, 2 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-11 22:12 UTC (permalink / raw)
  To: Andy Lutomirski, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11 2021 at 12:03, Andy Lutomirski wrote:
> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>> + *
>> + * This is required for the legacy regset functions.
>> + */
>> +static void fpstate_sanitize_legacy(struct fpu *fpu)
>> +{
>> +	struct fxregs_state *fx = &fpu->state.fxsave;
>> +	u64 xfeatures;
>> +
>> +	if (!use_xsaveopt())
>> +		return;
>
> This is confusing, since we never use xsaveopt.  It's also wrong -- see
> above.  How about just removing it?

We do, and this code is buggy because xsaves does not clear the component
storage either. Neither does xsavec, which we in fact do not use in the
kernel.

So here is how the different opcodes behave on a buffer filled with 0xff
when looking at the first four 64-bit words of the buffer after doing an
XRSTOR with all feature bits cleared:

Intel SKLX
 
XSAVE    000000000000037f 0000000000000000 0000000000000000 0000ffff00001f80
XSAVEOPT ffffffffffffffff ffffffffffffffff ffffffffffffffff 0000ffff00001f80
XSAVEC   ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
XSAVES   ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff

AMD ZEN2

XSAVE    000000000000037f 0000000000000000 0000000000000000 0002ffff00001f80
XSAVEOPT ffffffffffffffff ffffffffffffffff ffffffffffffffff 0002ffff00001f80
XSAVEC   ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
XSAVES   ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff

I verified that all saved buffers have xstate.header.xstate_bv == 0

So nothing about any of this is consistent and correct. But it magically
works for unknown reasons.

Thanks,

        tglx





* Re: [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing
  2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
                   ` (40 preceding siblings ...)
  2021-06-11 16:16 ` [patch 41/41] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
@ 2021-06-12  0:24 ` Thomas Gleixner
  2021-06-12  0:40   ` Dave Hansen
  2021-06-16 20:55   ` Dave Hansen
  41 siblings, 2 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-12  0:24 UTC (permalink / raw)
  To: LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11 2021 at 18:15, Thomas Gleixner wrote:
>   - Removal of PKRU from being XSTATE managed in the kernel because PKRU
>     has to be eagerly restored on context switch and keeping it in sync
>     in the xstate buffer is just pointless overhead and fragile.

Just before anyone comes up with any complaints about the resulting
inconsistency vs. xgetbv(1) in the case that the PKRU value is 0.

That inconsistency is simply an Intel-only hardware bug and there is no
way to ever get this consistent, no matter what kind of mechanism the
kernel uses. This inconsistency can be demonstrated in user space w/o
any kernel interaction.

The Intel SDM states in volume 1, chapter 13.6

 PROCESSOR TRACKING OF XSAVE-MANAGED STATE

  * PKRU state. PKRU state is in its initial configuration if the value
    of the PKRU is 0.

But that's just not true.

        wrpkru(0);
        assert(!(xgetbv(1) & XFEATURE_MASK_PKRU));

fails on Intel but not on AMD AFAIK. xgetbv(1) returns the 'INUSE'
bitmap of xstate-managed features.

But the Intel SDM is blurry about this:

  XINUSE denotes the state-component bitmap corresponding to the init
  optimization. If XINUSE[i] = 0, state component i is known to be in
  its initial configuration; otherwise XINUSE[i] = 1. It is possible for
  XINUSE[i] to be 1 even when state component i is in its initial
  configuration. On a processor that does not support the init
  optimization, XINUSE[i] is always 1 for every value of i.

IOW, no consistency between XINUSE and the initial state is guaranteed
at all. So why should the kernel worry about this?

We just use the most optimized way to deal with this and that's what
this patch series is doing by removing PKRU from xstate management in
the kernel.

If anyone cares about consistency of XINUSE vs. the actual component
state then please redirect the complaints to INTEL.

Either the hardware folks get their act together or software which
relies on consistency (cough, cough) like rr has to cope with it.

Making the kernel pretend that all of this is consistent under all
circumstances is a futile attempt to ignore reality.

This inconsistency can only be fixed in hardware/ucode. End of story.

Thanks,

        tglx


* Re: [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing
  2021-06-12  0:24 ` [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
@ 2021-06-12  0:40   ` Dave Hansen
  2021-06-16 20:55   ` Dave Hansen
  1 sibling, 0 replies; 69+ messages in thread
From: Dave Hansen @ 2021-06-12  0:40 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 5:24 PM, Thomas Gleixner wrote:
> The Intel SDM states in volume 1, chapter 13.6
> 
>  PROCESSOR TRACKING OF XSAVE-MANAGED STATE
> 
>   * PKRU state. PKRU state is in its initial configuration if the value
>     of the PKRU is 0.>
...
> IOW there is no consistency vs. XINUSE and initial state guaranteed at
> all. So why should the kernel worry about this?

This is my feeling on it as well.  I may be a bit corrupted by having
talked to the hardware folks who have built the init state "trackers",
but I've always had the opinion that XINUSE is very weakly defined.

There are some other bits of the SDM that you noted that clearly call
out that XINUSE is not strictly tied to the *value* of the component.


* Re: [patch 07/41] x86/fpu: Simplify PTRACE_GETREGS code
  2021-06-11 16:15 ` [patch 07/41] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
  2021-06-11 18:47   ` Andy Lutomirski
@ 2021-06-12  9:13   ` Borislav Petkov
  1 sibling, 0 replies; 69+ messages in thread
From: Borislav Petkov @ 2021-06-12  9:13 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11, 2021 at 06:15:30PM +0200, Thomas Gleixner wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> ptrace() has interfaces that let a ptracer inspect a ptracee's register state.
> This includes XSAVE state.  The ptrace() ABI includes a hardware-format XSAVE
> buffer for both the SETREGS and GETREGS interfaces.
> 
> In the old days, the kernel buffer and the ptrace() ABI buffer were the
> same boring non-compacted format.  But, since the advent of supervisor
> states and the compacted format, the kernel buffer has diverged from the
> format presented in the ABI.
> 
> This leads to two paths in the kernel:
> 1. Effectively a verbatim copy_to_user() which just copies the kernel buffer
>    out to userspace.  This is used when the kernel buffer is kept in the
>    non-compacted form which means that it shares a format with the ptrace
>    ABI.
> 2. A one-state-at-a-time path: copy_xstate_to_kernel().  This is theoretically
>    slower since it does a bunch of piecemeal copies.
> 
> Remove the verbatim copy case.  Speed probably does not matter in this path,
> and the vast majority of new hardware will use the one-state-at-a-time path
> anyway.  This ensures greater testing for the "slow" path.
> 
> This also makes enabling PKRU in this interface easier since a single path
> can be patched instead of two.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V4: Picked up from Dave's PKRU series
> ---
>  arch/x86/kernel/fpu/regset.c |   22 ++--------------------
>  arch/x86/kernel/fpu/xstate.c |    6 +++---
>  2 files changed, 5 insertions(+), 23 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 22:12     ` Thomas Gleixner
@ 2021-06-12 13:15       ` Thomas Gleixner
  2021-06-12 22:05       ` Thomas Gleixner
  1 sibling, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-12 13:15 UTC (permalink / raw)
  To: Andy Lutomirski, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Sat, Jun 12 2021 at 00:12, Thomas Gleixner wrote:
> On Fri, Jun 11 2021 at 12:03, Andy Lutomirski wrote:
>> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>>> + *
>>> + * This is required for the legacy regset functions.
>>> + */
>>> +static void fpstate_sanitize_legacy(struct fpu *fpu)
>>> +{
>>> +	struct fxregs_state *fx = &fpu->state.fxsave;
>>> +	u64 xfeatures;
>>> +
>>> +	if (!use_xsaveopt())
>>> +		return;
>>
>> This is confusing, since we never use xsaveopt.  It's also wrong -- see
>> above.  How about just removing it?
>
> We do and this code is buggy because xsaves does not clear the component
> storage either. Neither does xsavec which we in fact do not use in the
> kernel.

So that kinda works because the CPUs which have XSAVES have XSAVEOPT
too. But XSAVEOPT can be disabled on the command line, which then makes
that condition false...

Oh well.



* Re: [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components
  2021-06-11 22:12     ` Thomas Gleixner
  2021-06-12 13:15       ` Thomas Gleixner
@ 2021-06-12 22:05       ` Thomas Gleixner
  1 sibling, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-12 22:05 UTC (permalink / raw)
  To: Andy Lutomirski, LKML
  Cc: Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Sat, Jun 12 2021 at 00:12, Thomas Gleixner wrote:
> On Fri, Jun 11 2021 at 12:03, Andy Lutomirski wrote:
>> On 6/11/21 9:15 AM, Thomas Gleixner wrote:
>>> + *
>>> + * This is required for the legacy regset functions.
>>> + */
>>> +static void fpstate_sanitize_legacy(struct fpu *fpu)
>>> +{
>>> +	struct fxregs_state *fx = &fpu->state.fxsave;
>>> +	u64 xfeatures;
>>> +
>>> +	if (!use_xsaveopt())
>>> +		return;
>>
>> This is confusing, since we never use xsaveopt.  It's also wrong -- see
>> above.  How about just removing it?
>
> We do and this code is buggy because xsaves does not clear the component
> storage either. Neither does xsavec which we in fact do not use in the
> kernel.
>
> So here is how the different opcodes behave on a buffer filled with 0xff
> when looking at the first four 64-bit words of the buffer after doing an
> XRSTOR with all feature bits cleared:
>
> Intel SKLX
>  
> XSAVE    000000000000037f 0000000000000000 0000000000000000 0000ffff00001f80
> XSAVEOPT ffffffffffffffff ffffffffffffffff ffffffffffffffff 0000ffff00001f80
> XSAVEC   ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> XSAVES   ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
>
> AMD ZEN2
>
> XSAVE    000000000000037f 0000000000000000 0000000000000000 0002ffff00001f80
> XSAVEOPT ffffffffffffffff ffffffffffffffff ffffffffffffffff 0002ffff00001f80
> XSAVEC   ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
> XSAVES   ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
>
> I verified that all saved buffers have xstate.header.xstate_bv == 0
>
> So nothing about any of this is consistent and correct. But it magically
> works for unknown reasons.

What's even worse is the following. setup_init_fpu_buf() does:

	copy_kernel_to_xregs_booting(&init_fpstate.xsave);
	/*
	 * Dump the init state again. This is to identify the init state
	 * of any feature which is not represented by all zero's.
	 */
	copy_xregs_to_kernel_booting(&init_fpstate.xsave);

That comment is blatantly wrong with XSAVES/XRSTORS. init_fpstate is
initially all zeros and it stays that way with XRSTORS. Oh well.

And as the initialization values, at least for mxcsr_mask, differ on AMD
and Intel, hardcoding them is just wrong. Sigh...

So we could save the state with XSAVE into a different buffer and copy
the components to the correct place in the compacted init_fpstate, but
the only interesting part of all this with any variant of XSAVE is the
legacy part. Everything else is just zeros (except for HDC and HWP
states, which we do not use and which can be accessed via MSRs if the need
ever arises). We could just save the legacy part with fxsave and be done
with it.

Aside from that, this whole _booting() stuff is complete nonsense. It's
completely sufficient to XRSTOR from an all-zeroes buffer, which brings
every component into init state, and then do all the other muck _after_
alternatives have been patched. Absolutely nothing uses FPU muck before
that point.

While staring at all this I figured out why that sanitizing does not and
_cannot_ touch MXCSR and MXCSR_MASK.

MXCSR and MXCSR_MASK are located in the FP component storage and used by
SSE. But MXCSR is also XSAVEd when AVX is in use.

So the sanitizing cannot touch it without checking whether AVX is in
use. This is really all well thought out in hardware _AND_ software.
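That dependency can be expressed as: MXCSR may only be reset to its init value when neither SSE nor AVX is in use. A sketch with illustrative masks and a simplified struct, not the kernel's data structures:

```c
#include <assert.h>
#include <stdint.h>

#define XFEATURE_MASK_SSE (1ULL << 1)
#define XFEATURE_MASK_YMM (1ULL << 2)	/* AVX state component */
#define MXCSR_DEFAULT     0x1f80

struct legacy_state {
	uint32_t mxcsr;
};

/*
 * MXCSR lives in the FP/SSE legacy area but is also saved when AVX is
 * in use, so it may only be reinitialized when both SSE and AVX are in
 * init state.
 */
static void sanitize_mxcsr(struct legacy_state *st, uint64_t xfeatures)
{
	if (!(xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM)))
		st->mxcsr = MXCSR_DEFAULT;
}
```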

Let me try to beat some more sense into this trainwreck.

Thanks,

        tglx


* Re: [patch 09/41] x86/kvm: Avoid looking up PKRU in XSAVE buffer
  2021-06-11 16:15 ` [patch 09/41] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
@ 2021-06-14 10:26   ` Borislav Petkov
  2021-06-14 19:34     ` Dave Hansen
  0 siblings, 1 reply; 69+ messages in thread
From: Borislav Petkov @ 2021-06-14 10:26 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

On Fri, Jun 11, 2021 at 06:15:32PM +0200, Thomas Gleixner wrote:
> -		if (src) {
> -			u32 size, offset, ecx, edx;
> -			cpuid_count(XSTATE_CPUID, xfeature_nr,
> -				    &size, &offset, &ecx, &edx);
> -			if (xfeature_nr == XFEATURE_PKRU)
> -				memcpy(dest + offset, &vcpu->arch.pkru,
> -				       sizeof(vcpu->arch.pkru));
> -			else
> -				memcpy(dest + offset, src, size);
> +		cpuid_count(XSTATE_CPUID, xfeature_nr,
> +			    &size, &offset, &ecx, &edx);
>  
> +		if (xfeature_nr == XFEATURE_PKRU) {
> +			memcpy(dest + offset, &vcpu->arch.pkru,
> +			       sizeof(vcpu->arch.pkru));
> +		} else {
> +			src = get_xsave_addr(xsave, xfeature_nr);
> +			if (src)
> +				memcpy(dest + offset, src, size);
>  		}
>  
>  		valid -= xfeature_mask;

How about pulling up that PKRU case even further (pasting the whole
changed loop instead of an unreadable diff) and keeping the CPUID access
and the xsave address handling separate?

        valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
        while (valid) {
                u32 size, offset, ecx, edx;
                u64 xfeature_mask = valid & -valid;
                int xfeature_nr = fls64(xfeature_mask) - 1;
                void *src;

                if (xfeature_nr == XFEATURE_PKRU) {
                        memcpy(dest + offset, &vcpu->arch.pkru, sizeof(vcpu->arch.pkru));
                        continue;
                }

                cpuid_count(XSTATE_CPUID, xfeature_nr, &size, &offset, &ecx, &edx);

                src = get_xsave_addr(xsave, xfeature_nr);
                if (src)
                        memcpy(dest + offset, src, size);

                valid -= xfeature_mask;
        }

Btw, I'm wondering if that CPUID read in a loop can be replaced with
adding accessors for xstate_{offsets,sizes,..} etc and providing them to
kvm...


> @@ -4632,18 +4633,20 @@ static void load_xsave(struct kvm_vcpu *
>  	 */
>  	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
>  	while (valid) {
> +		u32 size, offset, ecx, edx;
>  		u64 xfeature_mask = valid & -valid;
>  		int xfeature_nr = fls64(xfeature_mask) - 1;
> -		void *dest = get_xsave_addr(xsave, xfeature_nr);
>  
> -		if (dest) {
> -			u32 size, offset, ecx, edx;
> -			cpuid_count(XSTATE_CPUID, xfeature_nr,
> -				    &size, &offset, &ecx, &edx);
> -			if (xfeature_nr == XFEATURE_PKRU)
> -				memcpy(&vcpu->arch.pkru, src + offset,
> -				       sizeof(vcpu->arch.pkru));
> -			else
> +		cpuid_count(XSTATE_CPUID, xfeature_nr,
> +			    &size, &offset, &ecx, &edx);
> +
> +		if (xfeature_nr == XFEATURE_PKRU) {
> +			memcpy(&vcpu->arch.pkru, src + offset,
> +			       sizeof(vcpu->arch.pkru));
> +		} else {
> +			void *dest = get_xsave_addr(xsave, xfeature_nr);
> +


^ Superfluous newline.

> +			if (dest)
>  				memcpy(dest, src + offset, size);
>  		}
>  
> 

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch 09/41] x86/kvm: Avoid looking up PKRU in XSAVE buffer
  2021-06-14 10:26   ` Borislav Petkov
@ 2021-06-14 19:34     ` Dave Hansen
  2021-06-15 10:09       ` Borislav Petkov
  0 siblings, 1 reply; 69+ messages in thread
From: Dave Hansen @ 2021-06-14 19:34 UTC (permalink / raw)
  To: Borislav Petkov, Thomas Gleixner
  Cc: LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Peter Zijlstra,
	Kan Liang

[-- Attachment #1: Type: text/plain, Size: 3482 bytes --]

On 6/14/21 3:26 AM, Borislav Petkov wrote:
> On Fri, Jun 11, 2021 at 06:15:32PM +0200, Thomas Gleixner wrote:
>> -		if (src) {
>> -			u32 size, offset, ecx, edx;
>> -			cpuid_count(XSTATE_CPUID, xfeature_nr,
>> -				    &size, &offset, &ecx, &edx);
>> -			if (xfeature_nr == XFEATURE_PKRU)
>> -				memcpy(dest + offset, &vcpu->arch.pkru,
>> -				       sizeof(vcpu->arch.pkru));
>> -			else
>> -				memcpy(dest + offset, src, size);
>> +		cpuid_count(XSTATE_CPUID, xfeature_nr,
>> +			    &size, &offset, &ecx, &edx);
>>  
>> +		if (xfeature_nr == XFEATURE_PKRU) {
>> +			memcpy(dest + offset, &vcpu->arch.pkru,
>> +			       sizeof(vcpu->arch.pkru));
>> +		} else {
>> +			src = get_xsave_addr(xsave, xfeature_nr);
>> +			if (src)
>> +				memcpy(dest + offset, src, size);
>>  		}
>>  
>>  		valid -= xfeature_mask;
> 
> How about pulling up that PKRU case even further (pasting the whole
> changed loop instead of an unreadable diff) and keeping the CPUID access
> and the xsave address handling separate?
> 
>         valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
>         while (valid) {
>                 u32 size, offset, ecx, edx;
>                 u64 xfeature_mask = valid & -valid;
>                 int xfeature_nr = fls64(xfeature_mask) - 1;
>                 void *src;
> 
>                 if (xfeature_nr == XFEATURE_PKRU) {
>                         memcpy(dest + offset, &vcpu->arch.pkru, sizeof(vcpu->arch.pkru));
>                         continue;
>                 }
> 
>                 cpuid_count(XSTATE_CPUID, xfeature_nr, &size, &offset, &ecx, &edx);
> 
>                 src = get_xsave_addr(xsave, xfeature_nr);
>                 if (src)
>                         memcpy(dest + offset, src, size);
> 
>                 valid -= xfeature_mask;
>         }

I gave that a shot.  Two wrinkles: the PKRU memcpy() needs 'offset' from
cpuid_count() and the PKRU case also needs the 'valid -=' manipulation.
The result is attached, and while it makes the diff look better, I
don't think the resulting code is an improvement.
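For illustration, a user-space sketch of the loop with both wrinkles folded in. The _stub() helpers and the linear layout are fabricated stand-ins for cpuid_count()/get_xsave_addr(), not the kernel's implementation:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XFEATURE_PKRU 9

static uint32_t vcpu_pkru = 0x55555554;
static uint8_t  fake_xsave[512];

/* Stand-in: pretend every component is 8 bytes at a linear offset */
static void cpuid_count_stub(int nr, uint32_t *size, uint32_t *offset)
{
	*size = 8;
	*offset = 64 + nr * 8;
}

static void *get_xsave_addr_stub(int nr)
{
	return &fake_xsave[64 + nr * 8];
}

/*
 * 'offset' is fetched before the PKRU copy, and 'valid' is decremented
 * on every iteration, so the PKRU case neither uses an uninitialized
 * offset nor loops forever.
 */
static void fill_xsave_sketch(uint8_t *dest, uint64_t xstate_bv)
{
	uint64_t valid = xstate_bv & ~3ULL;	/* skip FP/SSE */

	while (valid) {
		uint32_t size, offset;
		uint64_t xfeature_mask = valid & -valid;
		int xfeature_nr = __builtin_ctzll(xfeature_mask);

		cpuid_count_stub(xfeature_nr, &size, &offset);

		if (xfeature_nr == XFEATURE_PKRU) {
			memcpy(dest + offset, &vcpu_pkru, sizeof(vcpu_pkru));
		} else {
			void *src = get_xsave_addr_stub(xfeature_nr);
			if (src)
				memcpy(dest + offset, src, size);
		}
		valid -= xfeature_mask;
	}
}
```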

> Btw, I'm wondering if that CPUID read in a loop can be replaced with
> adding accessors for xstate_{offsets,sizes,..} etc and providing them to
> kvm...

I *think* these are already stored in xfeature_uncompacted_offset[].  It
would be a pretty simple matter to export it.  I just assumed that this
is a slow enough path that the KVM folks don't care.

>> @@ -4632,18 +4633,20 @@ static void load_xsave(struct kvm_vcpu *
>>  	 */
>>  	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
>>  	while (valid) {
>> +		u32 size, offset, ecx, edx;
>>  		u64 xfeature_mask = valid & -valid;
>>  		int xfeature_nr = fls64(xfeature_mask) - 1;
>> -		void *dest = get_xsave_addr(xsave, xfeature_nr);
>>  
>> -		if (dest) {
>> -			u32 size, offset, ecx, edx;
>> -			cpuid_count(XSTATE_CPUID, xfeature_nr,
>> -				    &size, &offset, &ecx, &edx);
>> -			if (xfeature_nr == XFEATURE_PKRU)
>> -				memcpy(&vcpu->arch.pkru, src + offset,
>> -				       sizeof(vcpu->arch.pkru));
>> -			else
>> +		cpuid_count(XSTATE_CPUID, xfeature_nr,
>> +			    &size, &offset, &ecx, &edx);
>> +
>> +		if (xfeature_nr == XFEATURE_PKRU) {
>> +			memcpy(&vcpu->arch.pkru, src + offset,
>> +			       sizeof(vcpu->arch.pkru));
>> +		} else {
>> +			void *dest = get_xsave_addr(xsave, xfeature_nr);
>> +
> 
> 
> ^ Superfluous newline.

I'm happy to change it, but I usually like to separate declarations from
pure code.  Although, I guess that's a bit inconsistent in that file.

[-- Attachment #2: pkrukvm.patch --]
[-- Type: text/x-patch, Size: 3007 bytes --]

commit a761b0a48fb62429bd98c9a1275ef3ce33f9925a
Author: Dave Hansen <dave.hansen@linux.intel.com>
Date:   Thu Jun 3 16:08:12 2021 -0700

    x86/kvm: Avoid looking up PKRU in XSAVE buffer
    
    PKRU is being removed from the kernel XSAVE/FPU buffers.  This removal
    will probably include warnings for code that looks up PKRU in those
    buffers.
    
    KVM currently looks up the location of PKRU but doesn't even use the
    pointer that it gets back.  Rework the code to avoid calling
    get_xsave_addr() except in cases where its result is actually used.
    
    This makes the code more clear and also avoids the inevitable PKRU
    warnings.
    
    This is probably a good cleanup and could go upstream independently
    of any PKRU rework.
    
    Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    
    --

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b594275d49b5..ed4c3d90a18b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4589,20 +4589,21 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *src = get_xsave_addr(xsave, xfeature_nr);
-
-		if (src) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(dest + offset, &vcpu->arch.pkru,
-				       sizeof(vcpu->arch.pkru));
-			else
-				memcpy(dest + offset, src, size);
+		void *src;
+
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
 
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(dest + offset, &vcpu->arch.pkru,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			src = get_xsave_addr(xsave, xfeature_nr);
+			if (src)
+				memcpy(dest + offset, src, size);
 		}
 
 		valid -= xfeature_mask;
@@ -4632,18 +4633,20 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
+		u32 size, offset, ecx, edx;
 		u64 xfeature_mask = valid & -valid;
 		int xfeature_nr = fls64(xfeature_mask) - 1;
-		void *dest = get_xsave_addr(xsave, xfeature_nr);
-
-		if (dest) {
-			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, xfeature_nr,
-				    &size, &offset, &ecx, &edx);
-			if (xfeature_nr == XFEATURE_PKRU)
-				memcpy(&vcpu->arch.pkru, src + offset,
-				       sizeof(vcpu->arch.pkru));
-			else
+
+		cpuid_count(XSTATE_CPUID, xfeature_nr,
+			    &size, &offset, &ecx, &edx);
+
+		if (xfeature_nr == XFEATURE_PKRU) {
+			memcpy(&vcpu->arch.pkru, src + offset,
+			       sizeof(vcpu->arch.pkru));
+		} else {
+			void *dest = get_xsave_addr(xsave, xfeature_nr);
+
+			if (dest)
 				memcpy(dest, src + offset, size);
 		}
 

^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [patch 09/41] x86/kvm: Avoid looking up PKRU in XSAVE buffer
  2021-06-14 19:34     ` Dave Hansen
@ 2021-06-15 10:09       ` Borislav Petkov
  0 siblings, 0 replies; 69+ messages in thread
From: Borislav Petkov @ 2021-06-15 10:09 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Thomas Gleixner, LKML, Andy Lutomirski, Dave Hansen, Fenghua Yu,
	Tony Luck, Yu-cheng Yu, Sebastian Andrzej Siewior,
	Peter Zijlstra, Kan Liang

On Mon, Jun 14, 2021 at 12:34:31PM -0700, Dave Hansen wrote:
> I gave that a shot.  Two wrinkles: The PKRU memcpy() needs 'offset' from
> cpuid_count() and the PKRU case also needs the 'valid -=' manipulation.
>  The result is attached, and while it makes the diff look better, I
> don't think the resulting code is an improvement.

Bah, that was too much wishful and faulty thinking on my part, forget
what I said.

> I *think* these are already stored in xfeature_uncompacted_offset[].  It
> would be a pretty simple matter to export it.  I just assumed that this
> is a slow enough path that the KVM folks don't care.

I guess. Yeah, let's cleanup the FPU mess first and then see what makes
sense or not.

> I'm happy to change it, but I usually like to separate declarations from
> pure code.  Although, I guess that's a bit inconsistent in that file.

No, this is what I mean:

+			src = get_xsave_addr(xsave, xfeature_nr);
+			if (src)
+				memcpy(dest + offset, src, size);

vs

+			void *dest = get_xsave_addr(xsave, xfeature_nr);
+
+			if (dest)
 				memcpy(dest, src + offset, size);

both in your patch.

It is a lot easier when reading the code to have the error handling
glued together with the previous function call.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing
  2021-06-12  0:24 ` [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
  2021-06-12  0:40   ` Dave Hansen
@ 2021-06-16 20:55   ` Dave Hansen
  2021-06-17  7:06     ` Thomas Gleixner
  1 sibling, 1 reply; 69+ messages in thread
From: Dave Hansen @ 2021-06-16 20:55 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On 6/11/21 5:24 PM, Thomas Gleixner wrote:
> The Intel SDM states in volume 1, chapter 13.6
> 
>  PROCESSOR TRACKING OF XSAVE-MANAGED STATE
> 
>   * PKRU state. PKRU state is in its initial configuration if the value
>     of the PKRU is 0.
> 
> But that's just not true.
> 
>         wrpkru(0)
>         assert(!(xgetbv(1) & XFEATURE_MASK_PKRU));

Hi Thomas,

It's pretty clear that Intel's implementation was intentional.  It was
certainly no accident that it was implemented this way.

I'm a bit confused why you expected to see XINUSE[PKRU]=0 in your
example.  The CPU is *free* to set XINUSE[PKRU]=0, but it appears that
the example expects that it *must* set XINUSE[PKRU]=0.

I do wish the SDM had been excruciatingly explicit in defining:

	initial configuration
vs.
	initial state

All features in their "initial state" have their "initial configuration"
values, but not all features with their "initial configuration" values
are in their "initial state".

> But the Intel SDM is blurry about this:
> 
>   XINUSE denotes the state-component bitmap corresponding to the init
>   optimization. If XINUSE[i] = 0, state component i is known to be in
>   its initial configuration; otherwise XINUSE[i] = 1. It is possible for
>   XINUSE[i] to be 1 even when state component i is in its initial
>   configuration. On a processor that does not support the init
>   optimization, XINUSE[i] is always 1 for every value of i.
> 
> IOW there is no consistency between XINUSE and the initial state
> guaranteed at all. So why should the kernel worry about this?

Exactly, there is really no consistency guarantee.  This was written to
give the CPU designers some flexibility so that they could opt to omit
"init tracker" hardware if they chose.  Or, so that they could be a bit
lazy about implementing one.

Imagine what would happen if the AMD PKRU init tracking behavior (write
all 0's, get XINUSE[PKRU]=0) was *required* XSAVE behavior. Every ZMM
register write would potentially need to go checking ~2k of state to see
if the rest of the state is all 0's.

> If anyone cares about consistency of XINUSE vs. the actual component
> state then please redirect the complaints to INTEL.

I think we can take a _bit_ of the blame on the kernel side too.  The
kernel has very good reasons for managing PKRU with WRPKRU instead of
XSAVE.  *But*, it also tossed out XINUSE[PKRU] consistency in the process.

I'm not sure we should be looking to the hardware to bring that back.

> Either the hardware folks get their act together or software which
> relies on consistency (cough, cough) like rr has to cope with it.
> 
> Making the kernel pretend that all of this is consistent under all
> circumstances is a futile attempt to ignore reality.
> 
> This inconsistency can only be fixed in hardware/ucode. End of story.

I agree with this.  If it's going to be fixed, the kernel simply doesn't
have the tools to do it.  We either need new ISA or new hardware/ucode.

I'm just not convinced it's worth fixing for PKRU.


* Re: [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing
  2021-06-16 20:55   ` Dave Hansen
@ 2021-06-17  7:06     ` Thomas Gleixner
  0 siblings, 0 replies; 69+ messages in thread
From: Thomas Gleixner @ 2021-06-17  7:06 UTC (permalink / raw)
  To: Dave Hansen, LKML
  Cc: Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck, Yu-cheng Yu,
	Sebastian Andrzej Siewior, Borislav Petkov, Peter Zijlstra,
	Kan Liang

On Wed, Jun 16 2021 at 13:55, Dave Hansen wrote:
> On 6/11/21 5:24 PM, Thomas Gleixner wrote:
>> Making the kernel pretend that all of this is consistent under all
>> circumstances is a futile attempt to ignore reality.
>> 
>> This inconsistency can only be fixed in hardware/ucode. End of story.
>
> I agree with this.  If it's going to be fixed, the kernel simply doesn't
> have the tools to do it.  We either need new ISA or new hardware/ucode.
>
> I'm just not convinced it's worth fixing for PKRU.

Me neither.


end of thread, other threads:[~2021-06-17  7:06 UTC | newest]

Thread overview: 69+ messages
2021-06-11 16:15 [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
2021-06-11 16:15 ` [patch 01/41] Revert a5eff7259790 ("x86/pkeys: Add PKRU value to init_fpstate") Thomas Gleixner
2021-06-11 17:04   ` Borislav Petkov
2021-06-11 16:15 ` [patch 02/41] x86/fpu: Mark various FPU states __ro_after_init Thomas Gleixner
2021-06-11 17:21   ` Borislav Petkov
2021-06-11 18:35   ` Andy Lutomirski
2021-06-11 16:15 ` [patch 03/41] x86/fpu: Remove unused get_xsave_field_ptr() Thomas Gleixner
2021-06-11 18:35   ` Andy Lutomirski
2021-06-11 16:15 ` [patch 04/41] x86/fpu: Move inlines where they belong Thomas Gleixner
2021-06-11 16:15 ` [patch 05/41] x86/fpu: Limit xstate copy size in xstateregs_set() Thomas Gleixner
2021-06-11 18:15   ` Borislav Petkov
2021-06-11 18:37   ` Andy Lutomirski
2021-06-11 19:37     ` Thomas Gleixner
2021-06-11 16:15 ` [patch 06/41] x86/fpu: Sanitize xstateregs_set() Thomas Gleixner
2021-06-11 18:45   ` Andy Lutomirski
2021-06-11 20:23     ` Thomas Gleixner
2021-06-11 16:15 ` [patch 07/41] x86/fpu: Simplify PTRACE_GETREGS code Thomas Gleixner
2021-06-11 18:47   ` Andy Lutomirski
2021-06-12  9:13   ` Borislav Petkov
2021-06-11 16:15 ` [patch 08/41] x86/fpu: Restrict fpstate sanitizing to legacy components Thomas Gleixner
2021-06-11 19:03   ` Andy Lutomirski
2021-06-11 19:18     ` Andy Lutomirski
2021-06-11 20:33       ` Thomas Gleixner
2021-06-11 20:34         ` Thomas Gleixner
2021-06-11 20:27     ` Thomas Gleixner
2021-06-11 22:12     ` Thomas Gleixner
2021-06-12 13:15       ` Thomas Gleixner
2021-06-12 22:05       ` Thomas Gleixner
2021-06-11 16:15 ` [patch 09/41] x86/kvm: Avoid looking up PKRU in XSAVE buffer Thomas Gleixner
2021-06-14 10:26   ` Borislav Petkov
2021-06-14 19:34     ` Dave Hansen
2021-06-15 10:09       ` Borislav Petkov
2021-06-11 16:15 ` [patch 10/41] x86/fpu: Cleanup arch_set_user_pkey_access() Thomas Gleixner
2021-06-11 16:15 ` [patch 11/41] x86/fpu: Get rid of copy_supervisor_to_kernel() Thomas Gleixner
2021-06-11 19:42   ` Andy Lutomirski
2021-06-11 16:15 ` [patch 12/41] x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs() Thomas Gleixner
2021-06-11 16:15 ` [patch 13/41] x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user() Thomas Gleixner
2021-06-11 16:15 ` [patch 14/41] x86/fpu: Rename fxregs related copy functions Thomas Gleixner
2021-06-11 16:15 ` [patch 15/41] x86/fpu: Rename fregs " Thomas Gleixner
2021-06-11 16:15 ` [patch 16/41] x86/fpu: Rename xstate copy functions which are related to UABI Thomas Gleixner
2021-06-11 16:15 ` [patch 17/41] x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate() Thomas Gleixner
2021-06-11 16:15 ` [patch 18/41] x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate() Thomas Gleixner
2021-06-11 16:15 ` [patch 19/41] x86/fpu: Rename copy_kernel_to_fpregs() to restore_fpregs_from_kernel() Thomas Gleixner
2021-06-11 16:15 ` [patch 20/41] x86/fpu: Rename initstate copy functions Thomas Gleixner
2021-06-11 16:15 ` [patch 21/41] x86/fpu: Rename "dynamic" XSTATEs to "independent" Thomas Gleixner
2021-06-11 16:15 ` [patch 22/41] x86/fpu/xstate: Sanitize handling of independent features Thomas Gleixner
2021-06-11 16:15 ` [patch 23/41] x86/pkeys: Move read_pkru() and write_pkru() Thomas Gleixner
2021-06-11 16:15 ` [patch 24/41] x86/fpu: Differentiate "copy" versus "move" of fpregs Thomas Gleixner
2021-06-11 16:15 ` [patch 25/41] x86/cpu: Sanitize X86_FEATURE_OSPKE Thomas Gleixner
2021-06-11 16:15 ` [patch 26/41] x86/pkru: Provide pkru_get_init_value() Thomas Gleixner
2021-06-11 16:15 ` [patch 27/41] x86/pkru: Provide pkru_write_default() Thomas Gleixner
2021-06-11 16:15 ` [patch 28/41] x86/cpu: Write the default PKRU value when enabling PKE Thomas Gleixner
2021-06-11 16:15 ` [patch 29/41] x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs() Thomas Gleixner
2021-06-11 16:15 ` [patch 30/41] x86/fpu: Rename fpu__clear_all() to fpu_flush_thread() Thomas Gleixner
2021-06-11 16:15 ` [patch 31/41] x86/fpu: Clean up the fpu__clear() variants Thomas Gleixner
2021-06-11 16:15 ` [patch 32/41] x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs() Thomas Gleixner
2021-06-11 16:15 ` [patch 33/41] x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs() Thomas Gleixner
2021-06-11 16:15 ` [patch 34/41] x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi() Thomas Gleixner
2021-06-11 16:15 ` [patch 35/41] x86/fpu: Dont restore PKRU in fpregs_restore_userspace() Thomas Gleixner
2021-06-11 16:15 ` [patch 36/41] x86/fpu: Add PKRU storage outside of task XSAVE buffer Thomas Gleixner
2021-06-11 16:16 ` [patch 37/41] x86/fpu: Hook up PKRU into ptrace() Thomas Gleixner
2021-06-11 16:16 ` [patch 38/41] x86/fpu: Mask PKRU from kernel XRSTOR[S] operations Thomas Gleixner
2021-06-11 16:16 ` [patch 39/41] x86/fpu: Remove PKRU handling from switch_fpu_finish() Thomas Gleixner
2021-06-11 16:16 ` [patch 40/41] x86/fpu: Dont store PKRU in xstate in fpu_reset_fpstate() Thomas Gleixner
2021-06-11 16:16 ` [patch 41/41] x86/pkru: Remove xstate fiddling from write_pkru() Thomas Gleixner
2021-06-12  0:24 ` [patch 00/41] x86/fpu: Spring cleaning and PKRU sanitizing Thomas Gleixner
2021-06-12  0:40   ` Dave Hansen
2021-06-16 20:55   ` Dave Hansen
2021-06-17  7:06     ` Thomas Gleixner
