linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout
@ 2022-03-24 13:47 Thomas Gleixner
  2022-03-24 13:47 ` [patch 1/7] x86/fpu: Remove redundant XCOMP_BV initialization Thomas Gleixner
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-24 13:47 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae, Paolo Bonzini, Fenghua Yu, Tony Luck

Folks!

The (re)enabling of ENQCMD or the enabling of any supervisor only state
results in a possible inconsistency of the host and guest FPU XSTATE layout
on systems which support that feature.

The reason is that on the host the FPU state has to include supervisor
features while the guest FPU state is strictly user features only.

The problem is restricted to a few places:

  1) The size calculation in the dynamic state permission PRCTL which lacks
     adding the supervisor state size for calculating the kernel buffer
     size.

  2) The offset calculations in the copy to/from UABI functions which
     use precalculated offsets which are only valid for the host.

The cure is to use runtime calculation for the offsets based on the
XCOMP_BV bitmask in the XSTATE header in case of compacted buffers. For
non-compacted format nothing changes.

The following series addresses this and it turns out to be an overall
valuable cleanup and simplification of the code:

 core.c   |    3 
 xstate.c |  211 ++++++++++++++++++---------------------------------------------
 2 files changed, 62 insertions(+), 152 deletions(-)

The result of the consolidation of the buffer size calculation (last patch
in the series) is a significant reduction of cycles spent for initializing
XSTATE due to the avoidance of a gazillion redundant CPUID invocations:

 Before: XSTATE init: 174344 cycles
 After:  XSTATE init:  73890 cycles

It builds, boots on host and guest, but is not yet extensively tested.
Testing with a AMX + PASID enabled machine has not been done at all as I
don't have easy access to such a beast.

The series is based on:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/urgent

and available from git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/fpu

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch 1/7] x86/fpu: Remove redundant XCOMP_BV initialization
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
@ 2022-03-24 13:47 ` Thomas Gleixner
  2022-03-31  9:01   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2022-03-24 13:47 ` [patch 2/7] x86/fpu: Remove unused supervisor only offsets Thomas Gleixner
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-24 13:47 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae, Paolo Bonzini

fpu_copy_uabi_to_guest_fpstate() initializes the XCOMP_BV field in the
XSAVE header. That's a leftover from the old KVM FPU buffer handling code.

Since d69c1382e1b7 ("x86/kvm: Convert FPU handling to a single swap
buffer") KVM uses the FPU core allocation code, which initializes the
XCOMP_BV field already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <bonzini@gnu.org>
---
 arch/x86/kernel/fpu/core.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -415,9 +415,6 @@ int fpu_copy_uabi_to_guest_fpstate(struc
 		xpkru = get_xsave_addr(&kstate->regs.xsave, XFEATURE_PKRU);
 		*vpkru = xpkru->pkru;
 	}
-
-	/* Ensure that XCOMP_BV is set up for XSAVES */
-	xstate_init_xcomp_bv(&kstate->regs.xsave, kstate->xfeatures);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(fpu_copy_uabi_to_guest_fpstate);


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch 2/7] x86/fpu: Remove unused supervisor only offsets
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
  2022-03-24 13:47 ` [patch 1/7] x86/fpu: Remove redundant XCOMP_BV initialization Thomas Gleixner
@ 2022-03-24 13:47 ` Thomas Gleixner
  2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2022-03-24 13:47 ` [patch 3/7] x86/fpu/xsave: Initialize offset/size cache early Thomas Gleixner
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-24 13:47 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae

No users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/xstate.c |   30 ------------------------------
 1 file changed, 30 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -83,8 +83,6 @@ static unsigned int xstate_sizes[XFEATUR
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] __ro_after_init =
-	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 
 /*
  * Return whether the system supports a given xfeature.
@@ -325,33 +323,6 @@ static void __init setup_xstate_comp_off
 }
 
 /*
- * Setup offsets of a supervisor-state-only XSAVES buffer:
- *
- * The offsets stored in xstate_comp_offsets[] only work for one specific
- * value of the Requested Feature BitMap (RFBM).  In cases where a different
- * RFBM value is used, a different set of offsets is required.  This set of
- * offsets is for when RFBM=xfeatures_mask_supervisor().
- */
-static void __init setup_supervisor_only_offsets(void)
-{
-	unsigned int next_offset;
-	int i;
-
-	next_offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
-
-	for_each_extended_xfeature(i, fpu_kernel_cfg.max_features) {
-		if (!xfeature_is_supervisor(i))
-			continue;
-
-		if (xfeature_is_aligned(i))
-			next_offset = ALIGN(next_offset, 64);
-
-		xstate_supervisor_only_offsets[i] = next_offset;
-		next_offset += xstate_sizes[i];
-	}
-}
-
-/*
  * Print out xstate component offsets and sizes
  */
 static void __init print_xstate_offset_size(void)
@@ -951,7 +922,6 @@ void __init fpu__init_system_xstate(unsi
 
 	setup_init_fpu_buf();
 	setup_xstate_comp_offsets();
-	setup_supervisor_only_offsets();
 
 	/*
 	 * Paranoia check whether something in the setup modified the


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch 3/7] x86/fpu/xsave: Initialize offset/size cache early
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
  2022-03-24 13:47 ` [patch 1/7] x86/fpu: Remove redundant XCOMP_BV initialization Thomas Gleixner
  2022-03-24 13:47 ` [patch 2/7] x86/fpu: Remove unused supervisor only offsets Thomas Gleixner
@ 2022-03-24 13:47 ` Thomas Gleixner
  2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2022-03-24 13:47 ` [patch 4/7] x86/fpu: Cache xfeature flags from CPUID Thomas Gleixner
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-24 13:47 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae

Reading XSTATE feature information from CPUID over and over does not make
sense. The information has to be cached anyway, so it can be done early.

This prepares for runtime calculation of XSTATE offsets and allows
consolidation of the size calculation functions in a later step.

Rename the function while at it as it does not setup any features.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/xstate.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -180,7 +180,7 @@ static bool xfeature_enabled(enum xfeatu
  * Record the offsets and sizes of various xstates contained
  * in the XSAVE state memory layout.
  */
-static void __init setup_xstate_features(void)
+static void __init setup_xstate_cache(void)
 {
 	u32 eax, ebx, ecx, edx, i;
 	/* start at the beginning of the "extended state" */
@@ -390,7 +390,6 @@ static void __init setup_init_fpu_buf(vo
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
 		return;
 
-	setup_xstate_features();
 	print_xstate_features();
 
 	xstate_init_xcomp_bv(&init_fpstate.regs.xsave, fpu_kernel_cfg.max_features);
@@ -906,6 +905,10 @@ void __init fpu__init_system_xstate(unsi
 
 	/* Enable xstate instructions to be able to continue with initialization: */
 	fpu__init_cpu_xstate();
+
+	/* Cache size, offset and flags for initialization */
+	setup_xstate_cache();
+
 	err = init_xstate_size();
 	if (err)
 		goto out_disable;


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch 4/7] x86/fpu: Cache xfeature flags from CPUID
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
                   ` (2 preceding siblings ...)
  2022-03-24 13:47 ` [patch 3/7] x86/fpu/xsave: Initialize offset/size cache early Thomas Gleixner
@ 2022-03-24 13:47 ` Thomas Gleixner
  2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2022-03-24 13:47 ` [patch 5/7] x86/fpu/xsave: Handle compacted offsets correctly with supervisor states Thomas Gleixner
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-24 13:47 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae

In preparation for runtime calculation of XSAVE offsets cache the feature
flags for each XSTATE component during feature enumeration via CPUID(0xD).

EDX has two relevant bits:
    0	Supervisor component
    1	Feature storage must be 64 byte aligned

These bits are currently only evaluated during init, but the alignment bit
must be cached to make runtime calculation of XSAVE offsets efficient.

Cache the full EDX content and use it for the existing alignment and
supervisor checks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/xstate.c |   49 +++++++++++--------------------------------
 1 file changed, 13 insertions(+), 36 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -83,6 +83,10 @@ static unsigned int xstate_sizes[XFEATUR
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_flags[XFEATURE_MAX] __ro_after_init;
+
+#define XSTATE_FLAG_SUPERVISOR	BIT(0)
+#define XSTATE_FLAG_ALIGNED64	BIT(1)
 
 /*
  * Return whether the system supports a given xfeature.
@@ -122,17 +126,14 @@ int cpu_has_xfeatures(u64 xfeatures_need
 }
 EXPORT_SYMBOL_GPL(cpu_has_xfeatures);
 
-static bool xfeature_is_supervisor(int xfeature_nr)
+static bool xfeature_is_aligned64(int xfeature_nr)
 {
-	/*
-	 * Extended State Enumeration Sub-leaves (EAX = 0DH, ECX = n, n > 1)
-	 * returns ECX[0] set to (1) for a supervisor state, and cleared (0)
-	 * for a user state.
-	 */
-	u32 eax, ebx, ecx, edx;
+	return xstate_flags[xfeature_nr] & XSTATE_FLAG_ALIGNED64;
+}
 
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
-	return ecx & 1;
+static bool xfeature_is_supervisor(int xfeature_nr)
+{
+	return xstate_flags[xfeature_nr] & XSTATE_FLAG_SUPERVISOR;
 }
 
 /*
@@ -203,6 +204,7 @@ static void __init setup_xstate_cache(vo
 		cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx);
 
 		xstate_sizes[i] = eax;
+		xstate_flags[i] = ecx;
 
 		/*
 		 * If an xfeature is supervisor state, the offset in EBX is
@@ -262,31 +264,6 @@ static void __init print_xstate_features
 } while (0)
 
 /*
- * We could cache this like xstate_size[], but we only use
- * it here, so it would be a waste of space.
- */
-static int xfeature_is_aligned(int xfeature_nr)
-{
-	u32 eax, ebx, ecx, edx;
-
-	CHECK_XFEATURE(xfeature_nr);
-
-	if (!xfeature_enabled(xfeature_nr)) {
-		WARN_ONCE(1, "Checking alignment of disabled xfeature %d\n",
-			  xfeature_nr);
-		return 0;
-	}
-
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
-	/*
-	 * The value returned by ECX[1] indicates the alignment
-	 * of state component 'i' when the compacted format
-	 * of the extended region of an XSAVE area is used:
-	 */
-	return !!(ecx & 2);
-}
-
-/*
  * This function sets up offsets and sizes of all extended states in
  * xsave area. This supports both standard format and compacted format
  * of the xsave area.
@@ -314,7 +291,7 @@ static void __init setup_xstate_comp_off
 	next_offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
 
 	for_each_extended_xfeature(i, fpu_kernel_cfg.max_features) {
-		if (xfeature_is_aligned(i))
+		if (xfeature_is_aligned64(i))
 			next_offset = ALIGN(next_offset, 64);
 
 		xstate_comp_offsets[i] = next_offset;
@@ -619,7 +596,7 @@ static unsigned int xstate_calculate_siz
 
 	for_each_extended_xfeature(i, xfeatures) {
 		/* Align from the end of the previous feature */
-		if (xfeature_is_aligned(i))
+		if (xfeature_is_aligned64(i))
 			size = ALIGN(size, 64);
 		/*
 		 * In compacted format the enabled features are packed,


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch 5/7] x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
                   ` (3 preceding siblings ...)
  2022-03-24 13:47 ` [patch 4/7] x86/fpu: Cache xfeature flags from CPUID Thomas Gleixner
@ 2022-03-24 13:47 ` Thomas Gleixner
  2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2022-03-24 13:47 ` [patch 6/7] x86/fpu/xstate: Handle supervisor states in XSTATE permissions Thomas Gleixner
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-24 13:47 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae, Paolo Bonzini, Fenghua Yu, Tony Luck

So far the cached fixed compacted offsets worked, but with (re)enabling of
ENQCMD this does not longer work with KVM fpstate.

KVM does not have supervisor features enabled for the guest FPU, which
means that KVM has then a different XSAVE area layout than the host FPU
state. This in turn breaks the copy from/to UABI functions when invoked for
a guest state.

Remove the precalculated compacted offsets and calculate the offset of each
component at runtime based on the XCOMP_BV field in the XSAVE header.

The runtime overhead is not interesting because these copy from/to UABI
functions are not used in critical fast paths. KVM uses them to save and
restore FPU state during migration. The host uses them for ptrace and for
the slow path of 32bit signal handling.

Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <bonzini@gnu.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/fpu/xstate.c |   86 ++++++++++++++++++++-----------------------
 1 file changed, 41 insertions(+), 45 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -81,8 +81,6 @@ static unsigned int xstate_offsets[XFEAT
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
-	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_flags[XFEATURE_MAX] __ro_after_init;
 
 #define XSTATE_FLAG_SUPERVISOR	BIT(0)
@@ -136,6 +134,33 @@ static bool xfeature_is_supervisor(int x
 	return xstate_flags[xfeature_nr] & XSTATE_FLAG_SUPERVISOR;
 }
 
+static unsigned int xfeature_get_offset(u64 xcomp_bv, int xfeature)
+{
+	unsigned int offs, i;
+
+	/*
+	 * Non-compacted format and legacy features use the cached fixed
+	 * offsets.
+	 */
+	if (!cpu_feature_enabled(X86_FEATURE_XSAVES) || xfeature <= XFEATURE_SSE)
+		return xstate_offsets[xfeature];
+
+	/*
+	 * Compacted format offsets depend on the actual content of the
+	 * compacted xsave area which is determined by the xcomp_bv header
+	 * field.
+	 */
+	offs = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+	for_each_extended_xfeature(i, xcomp_bv) {
+		if (xfeature_is_aligned64(i))
+			offs = ALIGN(offs, 64);
+		if (i == xfeature)
+			break;
+		offs += xstate_sizes[i];
+	}
+	return offs;
+}
+
 /*
  * Enable the extended processor state save/restore feature.
  * Called once per CPU onlining.
@@ -264,42 +289,6 @@ static void __init print_xstate_features
 } while (0)
 
 /*
- * This function sets up offsets and sizes of all extended states in
- * xsave area. This supports both standard format and compacted format
- * of the xsave area.
- */
-static void __init setup_xstate_comp_offsets(void)
-{
-	unsigned int next_offset;
-	int i;
-
-	/*
-	 * The FP xstates and SSE xstates are legacy states. They are always
-	 * in the fixed offsets in the xsave area in either compacted form
-	 * or standard form.
-	 */
-	xstate_comp_offsets[XFEATURE_FP] = 0;
-	xstate_comp_offsets[XFEATURE_SSE] = offsetof(struct fxregs_state,
-						     xmm_space);
-
-	if (!cpu_feature_enabled(X86_FEATURE_XSAVES)) {
-		for_each_extended_xfeature(i, fpu_kernel_cfg.max_features)
-			xstate_comp_offsets[i] = xstate_offsets[i];
-		return;
-	}
-
-	next_offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
-
-	for_each_extended_xfeature(i, fpu_kernel_cfg.max_features) {
-		if (xfeature_is_aligned64(i))
-			next_offset = ALIGN(next_offset, 64);
-
-		xstate_comp_offsets[i] = next_offset;
-		next_offset += xstate_sizes[i];
-	}
-}
-
-/*
  * Print out xstate component offsets and sizes
  */
 static void __init print_xstate_offset_size(void)
@@ -308,7 +297,8 @@ static void __init print_xstate_offset_s
 
 	for_each_extended_xfeature(i, fpu_kernel_cfg.max_features) {
 		pr_info("x86/fpu: xstate_offset[%d]: %4d, xstate_sizes[%d]: %4d\n",
-			 i, xstate_comp_offsets[i], i, xstate_sizes[i]);
+			i, xfeature_get_offset(fpu_kernel_cfg.max_features, i),
+			i, xstate_sizes[i]);
 	}
 }
 
@@ -901,7 +891,6 @@ void __init fpu__init_system_xstate(unsi
 				  fpu_user_cfg.max_features);
 
 	setup_init_fpu_buf();
-	setup_xstate_comp_offsets();
 
 	/*
 	 * Paranoia check whether something in the setup modified the
@@ -956,13 +945,19 @@ void fpu__resume_cpu(void)
  */
 static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
 {
-	if (!xfeature_enabled(xfeature_nr)) {
-		WARN_ON_FPU(1);
+	u64 xcomp_bv = xsave->header.xcomp_bv;
+
+	if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
 		return NULL;
+
+	if (cpu_feature_enabled(X86_FEATURE_XSAVES)) {
+		if (WARN_ON_ONCE(!(xcomp_bv & BIT_ULL(xfeature_nr))))
+			return NULL;
 	}
 
-	return (void *)xsave + xstate_comp_offsets[xfeature_nr];
+	return (void *)xsave + xfeature_get_offset(xcomp_bv, xfeature_nr);
 }
+
 /*
  * Given the xsave area and a state inside, this function returns the
  * address of the state.
@@ -993,8 +988,9 @@ void *get_xsave_addr(struct xregs_state
 	 * We should not ever be requesting features that we
 	 * have not enabled.
 	 */
-	WARN_ONCE(!(fpu_kernel_cfg.max_features & BIT_ULL(xfeature_nr)),
-		  "get of unsupported state");
+	if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
+		return NULL;
+
 	/*
 	 * This assumes the last 'xsave*' instruction to
 	 * have requested that 'xfeature_nr' be saved.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch 6/7] x86/fpu/xstate: Handle supervisor states in XSTATE permissions
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
                   ` (4 preceding siblings ...)
  2022-03-24 13:47 ` [patch 5/7] x86/fpu/xsave: Handle compacted offsets correctly with supervisor states Thomas Gleixner
@ 2022-03-24 13:47 ` Thomas Gleixner
  2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2022-03-24 13:47 ` [patch 7/7] x86/fpu/xstate: Consolidate size calculations Thomas Gleixner
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-24 13:47 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae, Paolo Bonzini, Fenghua Yu, Tony Luck

The size calculation in __xstate_request_perm() fails to take supervisor
states into account because the permission bitmap is only relevant for user
states.

Up to 5.17 this does not matter because there are no supervisor states
supported, but the (re)enabling of ENQCMD makes them available.

Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <bonzini@gnu.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/fpu/xstate.c |    3 +++
 1 file changed, 3 insertions(+)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1566,6 +1566,9 @@ static int __xstate_request_perm(u64 per
 
 	/* Calculate the resulting kernel state size */
 	mask = permitted | requested;
+	/* Take supervisor states into account on the host */
+	if (!guest)
+		mask |= xfeatures_mask_supervisor();
 	ksize = xstate_calculate_size(mask, compacted);
 
 	/* Calculate the resulting user state size */


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch 7/7] x86/fpu/xstate: Consolidate size calculations
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
                   ` (5 preceding siblings ...)
  2022-03-24 13:47 ` [patch 6/7] x86/fpu/xstate: Handle supervisor states in XSTATE permissions Thomas Gleixner
@ 2022-03-24 13:47 ` Thomas Gleixner
  2022-03-28 18:43   ` [patch V2 " Thomas Gleixner
  2022-03-28 12:39 ` [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Chang S. Bae
  2022-03-28 22:30 ` Fenghua Yu
  8 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-24 13:47 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae

Use the offset calculation to do the size calculation which avoids yet
another series of CPUID instructions for each invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/xstate.c |   46 ++++---------------------------------------
 1 file changed, 5 insertions(+), 41 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -385,25 +385,6 @@ static void __init setup_init_fpu_buf(vo
 	fxsave(&init_fpstate.regs.fxsave);
 }
 
-static int xfeature_uncompacted_offset(int xfeature_nr)
-{
-	u32 eax, ebx, ecx, edx;
-
-	/*
-	 * Only XSAVES supports supervisor states and it uses compacted
-	 * format. Checking a supervisor state's uncompacted offset is
-	 * an error.
-	 */
-	if (XFEATURE_MASK_SUPERVISOR_ALL & BIT_ULL(xfeature_nr)) {
-		WARN_ONCE(1, "No fixed offset for xstate %d\n", xfeature_nr);
-		return -1;
-	}
-
-	CHECK_XFEATURE(xfeature_nr);
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
-	return ebx;
-}
-
 int xfeature_size(int xfeature_nr)
 {
 	u32 eax, ebx, ecx, edx;
@@ -581,29 +562,12 @@ static bool __init check_xstate_against_
 
 static unsigned int xstate_calculate_size(u64 xfeatures, bool compacted)
 {
-	unsigned int size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
-	int i;
+	unsigned int topmost = fls64(xfeatures) -  1;
+	unsigned int offset = xstate_offsets[topmost];
 
-	for_each_extended_xfeature(i, xfeatures) {
-		/* Align from the end of the previous feature */
-		if (xfeature_is_aligned64(i))
-			size = ALIGN(size, 64);
-		/*
-		 * In compacted format the enabled features are packed,
-		 * i.e. disabled features do not occupy space.
-		 *
-		 * In non-compacted format the offsets are fixed and
-		 * disabled states still occupy space in the memory buffer.
-		 */
-		if (!compacted)
-			size = xfeature_uncompacted_offset(i);
-		/*
-		 * Add the feature size even for non-compacted format
-		 * to make the end result correct
-		 */
-		size += xfeature_size(i);
-	}
-	return size;
+	if (compacted)
+		offset = xfeature_get_offset(xfeatures, topmost);
+	return offset + xstate_sizes[topmost];
 }
 
 /*


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
                   ` (6 preceding siblings ...)
  2022-03-24 13:47 ` [patch 7/7] x86/fpu/xstate: Consolidate size calculations Thomas Gleixner
@ 2022-03-28 12:39 ` Chang S. Bae
  2022-03-28 22:30 ` Fenghua Yu
  8 siblings, 0 replies; 18+ messages in thread
From: Chang S. Bae @ 2022-03-28 12:39 UTC (permalink / raw)
  To: Thomas Gleixner, LKML; +Cc: x86, Paolo Bonzini, Fenghua Yu, Tony Luck

On 3/24/2022 6:47 AM, Thomas Gleixner wrote:
> 
> It builds, boots on host and guest, but is not yet extensively tested.
> Testing with a AMX + PASID enabled machine has not been done at all as I
> don't have easy access to such a beast.

The AMX tests [1][2] went well on both host and guest. Also dmesg logs 
look to be okay:

On host,

$ sudo dmesg | grep fpu
[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating 
point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x020: 'AVX-512 opmask'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x040: 'AVX-512 Hi256'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x080: 'AVX-512 ZMM_Hi256'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys 
User registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x400: 'PASID state'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x20000: 'AMX Tile config'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x40000: 'AMX Tile data'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: xstate_offset[5]:  832, xstate_sizes[5]:   64
[    0.000000] x86/fpu: xstate_offset[6]:  896, xstate_sizes[6]:  512
[    0.000000] x86/fpu: xstate_offset[7]: 1408, xstate_sizes[7]: 1024
[    0.000000] x86/fpu: xstate_offset[9]: 2432, xstate_sizes[9]:    8
[    0.000000] x86/fpu: xstate_offset[10]: 2440, xstate_sizes[10]:    8
[    0.000000] x86/fpu: xstate_offset[17]: 2496, xstate_sizes[17]:   64
[    0.000000] x86/fpu: xstate_offset[18]: 2560, xstate_sizes[18]: 8192
[    0.000000] x86/fpu: Enabled xstate features 0x606e7, context size is 
10752 bytes, using 'compacted' format.

On guest,
(AMX is available with the upstream Qemu [3].)

$ sudo dmesg | grep fpu
...
[    0.000000] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys 
User registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x20000: 'AMX Tile config'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x40000: 'AMX Tile data'
...
[    0.000000] x86/fpu: xstate_offset[9]: 2432, xstate_sizes[9]:    8
[    0.000000] x86/fpu: xstate_offset[17]: 2496, xstate_sizes[17]:   64
[    0.000000] x86/fpu: xstate_offset[18]: 2560, xstate_sizes[18]: 8192
[    0.000000] x86/fpu: Enabled xstate features 0x602e7, context size is 
10752 bytes, using 'compacted' format.

Thanks,
Chang

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git/tree/tools/testing/selftests/x86/amx.c?h=x86/fpu
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git/tree/tools/testing/selftests/kvm/x86_64/amx_test.c?h=x86/fpu
[3] 
https://git.qemu.org/?p=qemu.git;a=commit;h=3d31fe4d662f13c70eb7e87f29513623ccd76322

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [patch V2 7/7] x86/fpu/xstate: Consolidate size calculations
  2022-03-24 13:47 ` [patch 7/7] x86/fpu/xstate: Consolidate size calculations Thomas Gleixner
@ 2022-03-28 18:43   ` Thomas Gleixner
  2022-03-31  9:00     ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2022-03-28 18:43 UTC (permalink / raw)
  To: LKML; +Cc: x86, Chang S. Bae

Use the offset calculation to do the size calculation which avoids yet
another series of CPUID instructions for each invocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Fix the FP/SSE only case which missed to take the xstate header into
    account. Reported-by: kernel test robot <oliver.sang@intel.com>
---
 arch/x86/kernel/fpu/xstate.c |   49 +++++++------------------------------------
 1 file changed, 8 insertions(+), 41 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -385,25 +385,6 @@ static void __init setup_init_fpu_buf(vo
 	fxsave(&init_fpstate.regs.fxsave);
 }
 
-static int xfeature_uncompacted_offset(int xfeature_nr)
-{
-	u32 eax, ebx, ecx, edx;
-
-	/*
-	 * Only XSAVES supports supervisor states and it uses compacted
-	 * format. Checking a supervisor state's uncompacted offset is
-	 * an error.
-	 */
-	if (XFEATURE_MASK_SUPERVISOR_ALL & BIT_ULL(xfeature_nr)) {
-		WARN_ONCE(1, "No fixed offset for xstate %d\n", xfeature_nr);
-		return -1;
-	}
-
-	CHECK_XFEATURE(xfeature_nr);
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
-	return ebx;
-}
-
 int xfeature_size(int xfeature_nr)
 {
 	u32 eax, ebx, ecx, edx;
@@ -581,29 +562,15 @@ static bool __init check_xstate_against_
 
 static unsigned int xstate_calculate_size(u64 xfeatures, bool compacted)
 {
-	unsigned int size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
-	int i;
+	unsigned int topmost = fls64(xfeatures) -  1;
+	unsigned int offset = xstate_offsets[topmost];
 
-	for_each_extended_xfeature(i, xfeatures) {
-		/* Align from the end of the previous feature */
-		if (xfeature_is_aligned64(i))
-			size = ALIGN(size, 64);
-		/*
-		 * In compacted format the enabled features are packed,
-		 * i.e. disabled features do not occupy space.
-		 *
-		 * In non-compacted format the offsets are fixed and
-		 * disabled states still occupy space in the memory buffer.
-		 */
-		if (!compacted)
-			size = xfeature_uncompacted_offset(i);
-		/*
-		 * Add the feature size even for non-compacted format
-		 * to make the end result correct
-		 */
-		size += xfeature_size(i);
-	}
-	return size;
+	if (topmost <= XFEATURE_SSE)
+		return sizeof(struct xregs_state);
+
+	if (compacted)
+		offset = xfeature_get_offset(xfeatures, topmost);
+	return offset + xstate_sizes[topmost];
 }
 
 /*

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout
  2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
                   ` (7 preceding siblings ...)
  2022-03-28 12:39 ` [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Chang S. Bae
@ 2022-03-28 22:30 ` Fenghua Yu
  8 siblings, 0 replies; 18+ messages in thread
From: Fenghua Yu @ 2022-03-28 22:30 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, x86, Chang S. Bae, Paolo Bonzini, Tony Luck

Hi, Thomas,

On Thu, Mar 24, 2022 at 02:47:07PM +0100, Thomas Gleixner wrote:
> It builds, boots on host and guest, but is not yet extensively tested.
> Testing with a AMX + PASID enabled machine has not been done at all as I
> don't have easy access to such a beast.

I run many copies of dsa_test [1] (i.e. multi-threading and multi-tasking)
to stressfully exercise FPU PASID state context switch on native.
No issue is found so far.

1. https://github.com/intel/idxd-config

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [tip: x86/urgent] x86/fpu/xstate: Consolidate size calculations
  2022-03-28 18:43   ` [patch V2 " Thomas Gleixner
@ 2022-03-31  9:00     ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2022-03-31  9:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kernel test robot, Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     d6d6d50f1e801a790a242c80eeda261e36c43b7b
Gitweb:        https://git.kernel.org/tip/d6d6d50f1e801a790a242c80eeda261e36c43b7b
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 28 Mar 2022 20:43:21 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 30 Mar 2022 12:04:09 +02:00

x86/fpu/xstate: Consolidate size calculations

Use the offset calculation to do the size calculation which avoids yet
another series of CPUID instructions for each invocation.

  [ Fix the FP/SSE only case which missed to take the xstate
    header into account, as
    Reported-by: kernel test robot <oliver.sang@intel.com> ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/87o81pgbp2.ffs@tglx
---
 arch/x86/kernel/fpu/xstate.c | 49 +++++------------------------------
 1 file changed, 8 insertions(+), 41 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 5ac934b..39e1c86 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -385,25 +385,6 @@ static void __init setup_init_fpu_buf(void)
 	fxsave(&init_fpstate.regs.fxsave);
 }
 
-static int xfeature_uncompacted_offset(int xfeature_nr)
-{
-	u32 eax, ebx, ecx, edx;
-
-	/*
-	 * Only XSAVES supports supervisor states and it uses compacted
-	 * format. Checking a supervisor state's uncompacted offset is
-	 * an error.
-	 */
-	if (XFEATURE_MASK_SUPERVISOR_ALL & BIT_ULL(xfeature_nr)) {
-		WARN_ONCE(1, "No fixed offset for xstate %d\n", xfeature_nr);
-		return -1;
-	}
-
-	CHECK_XFEATURE(xfeature_nr);
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
-	return ebx;
-}
-
 int xfeature_size(int xfeature_nr)
 {
 	u32 eax, ebx, ecx, edx;
@@ -581,29 +562,15 @@ static bool __init check_xstate_against_struct(int nr)
 
 static unsigned int xstate_calculate_size(u64 xfeatures, bool compacted)
 {
-	unsigned int size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
-	int i;
+	unsigned int topmost = fls64(xfeatures) -  1;
+	unsigned int offset = xstate_offsets[topmost];
 
-	for_each_extended_xfeature(i, xfeatures) {
-		/* Align from the end of the previous feature */
-		if (xfeature_is_aligned64(i))
-			size = ALIGN(size, 64);
-		/*
-		 * In compacted format the enabled features are packed,
-		 * i.e. disabled features do not occupy space.
-		 *
-		 * In non-compacted format the offsets are fixed and
-		 * disabled states still occupy space in the memory buffer.
-		 */
-		if (!compacted)
-			size = xfeature_uncompacted_offset(i);
-		/*
-		 * Add the feature size even for non-compacted format
-		 * to make the end result correct
-		 */
-		size += xfeature_size(i);
-	}
-	return size;
+	if (topmost <= XFEATURE_SSE)
+		return sizeof(struct xregs_state);
+
+	if (compacted)
+		offset = xfeature_get_offset(xfeatures, topmost);
+	return offset + xstate_sizes[topmost];
 }
 
 /*

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: x86/urgent] x86/fpu/xstate: Handle supervisor states in XSTATE permissions
  2022-03-24 13:47 ` [patch 6/7] x86/fpu/xstate: Handle supervisor states in XSTATE permissions Thomas Gleixner
@ 2022-03-31  9:00   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2022-03-31  9:00 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     781c64bfcb735960717d1cb45428047ff6a5030c
Gitweb:        https://git.kernel.org/tip/781c64bfcb735960717d1cb45428047ff6a5030c
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 24 Mar 2022 14:47:14 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 30 Mar 2022 11:35:22 +02:00

x86/fpu/xstate: Handle supervisor states in XSTATE permissions

The size calculation in __xstate_request_perm() fails to take supervisor
states into account because the permission bitmap is only relevant for user
states.

Up to 5.17 this does not matter because there are no supervisor states
supported, but the (re-)enabling of ENQCMD makes them available.

Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220324134623.681768598@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c55f72e..5ac934b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1571,6 +1571,9 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest)
 
 	/* Calculate the resulting kernel state size */
 	mask = permitted | requested;
+	/* Take supervisor states into account on the host */
+	if (!guest)
+		mask |= xfeatures_mask_supervisor();
 	ksize = xstate_calculate_size(mask, compacted);
 
 	/* Calculate the resulting user state size */

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: x86/urgent] x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
  2022-03-24 13:47 ` [patch 5/7] x86/fpu/xsave: Handle compacted offsets correctly with supervisor states Thomas Gleixner
@ 2022-03-31  9:00   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2022-03-31  9:00 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     7aa5128b5fea26cf224766303ea3b8df343f9a87
Gitweb:        https://git.kernel.org/tip/7aa5128b5fea26cf224766303ea3b8df343f9a87
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 24 Mar 2022 14:47:13 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 30 Mar 2022 11:20:36 +02:00

x86/fpu/xsave: Handle compacted offsets correctly with supervisor states

So far the cached fixed compacted offsets worked, but with (re-)enabling
of ENQCMD this does no longer work with KVM fpstate.

KVM does not have supervisor features enabled for the guest FPU, which
means that KVM has then a different XSAVE area layout than the host FPU
state. This in turn breaks the copy from/to UABI functions when invoked for
a guest state.

Remove the pre-calculated compacted offsets and calculate the offset
of each component at runtime based on the XCOMP_BV field in the XSAVE
header.

The runtime overhead is not interesting because these copy from/to UABI
functions are not used in critical fast paths. KVM uses them to save and
restore FPU state during migration. The host uses them for ptrace and for
the slow path of 32bit signal handling.

Fixes: 7c1ef59145f1 ("x86/cpufeatures: Re-enable ENQCMD")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220324134623.627636809@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 86 ++++++++++++++++-------------------
 1 file changed, 41 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 5a069c2..c55f72e 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -81,8 +81,6 @@ static unsigned int xstate_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
-	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_flags[XFEATURE_MAX] __ro_after_init;
 
 #define XSTATE_FLAG_SUPERVISOR	BIT(0)
@@ -136,6 +134,33 @@ static bool xfeature_is_supervisor(int xfeature_nr)
 	return xstate_flags[xfeature_nr] & XSTATE_FLAG_SUPERVISOR;
 }
 
+static unsigned int xfeature_get_offset(u64 xcomp_bv, int xfeature)
+{
+	unsigned int offs, i;
+
+	/*
+	 * Non-compacted format and legacy features use the cached fixed
+	 * offsets.
+	 */
+	if (!cpu_feature_enabled(X86_FEATURE_XSAVES) || xfeature <= XFEATURE_SSE)
+		return xstate_offsets[xfeature];
+
+	/*
+	 * Compacted format offsets depend on the actual content of the
+	 * compacted xsave area which is determined by the xcomp_bv header
+	 * field.
+	 */
+	offs = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+	for_each_extended_xfeature(i, xcomp_bv) {
+		if (xfeature_is_aligned64(i))
+			offs = ALIGN(offs, 64);
+		if (i == xfeature)
+			break;
+		offs += xstate_sizes[i];
+	}
+	return offs;
+}
+
 /*
  * Enable the extended processor state save/restore feature.
  * Called once per CPU onlining.
@@ -264,42 +289,6 @@ static void __init print_xstate_features(void)
 } while (0)
 
 /*
- * This function sets up offsets and sizes of all extended states in
- * xsave area. This supports both standard format and compacted format
- * of the xsave area.
- */
-static void __init setup_xstate_comp_offsets(void)
-{
-	unsigned int next_offset;
-	int i;
-
-	/*
-	 * The FP xstates and SSE xstates are legacy states. They are always
-	 * in the fixed offsets in the xsave area in either compacted form
-	 * or standard form.
-	 */
-	xstate_comp_offsets[XFEATURE_FP] = 0;
-	xstate_comp_offsets[XFEATURE_SSE] = offsetof(struct fxregs_state,
-						     xmm_space);
-
-	if (!cpu_feature_enabled(X86_FEATURE_XSAVES)) {
-		for_each_extended_xfeature(i, fpu_kernel_cfg.max_features)
-			xstate_comp_offsets[i] = xstate_offsets[i];
-		return;
-	}
-
-	next_offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
-
-	for_each_extended_xfeature(i, fpu_kernel_cfg.max_features) {
-		if (xfeature_is_aligned64(i))
-			next_offset = ALIGN(next_offset, 64);
-
-		xstate_comp_offsets[i] = next_offset;
-		next_offset += xstate_sizes[i];
-	}
-}
-
-/*
  * Print out xstate component offsets and sizes
  */
 static void __init print_xstate_offset_size(void)
@@ -308,7 +297,8 @@ static void __init print_xstate_offset_size(void)
 
 	for_each_extended_xfeature(i, fpu_kernel_cfg.max_features) {
 		pr_info("x86/fpu: xstate_offset[%d]: %4d, xstate_sizes[%d]: %4d\n",
-			 i, xstate_comp_offsets[i], i, xstate_sizes[i]);
+			i, xfeature_get_offset(fpu_kernel_cfg.max_features, i),
+			i, xstate_sizes[i]);
 	}
 }
 
@@ -901,7 +891,6 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
 				  fpu_user_cfg.max_features);
 
 	setup_init_fpu_buf();
-	setup_xstate_comp_offsets();
 
 	/*
 	 * Paranoia check whether something in the setup modified the
@@ -956,13 +945,19 @@ void fpu__resume_cpu(void)
  */
 static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
 {
-	if (!xfeature_enabled(xfeature_nr)) {
-		WARN_ON_FPU(1);
+	u64 xcomp_bv = xsave->header.xcomp_bv;
+
+	if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
 		return NULL;
+
+	if (cpu_feature_enabled(X86_FEATURE_XSAVES)) {
+		if (WARN_ON_ONCE(!(xcomp_bv & BIT_ULL(xfeature_nr))))
+			return NULL;
 	}
 
-	return (void *)xsave + xstate_comp_offsets[xfeature_nr];
+	return (void *)xsave + xfeature_get_offset(xcomp_bv, xfeature_nr);
 }
+
 /*
  * Given the xsave area and a state inside, this function returns the
  * address of the state.
@@ -993,8 +988,9 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
 	 * We should not ever be requesting features that we
 	 * have not enabled.
 	 */
-	WARN_ONCE(!(fpu_kernel_cfg.max_features & BIT_ULL(xfeature_nr)),
-		  "get of unsupported state");
+	if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
+		return NULL;
+
 	/*
 	 * This assumes the last 'xsave*' instruction to
 	 * have requested that 'xfeature_nr' be saved.

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: x86/urgent] x86/fpu: Cache xfeature flags from CPUID
  2022-03-24 13:47 ` [patch 4/7] x86/fpu: Cache xfeature flags from CPUID Thomas Gleixner
@ 2022-03-31  9:00   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2022-03-31  9:00 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     6afbb58cc2251c1d83472ca3005638206e73b6b8
Gitweb:        https://git.kernel.org/tip/6afbb58cc2251c1d83472ca3005638206e73b6b8
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 24 Mar 2022 14:47:12 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 30 Mar 2022 11:05:32 +02:00

x86/fpu: Cache xfeature flags from CPUID

In preparation for runtime calculation of XSAVE offsets cache the feature
flags for each XSTATE component during feature enumeration via CPUID(0xD).

EDX has two relevant bits:
    0	Supervisor component
    1	Feature storage must be 64 byte aligned

These bits are currently only evaluated during init, but the alignment bit
must be cached to make runtime calculation of XSAVE offsets efficient.

Cache the full EDX content and use it for the existing alignment and
supervisor checks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220324134623.573656209@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 49 +++++++++--------------------------
 1 file changed, 13 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 814c2fd..5a069c2 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -83,6 +83,10 @@ static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
+static unsigned int xstate_flags[XFEATURE_MAX] __ro_after_init;
+
+#define XSTATE_FLAG_SUPERVISOR	BIT(0)
+#define XSTATE_FLAG_ALIGNED64	BIT(1)
 
 /*
  * Return whether the system supports a given xfeature.
@@ -122,17 +126,14 @@ int cpu_has_xfeatures(u64 xfeatures_needed, const char **feature_name)
 }
 EXPORT_SYMBOL_GPL(cpu_has_xfeatures);
 
-static bool xfeature_is_supervisor(int xfeature_nr)
+static bool xfeature_is_aligned64(int xfeature_nr)
 {
-	/*
-	 * Extended State Enumeration Sub-leaves (EAX = 0DH, ECX = n, n > 1)
-	 * returns ECX[0] set to (1) for a supervisor state, and cleared (0)
-	 * for a user state.
-	 */
-	u32 eax, ebx, ecx, edx;
+	return xstate_flags[xfeature_nr] & XSTATE_FLAG_ALIGNED64;
+}
 
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
-	return ecx & 1;
+static bool xfeature_is_supervisor(int xfeature_nr)
+{
+	return xstate_flags[xfeature_nr] & XSTATE_FLAG_SUPERVISOR;
 }
 
 /*
@@ -203,6 +204,7 @@ static void __init setup_xstate_cache(void)
 		cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx);
 
 		xstate_sizes[i] = eax;
+		xstate_flags[i] = ecx;
 
 		/*
 		 * If an xfeature is supervisor state, the offset in EBX is
@@ -262,31 +264,6 @@ static void __init print_xstate_features(void)
 } while (0)
 
 /*
- * We could cache this like xstate_size[], but we only use
- * it here, so it would be a waste of space.
- */
-static int xfeature_is_aligned(int xfeature_nr)
-{
-	u32 eax, ebx, ecx, edx;
-
-	CHECK_XFEATURE(xfeature_nr);
-
-	if (!xfeature_enabled(xfeature_nr)) {
-		WARN_ONCE(1, "Checking alignment of disabled xfeature %d\n",
-			  xfeature_nr);
-		return 0;
-	}
-
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
-	/*
-	 * The value returned by ECX[1] indicates the alignment
-	 * of state component 'i' when the compacted format
-	 * of the extended region of an XSAVE area is used:
-	 */
-	return !!(ecx & 2);
-}
-
-/*
  * This function sets up offsets and sizes of all extended states in
  * xsave area. This supports both standard format and compacted format
  * of the xsave area.
@@ -314,7 +291,7 @@ static void __init setup_xstate_comp_offsets(void)
 	next_offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
 
 	for_each_extended_xfeature(i, fpu_kernel_cfg.max_features) {
-		if (xfeature_is_aligned(i))
+		if (xfeature_is_aligned64(i))
 			next_offset = ALIGN(next_offset, 64);
 
 		xstate_comp_offsets[i] = next_offset;
@@ -619,7 +596,7 @@ static unsigned int xstate_calculate_size(u64 xfeatures, bool compacted)
 
 	for_each_extended_xfeature(i, xfeatures) {
 		/* Align from the end of the previous feature */
-		if (xfeature_is_aligned(i))
+		if (xfeature_is_aligned64(i))
 			size = ALIGN(size, 64);
 		/*
 		 * In compacted format the enabled features are packed,

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: x86/urgent] x86/fpu/xsave: Initialize offset/size cache early
  2022-03-24 13:47 ` [patch 3/7] x86/fpu/xsave: Initialize offset/size cache early Thomas Gleixner
@ 2022-03-31  9:00   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2022-03-31  9:00 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     35a77d4503d9d9d0e19e3a2a0d3fc9ab09fb6857
Gitweb:        https://git.kernel.org/tip/35a77d4503d9d9d0e19e3a2a0d3fc9ab09fb6857
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 24 Mar 2022 14:47:11 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 30 Mar 2022 10:55:44 +02:00

x86/fpu/xsave: Initialize offset/size cache early

Reading XSTATE feature information from CPUID over and over does not make
sense. The information has to be cached anyway, so it can be done early.

Prepare for runtime calculation of XSTATE offsets and allow
consolidation of the size calculation functions in a later step.

Rename the function while at it as it does not setup any features.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220324134623.519411939@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index dc33556..814c2fd 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -180,7 +180,7 @@ static bool xfeature_enabled(enum xfeature xfeature)
  * Record the offsets and sizes of various xstates contained
  * in the XSAVE state memory layout.
  */
-static void __init setup_xstate_features(void)
+static void __init setup_xstate_cache(void)
 {
 	u32 eax, ebx, ecx, edx, i;
 	/* start at the beginning of the "extended state" */
@@ -390,7 +390,6 @@ static void __init setup_init_fpu_buf(void)
 	if (!boot_cpu_has(X86_FEATURE_XSAVE))
 		return;
 
-	setup_xstate_features();
 	print_xstate_features();
 
 	xstate_init_xcomp_bv(&init_fpstate.regs.xsave, fpu_kernel_cfg.max_features);
@@ -906,6 +905,10 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
 
 	/* Enable xstate instructions to be able to continue with initialization: */
 	fpu__init_cpu_xstate();
+
+	/* Cache size, offset and flags for initialization */
+	setup_xstate_cache();
+
 	err = init_xstate_size();
 	if (err)
 		goto out_disable;

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: x86/urgent] x86/fpu: Remove unused supervisor only offsets
  2022-03-24 13:47 ` [patch 2/7] x86/fpu: Remove unused supervisor only offsets Thomas Gleixner
@ 2022-03-31  9:00   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2022-03-31  9:00 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     d47f71f6de7970d504748d1a60a11c51af5bce47
Gitweb:        https://git.kernel.org/tip/d47f71f6de7970d504748d1a60a11c51af5bce47
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 24 Mar 2022 14:47:09 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 30 Mar 2022 10:44:51 +02:00

x86/fpu: Remove unused supervisor only offsets

No users.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220324134623.465066249@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 30 ------------------------------
 1 file changed, 30 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index dc6d5e9..dc33556 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -83,8 +83,6 @@ static unsigned int xstate_sizes[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[XFEATURE_MAX] __ro_after_init =
 	{ [ 0 ... XFEATURE_MAX - 1] = -1};
-static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] __ro_after_init =
-	{ [ 0 ... XFEATURE_MAX - 1] = -1};
 
 /*
  * Return whether the system supports a given xfeature.
@@ -325,33 +323,6 @@ static void __init setup_xstate_comp_offsets(void)
 }
 
 /*
- * Setup offsets of a supervisor-state-only XSAVES buffer:
- *
- * The offsets stored in xstate_comp_offsets[] only work for one specific
- * value of the Requested Feature BitMap (RFBM).  In cases where a different
- * RFBM value is used, a different set of offsets is required.  This set of
- * offsets is for when RFBM=xfeatures_mask_supervisor().
- */
-static void __init setup_supervisor_only_offsets(void)
-{
-	unsigned int next_offset;
-	int i;
-
-	next_offset = FXSAVE_SIZE + XSAVE_HDR_SIZE;
-
-	for_each_extended_xfeature(i, fpu_kernel_cfg.max_features) {
-		if (!xfeature_is_supervisor(i))
-			continue;
-
-		if (xfeature_is_aligned(i))
-			next_offset = ALIGN(next_offset, 64);
-
-		xstate_supervisor_only_offsets[i] = next_offset;
-		next_offset += xstate_sizes[i];
-	}
-}
-
-/*
  * Print out xstate component offsets and sizes
  */
 static void __init print_xstate_offset_size(void)
@@ -951,7 +922,6 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
 
 	setup_init_fpu_buf();
 	setup_xstate_comp_offsets();
-	setup_supervisor_only_offsets();
 
 	/*
 	 * Paranoia check whether something in the setup modified the

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [tip: x86/urgent] x86/fpu: Remove redundant XCOMP_BV initialization
  2022-03-24 13:47 ` [patch 1/7] x86/fpu: Remove redundant XCOMP_BV initialization Thomas Gleixner
@ 2022-03-31  9:01   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 18+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2022-03-31  9:01 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Thomas Gleixner, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     a9f84fb7158fea60cbcadef5c0166fb22b469091
Gitweb:        https://git.kernel.org/tip/a9f84fb7158fea60cbcadef5c0166fb22b469091
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Thu, 24 Mar 2022 14:47:08 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Tue, 29 Mar 2022 20:57:18 +02:00

x86/fpu: Remove redundant XCOMP_BV initialization

fpu_copy_uabi_to_guest_fpstate() initializes the XCOMP_BV field in the
XSAVE header. That's a leftover from the old KVM FPU buffer handling code.

Since

  d69c1382e1b7 ("x86/kvm: Convert FPU handling to a single swap buffer")

KVM uses the FPU core allocation code, which initializes the XCOMP_BV
field already.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220324134623.408932232@linutronix.de
---
 arch/x86/kernel/fpu/core.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 19821f0..c049561 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -415,9 +415,6 @@ int fpu_copy_uabi_to_guest_fpstate(struct fpu_guest *gfpu, const void *buf,
 		xpkru = get_xsave_addr(&kstate->regs.xsave, XFEATURE_PKRU);
 		*vpkru = xpkru->pkru;
 	}
-
-	/* Ensure that XCOMP_BV is set up for XSAVES */
-	xstate_init_xcomp_bv(&kstate->regs.xsave, kstate->xfeatures);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(fpu_copy_uabi_to_guest_fpstate);

^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-03-31  9:01 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-24 13:47 [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Thomas Gleixner
2022-03-24 13:47 ` [patch 1/7] x86/fpu: Remove redundant XCOMP_BV initialization Thomas Gleixner
2022-03-31  9:01   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2022-03-24 13:47 ` [patch 2/7] x86/fpu: Remove unused supervisor only offsets Thomas Gleixner
2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2022-03-24 13:47 ` [patch 3/7] x86/fpu/xsave: Initialize offset/size cache early Thomas Gleixner
2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2022-03-24 13:47 ` [patch 4/7] x86/fpu: Cache xfeature flags from CPUID Thomas Gleixner
2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2022-03-24 13:47 ` [patch 5/7] x86/fpu/xsave: Handle compacted offsets correctly with supervisor states Thomas Gleixner
2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2022-03-24 13:47 ` [patch 6/7] x86/fpu/xstate: Handle supervisor states in XSTATE permissions Thomas Gleixner
2022-03-31  9:00   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2022-03-24 13:47 ` [patch 7/7] x86/fpu/xstate: Consolidate size calculations Thomas Gleixner
2022-03-28 18:43   ` [patch V2 " Thomas Gleixner
2022-03-31  9:00     ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2022-03-28 12:39 ` [patch 0/7] x86/fpu: Cure supervisor mode (ENQCMD) fallout Chang S. Bae
2022-03-28 22:30 ` Fenghua Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).