linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
@ 2016-03-04 18:12 Yu-cheng Yu
  2016-03-04 18:12 ` [PATCH v4 01/10] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Yu-cheng Yu
                   ` (10 more replies)
  0 siblings, 11 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

XSAVES is a kernel-mode instruction. It offers a compacted format and
memory-write optimization. These patches fix known issues in the first
implementation. They are intended for discussion and getting feedback
before actually getting applied.

Version 4 adds a new patch (#9) to fix missing XSAVE legacy region
offset/size values.

Patch 1, 2, and 4 are for converting between kernel-mode xstate area and
signal frames.

Patch 3 fixes optimization issues introduced by XSAVES to the buffer
init_fpstate.

Patch 5 and 6 are related to xstate component offsets.

Patch 7 is for converting between kernel-mode xstate area and ptrace
frames.

Patch 8 fixes xstate area print out.

Patch 9 fixes missing offset/size values for XSAVE legacy region. 

Patch 10 re-enables XSAVES.

Yu-cheng Yu (10):
  x86/xsaves: Define and use user_xstate_size for xstate size in signal
    context
  x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly
    distinguish xstate size in kernel from user space
  x86/xsaves: Keep init_fpstate.xsave.header.xfeatures as zero for init
    optimization
  x86/xsaves: Introduce a new check that allows correct xstates copy
    from kernel to user directly
  x86/xsaves: Align xstate components according to CPUID
  x86/xsaves: Supervisor state component offset
  x86/xsaves: Fix PTRACE frames for XSAVES
  x86/xsaves: Fix XSTATE component offset print out
  x86/xsaves: Fix xstate_offsets, xstate_sizes for non-extended states
  x86/xsaves: Re-enable XSAVES

 arch/x86/include/asm/fpu/types.h  |   2 +
 arch/x86/include/asm/fpu/xstate.h |   8 +-
 arch/x86/include/asm/processor.h  |   3 +-
 arch/x86/kernel/fpu/core.c        |   6 +-
 arch/x86/kernel/fpu/init.c        |  32 +--
 arch/x86/kernel/fpu/regset.c      |  56 ++++--
 arch/x86/kernel/fpu/signal.c      |  69 ++++++-
 arch/x86/kernel/fpu/xstate.c      | 411 ++++++++++++++++++++++++++++++--------
 8 files changed, 449 insertions(+), 138 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v4 01/10] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-03-04 18:12 ` [PATCH v4 02/10] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Yu-cheng Yu
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

If "xsaves" is enabled, kernel always uses compacted format of xsave area.
But user space still uses standard format of xsave area. Thus, xstate size
in kernel's xsave area is smaller than xstate size in user's xsave area.
The xstate in signal frame should be in standard format for user's signal
handler to access.

In no "xsaves" case, xsave area in both user space and kernel space are in
standard format. Therefore, user's and kernel's xstate sizes are equal.

In "xsaves" case, xsave area in user space is in standard format while
xsave area in kernel space is in compacted format. Therefore, kernel's
xstate size is smaller than user's xstate size.

So here is the problem: currently kernel assumes its own xstate size is
signal frame's xstate size. This is not a problem in no "xsaves" case.
It is an issue in "xsaves" case because kernel's xstate size is smaller
than user's xstate size. In fpu__alloc_mathframe(), a smaller fpstate
buffer is allocated for the standard format xstate in signal frame.
Then kernel saves only part of xstate registers into this smaller
user's fpstate buffer and user will see part of the xstate registers in
signal context. Similar issue happens after returning from signal handler:
kernel will only restore part of xstate registers from user's fpstate
buffer in signal frame.

This patch defines and uses user_xstate_size for xstate size in signal
frame. It's read from returned value in ebx from CPUID leaf 0x0D subleaf
0x0. This is maximum size required by enabled states in XCR0 and may be
different from ecx when states at the end of the xsave area are not
enabled. This value indicates the size required for XSAVE to save all
supported user states in legacy/standard format.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
 arch/x86/include/asm/fpu/xstate.h |  1 -
 arch/x86/include/asm/processor.h  |  1 +
 arch/x86/kernel/fpu/init.c        |  5 ++-
 arch/x86/kernel/fpu/signal.c      | 26 ++++++++++----
 arch/x86/kernel/fpu/xstate.c      | 71 ++++++++++++++++++++++++---------------
 5 files changed, 67 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index af30fde..c6667f2 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -39,7 +39,6 @@
 #define REX_PREFIX
 #endif
 
-extern unsigned int xstate_size;
 extern u64 xfeatures_mask;
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 20c11d1..01e127e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -366,6 +366,7 @@ DECLARE_PER_CPU(struct irq_stack *, softirq_stack);
 #endif	/* X86_64 */
 
 extern unsigned int xstate_size;
+extern unsigned int user_xstate_size;
 
 struct perf_event;
 
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 6d9f0a7..4ac2561 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -193,7 +193,7 @@ static void __init fpu__init_task_struct_size(void)
 }
 
 /*
- * Set up the xstate_size based on the legacy FPU context size.
+ * Set up the user and kernel xstate_size based on the legacy FPU context size.
  *
  * We set this up first, and later it will be overwritten by
  * fpu__init_system_xstate() if the CPU knows about xstates.
@@ -224,6 +224,9 @@ static void __init fpu__init_system_xstate_size_legacy(void)
 		else
 			xstate_size = sizeof(struct fregs_state);
 	}
+
+	user_xstate_size = xstate_size;
+
 	/*
 	 * Quirk: we don't yet handle the XSAVES* instructions
 	 * correctly, as we don't correctly convert between
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 31c6a60..ee6d662 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -31,7 +31,7 @@ static inline int check_for_xstate(struct fxregs_state __user *buf,
 	/* Check for the first magic field and other error scenarios. */
 	if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
 	    fx_sw->xstate_size < min_xstate_size ||
-	    fx_sw->xstate_size > xstate_size ||
+	    fx_sw->xstate_size > user_xstate_size ||
 	    fx_sw->xstate_size > fx_sw->extended_size)
 		return -1;
 
@@ -88,7 +88,7 @@ static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
 	if (!use_xsave())
 		return err;
 
-	err |= __put_user(FP_XSTATE_MAGIC2, (__u32 *)(buf + xstate_size));
+	err |= __put_user(FP_XSTATE_MAGIC2, (__u32 *)(buf + user_xstate_size));
 
 	/*
 	 * Read the xfeatures which we copied (directly from the cpu or
@@ -125,7 +125,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
 	else
 		err = copy_fregs_to_user((struct fregs_state __user *) buf);
 
-	if (unlikely(err) && __clear_user(buf, xstate_size))
+	if (unlikely(err) && __clear_user(buf, user_xstate_size))
 		err = -EFAULT;
 	return err;
 }
@@ -175,8 +175,19 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 		if (ia32_fxstate)
 			copy_fxregs_to_kernel(&tsk->thread.fpu);
 	} else {
+		/*
+		 * It is a *bug* if kernel uses compacted-format for xsave
+		 * area and we copy it out directly to a signal frame. It
+		 * should have been handled above by saving the registers
+		 * directly.
+		 */
+		if (boot_cpu_has(X86_FEATURE_XSAVES)) {
+			WARN_ONCE(1, "x86/fpu: saving compacted-format xsave area to a signal frame!\n");
+			return -1;
+		}
+
 		fpstate_sanitize_xstate(&tsk->thread.fpu);
-		if (__copy_to_user(buf_fx, xsave, xstate_size))
+		if (__copy_to_user(buf_fx, xsave, user_xstate_size))
 			return -1;
 	}
 
@@ -341,7 +352,8 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 
 static inline int xstate_sigframe_size(void)
 {
-	return use_xsave() ? xstate_size + FP_XSTATE_MAGIC2_SIZE : xstate_size;
+	return use_xsave() ? user_xstate_size + FP_XSTATE_MAGIC2_SIZE :
+			user_xstate_size;
 }
 
 /*
@@ -385,12 +397,12 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
  */
 void fpu__init_prepare_fx_sw_frame(void)
 {
-	int size = xstate_size + FP_XSTATE_MAGIC2_SIZE;
+	int size = user_xstate_size + FP_XSTATE_MAGIC2_SIZE;
 
 	fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
 	fx_sw_reserved.extended_size = size;
 	fx_sw_reserved.xfeatures = xfeatures_mask;
-	fx_sw_reserved.xstate_size = xstate_size;
+	fx_sw_reserved.xstate_size = user_xstate_size;
 
 	if (config_enabled(CONFIG_IA32_EMULATION) ||
 	    config_enabled(CONFIG_X86_32)) {
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index d425cda5..9d41c0f 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -35,6 +35,8 @@ static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] =
 static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[sizeof(xfeatures_mask)*8];
 
+unsigned int user_xstate_size;
+
 /*
  * Clear all of the X86_FEATURE_* bits that are unavailable
  * when the CPU has no XSAVE support.
@@ -159,7 +161,7 @@ void fpstate_sanitize_xstate(struct fpu *fpu)
 	 */
 	while (xfeatures) {
 		if (xfeatures & 0x1) {
-			int offset = xstate_offsets[feature_bit];
+			int offset = xstate_comp_offsets[feature_bit];
 			int size = xstate_sizes[feature_bit];
 
 			memcpy((void *)fx + offset,
@@ -518,8 +520,9 @@ static void do_extra_xstate_size_checks(void)
 	XSTATE_WARN_ON(paranoid_xstate_size != xstate_size);
 }
 
+
 /*
- * Calculate total size of enabled xstates in XCR0/xfeatures_mask.
+ * Get total size of enabled xstates in XCR0/xfeatures_mask.
  *
  * Note the SDM's wording here.  "sub-function 0" only enumerates
  * the size of the *user* states.  If we use it to size a buffer
@@ -529,34 +532,33 @@ static void do_extra_xstate_size_checks(void)
  * Note that we do not currently set any bits on IA32_XSS so
  * 'XCR0 | IA32_XSS == XCR0' for now.
  */
-static unsigned int __init calculate_xstate_size(void)
+static unsigned int __init get_xsaves_size(void)
 {
 	unsigned int eax, ebx, ecx, edx;
-	unsigned int calculated_xstate_size;
+	/*
+	 * - CPUID function 0DH, sub-function 1:
+	 *    EBX enumerates the size (in bytes) required by
+	 *    the XSAVES instruction for an XSAVE area
+	 *    containing all the state components
+	 *    corresponding to bits currently set in
+	 *    XCR0 | IA32_XSS.
+	 */
+	cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
+	return ebx;
+}
 
-	if (!cpu_has_xsaves) {
-		/*
-		 * - CPUID function 0DH, sub-function 0:
-		 *    EBX enumerates the size (in bytes) required by
-		 *    the XSAVE instruction for an XSAVE area
-		 *    containing all the *user* state components
-		 *    corresponding to bits currently set in XCR0.
-		 */
-		cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
-		calculated_xstate_size = ebx;
-	} else {
-		/*
-		 * - CPUID function 0DH, sub-function 1:
-		 *    EBX enumerates the size (in bytes) required by
-		 *    the XSAVES instruction for an XSAVE area
-		 *    containing all the state components
-		 *    corresponding to bits currently set in
-		 *    XCR0 | IA32_XSS.
-		 */
-		cpuid_count(XSTATE_CPUID, 1, &eax, &ebx, &ecx, &edx);
-		calculated_xstate_size = ebx;
-	}
-	return calculated_xstate_size;
+static unsigned int __init get_xsave_size(void)
+{
+	unsigned int eax, ebx, ecx, edx;
+	/*
+	 * - CPUID function 0DH, sub-function 0:
+	 *    EBX enumerates the size (in bytes) required by
+	 *    the XSAVE instruction for an XSAVE area
+	 *    containing all the *user* state components
+	 *    corresponding to bits currently set in XCR0.
+	 */
+	cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
+	return ebx;
 }
 
 /*
@@ -576,7 +578,15 @@ static bool is_supported_xstate_size(unsigned int test_xstate_size)
 static int init_xstate_size(void)
 {
 	/* Recompute the context size for enabled features: */
-	unsigned int possible_xstate_size = calculate_xstate_size();
+	unsigned int possible_xstate_size;
+	unsigned int xsave_size;
+
+	xsave_size = get_xsave_size();
+
+	if (cpu_has_xsaves)
+		possible_xstate_size = get_xsaves_size();
+	else
+		possible_xstate_size = xsave_size;
 
 	/* Ensure we have the space to store all enabled: */
 	if (!is_supported_xstate_size(possible_xstate_size))
@@ -588,6 +598,11 @@ static int init_xstate_size(void)
 	 */
 	xstate_size = possible_xstate_size;
 	do_extra_xstate_size_checks();
+
+	/*
+	 * User space is always in standard format.
+	 */
+	user_xstate_size = xsave_size;
 	return 0;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 02/10] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
  2016-03-04 18:12 ` [PATCH v4 01/10] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-03-04 18:12 ` [PATCH v4 03/10] x86/xsaves: Keep init_fpstate.xsave.header.xfeatures as zero for init optimization Yu-cheng Yu
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

User space uses standard format xsave area. fpstate in signal frame should
have standard format size.

To explicitly distinguish between xstate size in kernel space and the one
in user space, we rename xstate_size to kernel_xstate_size. This patch is
not fixing a bug. It just makes kernel code more clear.

So we define the xsave area sizes in two global variables:

kernel_xstate_size (previous xstate_size): the xsave area size used in
xsave area allocated in kernel
user_xstate_size: the xsave area size used in xsave area used by user.

In no "xsaves" case, xsave area in both user space and kernel space are in
standard format. Therefore, kernel_xstate_size and user_xstate_size are
equal.

In "xsaves" case, xsave area in user space is in standard format while
xsave area in kernel space is in compact format. Therefore, kernel's
xstate size is less than user's xstate size.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
 arch/x86/include/asm/processor.h |  2 +-
 arch/x86/kernel/fpu/core.c       |  6 +++---
 arch/x86/kernel/fpu/init.c       | 18 +++++++++---------
 arch/x86/kernel/fpu/signal.c     |  2 +-
 arch/x86/kernel/fpu/xstate.c     |  8 ++++----
 5 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 01e127e..68a5fa4 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -365,7 +365,7 @@ DECLARE_PER_CPU(struct irq_stack *, hardirq_stack);
 DECLARE_PER_CPU(struct irq_stack *, softirq_stack);
 #endif	/* X86_64 */
 
-extern unsigned int xstate_size;
+extern unsigned int kernel_xstate_size;
 extern unsigned int user_xstate_size;
 
 struct perf_event;
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d25097c..faa00c0 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -214,7 +214,7 @@ void fpstate_init(union fpregs_state *state)
 		return;
 	}
 
-	memset(state, 0, xstate_size);
+	memset(state, 0, kernel_xstate_size);
 
 	if (cpu_has_fxsr)
 		fpstate_init_fxstate(&state->fxsave);
@@ -238,7 +238,7 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	 * leak into the child task:
 	 */
 	if (use_eager_fpu())
-		memset(&dst_fpu->state.xsave, 0, xstate_size);
+		memset(&dst_fpu->state.xsave, 0, kernel_xstate_size);
 
 	/*
 	 * Save current FPU registers directly into the child
@@ -258,7 +258,7 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	 */
 	preempt_disable();
 	if (!copy_fpregs_to_fpstate(dst_fpu)) {
-		memcpy(&src_fpu->state, &dst_fpu->state, xstate_size);
+		memcpy(&src_fpu->state, &dst_fpu->state, kernel_xstate_size);
 		fpregs_deactivate(src_fpu);
 	}
 	preempt_enable();
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 4ac2561..b5952d5 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -143,8 +143,8 @@ static void __init fpu__init_system_generic(void)
  * This is inherent to the XSAVE architecture which puts all state
  * components into a single, continuous memory block:
  */
-unsigned int xstate_size;
-EXPORT_SYMBOL_GPL(xstate_size);
+unsigned int kernel_xstate_size;
+EXPORT_SYMBOL_GPL(kernel_xstate_size);
 
 /* Get alignment of the TYPE. */
 #define TYPE_ALIGN(TYPE) offsetof(struct { char x; TYPE test; }, test)
@@ -176,7 +176,7 @@ static void __init fpu__init_task_struct_size(void)
 	 * Add back the dynamically-calculated register state
 	 * size.
 	 */
-	task_size += xstate_size;
+	task_size += kernel_xstate_size;
 
 	/*
 	 * We dynamically size 'struct fpu', so we require that
@@ -193,7 +193,7 @@ static void __init fpu__init_task_struct_size(void)
 }
 
 /*
- * Set up the user and kernel xstate_size based on the legacy FPU context size.
+ * Set up the user and kernel xstate sizes based on the legacy FPU context size.
  *
  * We set this up first, and later it will be overwritten by
  * fpu__init_system_xstate() if the CPU knows about xstates.
@@ -206,7 +206,7 @@ static void __init fpu__init_system_xstate_size_legacy(void)
 	on_boot_cpu = 0;
 
 	/*
-	 * Note that xstate_size might be overwriten later during
+	 * Note that xstate sizes might be overwriten later during
 	 * fpu__init_system_xstate().
 	 */
 
@@ -217,15 +217,15 @@ static void __init fpu__init_system_xstate_size_legacy(void)
 		 */
 		setup_clear_cpu_cap(X86_FEATURE_XSAVE);
 		setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT);
-		xstate_size = sizeof(struct swregs_state);
+		kernel_xstate_size = sizeof(struct swregs_state);
 	} else {
 		if (cpu_has_fxsr)
-			xstate_size = sizeof(struct fxregs_state);
+			kernel_xstate_size = sizeof(struct fxregs_state);
 		else
-			xstate_size = sizeof(struct fregs_state);
+			kernel_xstate_size = sizeof(struct fregs_state);
 	}
 
-	user_xstate_size = xstate_size;
+	user_xstate_size = kernel_xstate_size;
 
 	/*
 	 * Quirk: we don't yet handle the XSAVES* instructions
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index ee6d662..0fbf60c 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -261,7 +261,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	int ia32_fxstate = (buf != buf_fx);
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
-	int state_size = xstate_size;
+	int state_size = kernel_xstate_size;
 	u64 xfeatures = 0;
 	int fx_only = 0;
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 9d41c0f..b8d6d98 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -517,7 +517,7 @@ static void do_extra_xstate_size_checks(void)
 		 */
 		paranoid_xstate_size += xfeature_size(i);
 	}
-	XSTATE_WARN_ON(paranoid_xstate_size != xstate_size);
+	XSTATE_WARN_ON(paranoid_xstate_size != kernel_xstate_size);
 }
 
 
@@ -596,7 +596,7 @@ static int init_xstate_size(void)
 	 * The size is OK, we are definitely going to use xsave,
 	 * make it known to the world that we need more space.
 	 */
-	xstate_size = possible_xstate_size;
+	kernel_xstate_size = possible_xstate_size;
 	do_extra_xstate_size_checks();
 
 	/*
@@ -659,14 +659,14 @@ void __init fpu__init_system_xstate(void)
 		return;
 	}
 
-	update_regset_xstate_info(xstate_size, xfeatures_mask);
+	update_regset_xstate_info(kernel_xstate_size, xfeatures_mask);
 	fpu__init_prepare_fx_sw_frame();
 	setup_init_fpu_buf();
 	setup_xstate_comp();
 
 	pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n",
 		xfeatures_mask,
-		xstate_size,
+		kernel_xstate_size,
 		cpu_has_xsaves ? "compacted" : "standard");
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 03/10] x86/xsaves: Keep init_fpstate.xsave.header.xfeatures as zero for init optimization
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
  2016-03-04 18:12 ` [PATCH v4 01/10] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Yu-cheng Yu
  2016-03-04 18:12 ` [PATCH v4 02/10] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-03-04 18:12 ` [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly Yu-cheng Yu
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

Keep init_fpstate.xsave.header.xfeatures as zero for init optimization.
This is important for init optimization that is implemented in processor.
If a bit corresponding to an xstate in xstate_bv is 0, it means the
xstate is in init status and will not be read from memory to the processor
during XRSTOR/XRSTORS instruction. This largely impacts context switch
performance.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
 arch/x86/kernel/fpu/xstate.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index b8d6d98..25e11a1 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -311,13 +311,11 @@ static void __init setup_init_fpu_buf(void)
 	setup_xstate_features();
 	print_xstate_features();
 
-	if (cpu_has_xsaves) {
+	if (cpu_has_xsaves)
 		init_fpstate.xsave.header.xcomp_bv = (u64)1 << 63 | xfeatures_mask;
-		init_fpstate.xsave.header.xfeatures = xfeatures_mask;
-	}
 
 	/*
-	 * Init all the features state with header_bv being 0x0
+	 * Init all the features state with header.xfeatures being 0x0
 	 */
 	copy_kernel_to_xregs_booting(&init_fpstate.xsave);
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
                   ` (2 preceding siblings ...)
  2016-03-04 18:12 ` [PATCH v4 03/10] x86/xsaves: Keep init_fpstate.xsave.header.xfeatures as zero for init optimization Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-04-29 20:09   ` Dave Hansen
  2016-03-04 18:12 ` [PATCH v4 05/10] x86/xsaves: Align xstate components according to CPUID Yu-cheng Yu
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

XSAVES is a kernel instruction and uses a compacted format. When
working with user space, the kernel should provide standard-format,
non-supervisor state data. We cannot do __copy_to_user() from a compacted-
format kernel xstate area to a signal frame.

Note that the path to copy_fpstate_to_sigframe() does currently check if
the thread has used FPU, but add a WARN_ONCE() there to detect any
potential mis-use.

Dave Hansen proposes this method to simplify copy xstate directly to user.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/kernel/fpu/signal.c | 41 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 0fbf60c..09945f1 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -130,6 +130,45 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
 	return err;
 }
 
+static int may_copy_fpregs_to_sigframe(void)
+{
+	/*
+	 * In signal handling path, the kernel already checks if
+	 * FPU instructions have been used before it calls
+	 * copy_fpstate_to_sigframe(). We check this here again
+	 * to detect any potential mis-use and saving invalid
+	 * register values directly to a signal frame.
+	 */
+	WARN_ONCE(!current->thread.fpu.fpstate_active,
+		  "direct FPU save with no math use\n");
+
+	/*
+	 * In the case that we are using a compacted kernel
+	 * xsave area, we can not copy the thread.fpu.state
+	 * directly to userspace and *must* save it from the
+	 * registers directly.
+	 */
+	if (boot_cpu_has(X86_FEATURE_XSAVES))
+		return 1;
+
+	/*
+	 * fpregs_active() means "Can I use the FPU hardware
+	 * without taking a device-not-available exception?" This
+	 * means that saving the registers directly will be
+	 * cheaper than copying their contents out of
+	 * thread.fpu.state.
+	 *
+	 * Note that fpregs_active() is inherently racy and may
+	 * become false at any time.  If this race happens, we
+	 * will take a harmless device-not-available exception
+	 * when we attempt the FPU save instruction.
+	 */
+	if (fpregs_active())
+		return 1;
+
+	return 0;
+}
+
 /*
  * Save the fpu, extended register state to the user signal frame.
  *
@@ -167,7 +206,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	if (fpregs_active()) {
+	if (may_copy_fpregs_to_sigframe()) {
 		/* Save the live register state to the user directly. */
 		if (copy_fpregs_to_sigframe(buf_fx))
 			return -1;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 05/10] x86/xsaves: Align xstate components according to CPUID
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
                   ` (3 preceding siblings ...)
  2016-03-04 18:12 ` [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-04-29 20:11   ` Dave Hansen
  2016-03-04 18:12 ` [PATCH v4 06/10] x86/xsaves: Supervisor state component offset Yu-cheng Yu
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

CPUID function 0x0d, sub function (i, i > 1) returns in ecx[1] the
alignment requirement of component i when the compacted format is used.

If ecx[1] is 0, component i is located immediately following the preceding
component. If ecx[1] is 1, component i is located on the next 64-byte
boundary following the preceding component.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/kernel/fpu/xstate.c | 60 +++++++++++++++++++++++---------------------
 1 file changed, 32 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 25e11a1..6e42b87 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -252,6 +252,33 @@ static void __init print_xstate_features(void)
 }
 
 /*
+ * This check is important because it is easy to get XSTATE_*
+ * confused with XSTATE_BIT_*.
+ */
+#define CHECK_XFEATURE(nr) do {		\
+	WARN_ON(nr < FIRST_EXTENDED_XFEATURE);	\
+	WARN_ON(nr >= XFEATURE_MAX);	\
+} while (0)
+
+/*
+ * We could cache this like xstate_size[], but we only use
+ * it here, so it would be a waste of space.
+ */
+static int xfeature_is_aligned(int xfeature_nr)
+{
+	u32 eax, ebx, ecx, edx;
+
+	CHECK_XFEATURE(xfeature_nr);
+	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
+	/*
+	 * The value returned by ECX[1] indicates the alignment
+	 * of state component i when the compacted format
+	 * of the extended region of an XSAVE area is used
+	 */
+	return !!(ecx & 2);
+}
+
+/*
  * This function sets up offsets and sizes of all extended states in
  * xsave area. This supports both standard format and compacted format
  * of the xsave aread.
@@ -288,10 +315,14 @@ static void __init setup_xstate_comp(void)
 		else
 			xstate_comp_sizes[i] = 0;
 
-		if (i > FIRST_EXTENDED_XFEATURE)
+		if (i > FIRST_EXTENDED_XFEATURE) {
 			xstate_comp_offsets[i] = xstate_comp_offsets[i-1]
 					+ xstate_comp_sizes[i-1];
 
+			if (xfeature_is_aligned(i))
+				xstate_comp_offsets[i] =
+					ALIGN(xstate_comp_offsets[i], 64);
+		}
 	}
 }
 
@@ -348,33 +379,6 @@ static int xfeature_is_user(int xfeature_nr)
 }
 */
 
-/*
- * This check is important because it is easy to get XSTATE_*
- * confused with XSTATE_BIT_*.
- */
-#define CHECK_XFEATURE(nr) do {		\
-	WARN_ON(nr < FIRST_EXTENDED_XFEATURE);	\
-	WARN_ON(nr >= XFEATURE_MAX);	\
-} while (0)
-
-/*
- * We could cache this like xstate_size[], but we only use
- * it here, so it would be a waste of space.
- */
-static int xfeature_is_aligned(int xfeature_nr)
-{
-	u32 eax, ebx, ecx, edx;
-
-	CHECK_XFEATURE(xfeature_nr);
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
-	/*
-	 * The value returned by ECX[1] indicates the alignment
-	 * of state component i when the compacted format
-	 * of the extended region of an XSAVE area is used
-	 */
-	return !!(ecx & 2);
-}
-
 static int xfeature_uncompacted_offset(int xfeature_nr)
 {
 	u32 eax, ebx, ecx, edx;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 06/10] x86/xsaves: Supervisor state component offset
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
                   ` (4 preceding siblings ...)
  2016-03-04 18:12 ` [PATCH v4 05/10] x86/xsaves: Align xstate components according to CPUID Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-03-04 18:12 ` [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES Yu-cheng Yu
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

CPUID function 0x0d, sub function (i, i > 1) returns in ebx the offset of
xstate component i. Zero is returned for a supervisor state. A supervisor
state can only be saved by XSAVES and XSAVES uses a compacted format.
There is no fixed offset for a supervisor state. This patch checks and
makes sure a supervisor state offset is not recorded or mis-used. This has
no effect in practice as we currently use no supervisor states, but it
would be good to fix.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/fpu/types.h  |  2 ++
 arch/x86/include/asm/fpu/xstate.h |  3 ++
 arch/x86/kernel/fpu/xstate.c      | 62 ++++++++++++++++++++++++---------------
 3 files changed, 44 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 1c6f6ac..11466cf 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -108,6 +108,7 @@ enum xfeature {
 	XFEATURE_OPMASK,
 	XFEATURE_ZMM_Hi256,
 	XFEATURE_Hi16_ZMM,
+	XFEATURE_PT,
 
 	XFEATURE_MAX,
 };
@@ -120,6 +121,7 @@ enum xfeature {
 #define XFEATURE_MASK_OPMASK		(1 << XFEATURE_OPMASK)
 #define XFEATURE_MASK_ZMM_Hi256		(1 << XFEATURE_ZMM_Hi256)
 #define XFEATURE_MASK_Hi16_ZMM		(1 << XFEATURE_Hi16_ZMM)
+#define XFEATURE_MASK_PT		(1 << XFEATURE_PT)
 
 #define XFEATURE_MASK_FPSSE		(XFEATURE_MASK_FP | XFEATURE_MASK_SSE)
 #define XFEATURE_MASK_AVX512		(XFEATURE_MASK_OPMASK \
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index c6667f2..b4f5d94 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -18,6 +18,9 @@
 #define XSAVE_YMM_SIZE	    256
 #define XSAVE_YMM_OFFSET    (XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)
 
+/* Supervisor features */
+#define XFEATURE_MASK_SUPERVISOR (XFEATURE_MASK_PT)
+
 /* Supported features which support lazy state saving */
 #define XFEATURE_MASK_LAZY	(XFEATURE_MASK_FP | \
 				 XFEATURE_MASK_SSE)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 6e42b87..aaab0d3 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -95,6 +95,27 @@ int cpu_has_xfeatures(u64 xfeatures_needed, const char **feature_name)
 }
 EXPORT_SYMBOL_GPL(cpu_has_xfeatures);
 
+static int xfeature_is_supervisor(int xfeature_nr)
+{
+	/*
+	 * We currently do not support supervisor states, but if
+	 * we did, we could find out like this.
+	 *
+	 * SDM says: If state component i is a user state component,
+	 * ECX[0] return 0; if state component i is a supervisor
+	 * state component, ECX[0] returns 1.
+	 */
+	u32 eax, ebx, ecx, edx;
+
+	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
+	return !!(ecx & 1);
+}
+
+static int xfeature_is_user(int xfeature_nr)
+{
+	return !xfeature_is_supervisor(xfeature_nr);
+}
+
 /*
  * When executing XSAVEOPT (or other optimized XSAVE instructions), if
  * a processor implementation detects that an FPU state component is still
@@ -213,7 +234,14 @@ static void __init setup_xstate_features(void)
 			continue;
 
 		cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx);
-		xstate_offsets[i] = ebx;
+
+		/*
+		 * If an xfeature is supervisor state, the offset
+		 * in ebx is invalid. We leave it to -1.
+		 */
+		if (xfeature_is_user(i))
+			xstate_offsets[i] = ebx;
+
 		xstate_sizes[i] = eax;
 		/*
 		 * In our xstate size checks, we assume that the
@@ -357,32 +385,20 @@ static void __init setup_init_fpu_buf(void)
 	copy_xregs_to_kernel_booting(&init_fpstate.xsave);
 }
 
-static int xfeature_is_supervisor(int xfeature_nr)
-{
-	/*
-	 * We currently do not support supervisor states, but if
-	 * we did, we could find out like this.
-	 *
-	 * SDM says: If state component i is a user state component,
-	 * ECX[0] return 0; if state component i is a supervisor
-	 * state component, ECX[0] returns 1.
-	u32 eax, ebx, ecx, edx;
-	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx;
-	return !!(ecx & 1);
-	*/
-	return 0;
-}
-/*
-static int xfeature_is_user(int xfeature_nr)
-{
-	return !xfeature_is_supervisor(xfeature_nr);
-}
-*/
-
 static int xfeature_uncompacted_offset(int xfeature_nr)
 {
 	u32 eax, ebx, ecx, edx;
 
+	/*
+	 * Only XSAVES supports supervisor states and it uses compacted
+	 * format. Checking a supervisor state's uncompacted offset is
+	 * an error.
+	 */
+	if (XFEATURE_MASK_SUPERVISOR & (1 << xfeature_nr)) {
+		WARN_ONCE(1, "No fixed offset for xstate %d\n", xfeature_nr);
+		return -1;
+	}
+
 	CHECK_XFEATURE(xfeature_nr);
 	cpuid_count(XSTATE_CPUID, xfeature_nr, &eax, &ebx, &ecx, &edx);
 	return ebx;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
                   ` (5 preceding siblings ...)
  2016-03-04 18:12 ` [PATCH v4 06/10] x86/xsaves: Supervisor state component offset Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-04-29 20:16   ` Dave Hansen
  2016-04-29 20:25   ` Dave Hansen
  2016-03-04 18:12 ` [PATCH v4 08/10] x86/xsaves: Fix XSTATE component offset print out Yu-cheng Yu
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

XSAVES uses compacted format and is a kernel instruction. The kernel
should use standard-format, non-supervisor state data for PTRACE.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/fpu/xstate.h |   4 +
 arch/x86/kernel/fpu/regset.c      |  56 +++++++++---
 arch/x86/kernel/fpu/xstate.c      | 173 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 217 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index b4f5d94..a5cd808 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -50,5 +50,9 @@ extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
 void fpu__xstate_clear_all_cpu_caps(void);
 void *get_xsave_addr(struct xregs_state *xsave, int xstate);
 const void *get_xsave_field_ptr(int xstate_field);
+int copyout_from_xsaves(unsigned int pos, unsigned int count, void *kbuf,
+			void __user *ubuf, const struct xregs_state *xsave);
+int copyin_to_xsaves(const void *kbuf, const void __user *ubuf,
+		     struct xregs_state *xsave);
 
 #endif
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 0bc3490..61fe8e9 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -4,6 +4,7 @@
 #include <asm/fpu/internal.h>
 #include <asm/fpu/signal.h>
 #include <asm/fpu/regset.h>
+#include <asm/fpu/xstate.h>
 
 /*
  * The xstateregs_active() routine is the same as the regset_fpregs_active() routine,
@@ -82,21 +83,30 @@ int xstateregs_get(struct task_struct *target, const struct user_regset *regset,
 	if (!cpu_has_xsave)
 		return -ENODEV;
 
+	xsave = &fpu->state.xsave;
+
 	fpu__activate_fpstate_read(fpu);
 
-	xsave = &fpu->state.xsave;
+	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
+		ret = copyout_from_xsaves(pos, count, kbuf, ubuf, xsave);
+	} else {
+		fpstate_sanitize_xstate(fpu);
+
+		/*
+		 * Copy the 48 bytes defined by the software into the xsave
+		 * area in the thread struct, so that we can copy the whole
+		 * area to user using one user_regset_copyout().
+		 */
+		memcpy(&xsave->i387.sw_reserved,
+			xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes));
+
+		/*
+		 * Copy the xstate memory layout.
+		 */
+		ret = user_regset_copyout(&pos,
+					  &count, &kbuf, &ubuf, xsave, 0, -1);
+	}
 
-	/*
-	 * Copy the 48bytes defined by the software first into the xstate
-	 * memory layout in the thread struct, so that we can copy the entire
-	 * xstateregs to the user using one user_regset_copyout().
-	 */
-	memcpy(&xsave->i387.sw_reserved,
-		xstate_fx_sw_bytes, sizeof(xstate_fx_sw_bytes));
-	/*
-	 * Copy the xstate memory layout.
-	 */
-	ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
 	return ret;
 }
 
@@ -111,11 +121,29 @@ int xstateregs_set(struct task_struct *target, const struct user_regset *regset,
 	if (!cpu_has_xsave)
 		return -ENODEV;
 
-	fpu__activate_fpstate_write(fpu);
+	/*
+	 * A whole standard-format XSAVE buffer is needed.
+	 */
+	if ((pos != 0) || (count < user_xstate_size))
+		return -EFAULT;
 
 	xsave = &fpu->state.xsave;
 
-	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, xsave, 0, -1);
+	fpu__activate_fpstate_write(fpu);
+
+	if (boot_cpu_has(X86_FEATURE_XSAVES))
+		ret = copyin_to_xsaves(kbuf, ubuf, xsave);
+	else
+		ret = user_regset_copyin(&pos,
+					 &count, &kbuf, &ubuf, xsave, 0, -1);
+
+	/*
+	 * In case of failure, mark all states as init.
+	 */
+
+	if (ret)
+		fpstate_init(&fpu->state);
+
 	/*
 	 * mxcsr reserved bits must be masked to zero for security reasons.
 	 */
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index aaab0d3..b9d4d59 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -677,7 +677,13 @@ void __init fpu__init_system_xstate(void)
 		return;
 	}
 
-	update_regset_xstate_info(kernel_xstate_size, xfeatures_mask);
+	/*
+	 * Update info used for ptrace frames; use standard-format size and no
+	 * supervisor xstates.
+	 */
+	update_regset_xstate_info(user_xstate_size,
+		xfeatures_mask & ~XFEATURE_MASK_SUPERVISOR);
+
 	fpu__init_prepare_fx_sw_frame();
 	setup_init_fpu_buf();
 	setup_xstate_comp();
@@ -701,6 +707,15 @@ void fpu__resume_cpu(void)
 }
 
 /*
+ * Get an xstate component address for either kernel or user mode.
+ */
+static void *get_xsave_addr_no_check(const struct xregs_state *xsave,
+				     int feature_nr)
+{
+	return (void *)xsave + xstate_comp_offsets[feature_nr];
+}
+
+/*
  * Given the xsave area and a state inside, this function returns the
  * address of the state.
  *
@@ -748,7 +763,7 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 	if (!(xsave->header.xfeatures & xstate_feature))
 		return NULL;
 
-	return (void *)xsave + xstate_comp_offsets[feature_nr];
+	return get_xsave_addr_no_check(xsave, feature_nr);
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
 
@@ -783,3 +798,157 @@ const void *get_xsave_field_ptr(int xsave_state)
 
 	return get_xsave_addr(&fpu->state.xsave, xsave_state);
 }
+
+/*
+ * This is similar to user_regset_copyout(), but will not add offset to
+ * the source data pointer or increment pos, count, kbuf, and ubuf.
+ */
+static inline int xstate_copyout(unsigned int pos, unsigned int count,
+				 void *kbuf, void __user *ubuf,
+				 const void *data, const int start_pos,
+				 const int end_pos)
+{
+	if ((count == 0) || (pos < start_pos))
+		return 0;
+
+	if (end_pos < 0 || pos < end_pos) {
+		unsigned int copy =
+			(end_pos < 0 ? count : min(count, end_pos - pos));
+
+		if (kbuf)
+			memcpy(kbuf + pos, data, copy);
+		else if (__copy_to_user(ubuf + pos, data, copy))
+			return -EFAULT;
+	}
+	return 0;
+}
+
+/*
+ * Convert from kernel XSAVES compacted format to standard format and copy
+ * to a ptrace buffer. It supports partial copy but pos always starts from
+ * zero. This is called from xstateregs_get() and there we check the cpu
+ * has XSAVES.
+ */
+int copyout_from_xsaves(unsigned int pos, unsigned int count, void *kbuf,
+			void __user *ubuf, const struct xregs_state *xsave)
+{
+	unsigned int offset, size;
+	int ret, i;
+	struct xstate_header header;
+
+	/*
+	 * Currently copy_regset_to_user() starts from pos 0.
+	 */
+	if (unlikely(pos != 0))
+		return -EFAULT;
+
+	/*
+	 * The destination is a ptrace buffer; we put in only user xstates.
+	 */
+	memset(&header, 0, sizeof(header));
+	header.xfeatures = xsave->header.xfeatures;
+	header.xfeatures &= ~XFEATURE_MASK_SUPERVISOR;
+
+	/*
+	 * Copy xregs_state->header.
+	 */
+	offset = offsetof(struct xregs_state, header);
+	size = sizeof(header);
+
+	ret = xstate_copyout(offset, size, kbuf, ubuf, &header, 0, count);
+
+	if (ret)
+		return ret;
+
+	for (i = 0; i < XFEATURE_MAX; i++) {
+		/*
+		 * Copy only in-use xstates.
+		 */
+		if (((header.xfeatures >> i) & 1) && xfeature_enabled(i)) {
+			void *src = get_xsave_addr_no_check(xsave, i);
+
+			offset = xstate_offsets[i];
+			size = xstate_sizes[i];
+
+			ret = xstate_copyout(offset, size, kbuf, ubuf, src, 0,
+					     count);
+
+			if (ret)
+				return ret;
+
+			if (offset + size >= count)
+				break;
+		}
+	}
+
+	/*
+	 * Fill xsave->i387.sw_reserved value for ptrace frame.
+	 */
+	offset = offsetof(struct fxregs_state, sw_reserved);
+	size = sizeof(xstate_fx_sw_bytes);
+
+	ret = xstate_copyout(offset, size, kbuf, ubuf, xstate_fx_sw_bytes, 0,
+			     count);
+
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+/*
+ * Convert from a ptrace standard-format buffer to kernel XSAVES format
+ * and copy to the target thread. This is called from xstateregs_set() and
+ * there we check the cpu has XSAVES and a whole standard-sized buffer
+ * exists.
+ */
+int copyin_to_xsaves(const void *kbuf, const void __user *ubuf,
+		     struct xregs_state *xsave)
+{
+	unsigned int offset, size;
+	int i;
+	u64 xfeatures;
+
+	offset = offsetof(struct xregs_state, header);
+	size = sizeof(xfeatures);
+
+	if (kbuf)
+		memcpy(&xfeatures, kbuf + offset, size);
+	else if (__copy_from_user(&xfeatures, ubuf + offset, size))
+		return -EFAULT;
+
+	/*
+	 * Reject if the user tries to set any supervisor xstates.
+	 */
+	if (xfeatures & XFEATURE_MASK_SUPERVISOR)
+		return -EINVAL;
+
+	for (i = 0; i < XFEATURE_MAX; i++) {
+		u64 mask = ((u64)1 << i);
+
+		if ((xfeatures & mask) && xfeature_enabled(i)) {
+			void *dst = get_xsave_addr_no_check(xsave, i);
+
+			offset = xstate_offsets[i];
+			size = xstate_sizes[i];
+
+			if (kbuf)
+				memcpy(dst, kbuf + offset, size);
+			else if (__copy_from_user(dst, ubuf + offset, size))
+				return -EFAULT;
+		}
+	}
+
+	/*
+	 * The state that came in from userspace was user-state only.
+	 * Mask all the user states out of 'xfeatures'.
+	 */
+	xsave->header.xfeatures &= XFEATURE_MASK_SUPERVISOR;
+
+	/*
+	 * Add back in the features that came in from userspace.
+	 */
+	xsave->header.xfeatures |= xfeatures;
+
+	return 0;
+}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 08/10] x86/xsaves: Fix XSTATE component offset print out
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
                   ` (6 preceding siblings ...)
  2016-03-04 18:12 ` [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-04-29 20:26   ` Dave Hansen
  2016-03-04 18:12 ` [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components Yu-cheng Yu
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

Component offset print out was incorrect for XSAVES. Correct it and move
to a separate function.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/kernel/fpu/xstate.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index b9d4d59..2e80d6f 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -251,8 +251,6 @@ static void __init setup_xstate_features(void)
 		WARN_ONCE(last_good_offset > xstate_offsets[i],
 			"x86/fpu: misordered xstate at %d\n", last_good_offset);
 		last_good_offset = xstate_offsets[i];
-
-		printk(KERN_INFO "x86/fpu: xstate_offset[%d]: %4d, xstate_sizes[%d]: %4d\n", i, ebx, i, eax);
 	}
 }
 
@@ -355,6 +353,21 @@ static void __init setup_xstate_comp(void)
 }
 
 /*
+ * Print out xstate component offsets and sizes
+ */
+static void __init print_xstate_offset_size(void)
+{
+	int i;
+
+	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
+		if (!xfeature_enabled(i))
+			continue;
+		pr_info("x86/fpu: xstate_offset[%d]: %4d, xstate_sizes[%d]: %4d\n",
+			 i, xstate_comp_offsets[i], i, xstate_sizes[i]);
+	}
+}
+
+/*
  * setup the xstate image representing the init state
  */
 static void __init setup_init_fpu_buf(void)
@@ -687,6 +700,7 @@ void __init fpu__init_system_xstate(void)
 	fpu__init_prepare_fx_sw_frame();
 	setup_init_fpu_buf();
 	setup_xstate_comp();
+	print_xstate_offset_size();
 
 	pr_info("x86/fpu: Enabled xstate features 0x%llx, context size is %d bytes, using '%s' format.\n",
 		xfeatures_mask,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
                   ` (7 preceding siblings ...)
  2016-03-04 18:12 ` [PATCH v4 08/10] x86/xsaves: Fix XSTATE component offset print out Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-04-29 20:28   ` Dave Hansen
  2016-03-04 18:12 ` [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES Yu-cheng Yu
  2016-04-29 18:09 ` [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Dave Hansen
  10 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

The arrays xstate_offsets[] and xstate_sizes[] record XSAVE area 
offsets and sizes. Values for legacy components i387 and XMMs were
not initialized. Fix it.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/kernel/fpu/xstate.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 2e80d6f..06618ea 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -228,6 +228,15 @@ static void __init setup_xstate_features(void)
 	/* start at the beginnning of the "extended state" */
 	unsigned int last_good_offset = offsetof(struct xregs_state,
 						 extended_state_area);
+	/*
+	 * The FP xstates and SSE xstates are legacy states. They are always
+	 * in the fixed offsets in the xsave area in either compacted form
+	 * or standard form.
+	 */
+	xstate_offsets[0] = 0;
+	xstate_sizes[0] = offsetof(struct fxregs_state, xmm_space);
+	xstate_offsets[1] = xstate_sizes[0];
+	xstate_sizes[1] = FIELD_SIZEOF(struct fxregs_state, xmm_space);
 
 	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
 		if (!xfeature_enabled(i))
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
                   ` (8 preceding siblings ...)
  2016-03-04 18:12 ` [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components Yu-cheng Yu
@ 2016-03-04 18:12 ` Yu-cheng Yu
  2016-04-29 20:32   ` Dave Hansen
  2016-05-04 22:41   ` Dave Hansen
  2016-04-29 18:09 ` [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Dave Hansen
  10 siblings, 2 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-03-04 18:12 UTC (permalink / raw)
  To: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel
  Cc: Dave Hansen, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu, Yu-cheng Yu

We did not handle XSAVES* instructions correctly. There were issues in
converting between standard and compacted format when interfacing with
user-space. These issues have been corrected.

Add a WARN_ONCE() to make it clear that XSAVES supervisor states are not
yet implemented.

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/kernel/fpu/init.c   | 15 ---------------
 arch/x86/kernel/fpu/xstate.c | 10 ++++++++++
 2 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index b5952d5..6bfa059 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -226,21 +226,6 @@ static void __init fpu__init_system_xstate_size_legacy(void)
 	}
 
 	user_xstate_size = kernel_xstate_size;
-
-	/*
-	 * Quirk: we don't yet handle the XSAVES* instructions
-	 * correctly, as we don't correctly convert between
-	 * standard and compacted format when interfacing
-	 * with user-space - so disable it for now.
-	 *
-	 * The difference is small: with recent CPUs the
-	 * compacted format is only marginally smaller than
-	 * the standard FPU state format.
-	 *
-	 * ( This is easy to backport while we are fixing
-	 *   XSAVES* support. )
-	 */
-	setup_clear_cpu_cap(X86_FEATURE_XSAVES);
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 06618ea..541f56e 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -204,6 +204,16 @@ void fpu__init_cpu_xstate(void)
 	if (!cpu_has_xsave || !xfeatures_mask)
 		return;
 
+	/*
+	 * Make it clear that XSAVES supervisor states are not yet
+	 * implemented should anyone expect it to work by changing
+	 * bits in XFEATURE_MASK_* macros and XCR0.
+	 */
+	WARN_ONCE((xfeatures_mask & XFEATURE_MASK_SUPERVISOR),
+		"x86/fpu: XSAVES supervisor states are not yet implemented.\n");
+
+	xfeatures_mask &= ~XFEATURE_MASK_SUPERVISOR;
+
 	cr4_set_bits(X86_CR4_OSXSAVE);
 	xsetbv(XCR_XFEATURE_ENABLED_MASK, xfeatures_mask);
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
                   ` (9 preceding siblings ...)
  2016-03-04 18:12 ` [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES Yu-cheng Yu
@ 2016-04-29 18:09 ` Dave Hansen
  2016-04-29 19:43   ` Ingo Molnar
  2016-04-29 19:57   ` Yu-cheng Yu
  10 siblings, 2 replies; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 18:09 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

Hi Folks,

I've heard through the grapevine that there's some concern that we
should not be bothering to enable XSAVES because there's not a
sufficient use case for it.  Maybe it's meager today, but I still think
we should do it.

I'll try to lay out why.

Today, on every Skylake system, this patch saves 128 bytes in each
task_struct.  If there were an Atom system with XSAVES it would save 384
bytes since there is no AVX support on Atom.  If there were a future
processor which has an xstate _past_ AVX-512, but that does not have
AVX-512 itself, that savings goes up to 2048+384 bytes.  I believe it is
*inevitable* that the savings will become substantial.

Plus, if the processors ever start supporting a supervisor state that we
_need_ in Linux, we have to XSAVES support anyway.

It's inevitable that we _will_ need it.

Why do it today?

Now that Skylake is out, we _can_ get reasonable testing of this feature
from early adopters in the wild.  If we turn this on today, and it
breaks, we break a relatively modest number of Skylake systems (1%? 2%?
0.1%?).  Let's say we wait $X years when the benefits are greater.  We
turn it on, and something breaks.  We'll break 50% (or 40% or whatever)
of the systems in production.

Once we *HAVE* XSAVES support, it also opens up the possibilities for
doing things like dynamic XSAVE buffer allocation.  For instance, let
threads that are not _using_ AVX-512 not waste the 2k of space for it.

So why wait?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 18:09 ` [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Dave Hansen
@ 2016-04-29 19:43   ` Ingo Molnar
  2016-04-29 19:57   ` Yu-cheng Yu
  1 sibling, 0 replies; 61+ messages in thread
From: Ingo Molnar @ 2016-04-29 19:43 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu


* Dave Hansen <dave.hansen@linux.intel.com> wrote:

> Hi Folks,
> 
> I've heard through the grapevine that there's some concern that we
> should not be bothering to enable XSAVES because there's not a
> sufficient use case for it. [...]

So I have no fundamental objections against this series - I didn't apply it back 
in March because not all patches had your Reviewed-by tag. Basically after you 
sorted out all the XSAVE dynamic feature detection/sizing issues I was a happy 
camper and have no objection against XSAVES.

Could you please send a refreshed version against the latestest tip:master?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 18:09 ` [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Dave Hansen
  2016-04-29 19:43   ` Ingo Molnar
@ 2016-04-29 19:57   ` Yu-cheng Yu
  2016-04-29 20:03     ` Dave Hansen
  1 sibling, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-04-29 19:57 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 11:09:23AM -0700, Dave Hansen wrote:

> Once we *HAVE* XSAVES support, it also opens up the possibilities for
> doing things like dynamic XSAVE buffer allocation.  For instance, let
> threads that are not _using_ AVX-512 not waste the 2k of space for it.

If we can somehow modify exec* system call to scan the executable binary (in user space) and pass along a bitmask containing xfeatures used in the binary, and XSAVES is enabled in the kernel, we can easily save a lot of memory.  The kernel only needs to allocate space for tasks that actually use xstates; most of them do not. 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 19:57   ` Yu-cheng Yu
@ 2016-04-29 20:03     ` Dave Hansen
  2016-04-29 20:07       ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:03 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 04/29/2016 12:57 PM, Yu-cheng Yu wrote:
> On Fri, Apr 29, 2016 at 11:09:23AM -0700, Dave Hansen wrote:
>> Once we *HAVE* XSAVES support, it also opens up the possibilities for
>> doing things like dynamic XSAVE buffer allocation.  For instance, let
>> threads that are not _using_ AVX-512 not waste the 2k of space for
>> it.
> 
> If we can somehow modify exec* system call to scan the executable
> binary (in user space) and pass along a bitmask containing xfeatures
> used in the binary, and XSAVES is enabled in the kernel, we can
> easily save a lot of memory.  The kernel only needs to allocate space
> for tasks that actually use xstates; most of them do not.

That's not feasible.  Think of dynamic libraries or just-in-time
compilers.  What instruction set does /usr/bin/java use, for instance? :)

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 20:03     ` Dave Hansen
@ 2016-04-29 20:07       ` Yu-cheng Yu
  2016-04-29 20:25         ` Andy Lutomirski
  2016-04-29 20:37         ` Dave Hansen
  0 siblings, 2 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-04-29 20:07 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 01:03:43PM -0700, Dave Hansen wrote:
> That's not feasible.  Think of dynamic libraries or just-in-time
> compilers.  What instruction set does /usr/bin/java use, for instance? :)

The java argument is true. In that case or when the bitmask is missing, we can allocate for all supported features. 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-03-04 18:12 ` [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly Yu-cheng Yu
@ 2016-04-29 20:09   ` Dave Hansen
  2016-04-29 22:43     ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:09 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> index 0fbf60c..09945f1 100644
> --- a/arch/x86/kernel/fpu/signal.c
> +++ b/arch/x86/kernel/fpu/signal.c
> @@ -130,6 +130,45 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
>  	return err;
>  }
>  
> +static int may_copy_fpregs_to_sigframe(void)
> +{
> +	/*
> +	 * In signal handling path, the kernel already checks if
> +	 * FPU instructions have been used before it calls
> +	 * copy_fpstate_to_sigframe(). We check this here again
> +	 * to detect any potential mis-use and saving invalid
> +	 * register values directly to a signal frame.
> +	 */
> +	WARN_ONCE(!current->thread.fpu.fpstate_active,
> +		  "direct FPU save with no math use\n");

This is probably an OK check for this _particular_ context (since this
context is all ready to copy_to_user() the fpu state).  But is it good
generally?  Why couldn't you have a !fpstate_active thread that _was_
fpregs_active?

Such a thread _could_ do a direct XSAVE with no issues.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 05/10] x86/xsaves: Align xstate components according to CPUID
  2016-03-04 18:12 ` [PATCH v4 05/10] x86/xsaves: Align xstate components according to CPUID Yu-cheng Yu
@ 2016-04-29 20:11   ` Dave Hansen
  0 siblings, 0 replies; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:11 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> CPUID function 0x0d, sub function (i, i > 1) returns in ecx[1] the
> alignment requirement of component i when the compacted format is used.
> 
> If ecx[1] is 0, component i is located immediately following the preceding
> component. If ecx[1] is 1, component i is located on the next 64-byte
> boundary following the preceding component.

Reviewed-by: Dave Hansen <dave.hansen@intel.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-03-04 18:12 ` [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES Yu-cheng Yu
@ 2016-04-29 20:16   ` Dave Hansen
  2016-04-29 20:38     ` Borislav Petkov
  2016-04-29 20:25   ` Dave Hansen
  1 sibling, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:16 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> +	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
> +		ret = copyout_from_xsaves(pos, count, kbuf, ubuf, xsave);

On a higher level, we really should stop using
"boot_cpu_has(X86_FEATURE_XSAVES)" as a proxy for "the kernel XSAVE
buffer is in the XSAVES format".  I think our use of the _hardware_
CPUID bit is confusing at least the KVM folks.

We probably want a software X86_FEATURE_OS_XSAVES or something.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 20:07       ` Yu-cheng Yu
@ 2016-04-29 20:25         ` Andy Lutomirski
  2016-04-29 20:40           ` Dave Hansen
  2016-04-29 20:37         ` Dave Hansen
  1 sibling, 1 reply; 61+ messages in thread
From: Andy Lutomirski @ 2016-04-29 20:25 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: Dave Hansen, X86 ML, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, linux-kernel, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 1:07 PM, Yu-cheng Yu <yu-cheng.yu@intel.com> wrote:
> On Fri, Apr 29, 2016 at 01:03:43PM -0700, Dave Hansen wrote:
>> That's not feasible.  Think of dynamic libraries or just-in-time
>> compilers.  What instruction set does /usr/bin/java use, for instance? :)
>
> The java argument is true. In that case or when the bitmask is missing, we can allocate for all supported features.
>

I actually want to see us moving in the direction of unconditionally
allocating everything on process startup.  If we can stop using CR0.TS
entirely, I think everything will be better.

--Andy

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-03-04 18:12 ` [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES Yu-cheng Yu
  2016-04-29 20:16   ` Dave Hansen
@ 2016-04-29 20:25   ` Dave Hansen
  2016-04-29 22:30     ` Yu-cheng Yu
  1 sibling, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:25 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> +	for (i = 0; i < XFEATURE_MAX; i++) {
> +		/*
> +		 * Copy only in-use xstates.
> +		 */
> +		if (((header.xfeatures >> i) & 1) && xfeature_enabled(i)) {
> +			void *src = get_xsave_addr_no_check(xsave, i);

How could a bit in header.xfeatures get set if it is not set in
xfeature_enabled() aka xfeatures_mask aka XCR0?

...
> +int copyin_to_xsaves(const void *kbuf, const void __user *ubuf,
> +		     struct xregs_state *xsave)
> +{
> +	unsigned int offset, size;
> +	int i;
> +	u64 xfeatures;
> +
> +	offset = offsetof(struct xregs_state, header);
> +	size = sizeof(xfeatures);
> +
> +	if (kbuf)
> +		memcpy(&xfeatures, kbuf + offset, size);
> +	else if (__copy_from_user(&xfeatures, ubuf + offset, size))
> +		return -EFAULT;
> +
> +	/*
> +	 * Reject if the user tries to set any supervisor xstates.
> +	 */
> +	if (xfeatures & XFEATURE_MASK_SUPERVISOR)
> +		return -EINVAL;
> +
> +	for (i = 0; i < XFEATURE_MAX; i++) {
> +		u64 mask = ((u64)1 << i);
> +
> +		if ((xfeatures & mask) && xfeature_enabled(i)) {
> +			void *dst = get_xsave_addr_no_check(xsave, i);
> +
> +			offset = xstate_offsets[i];
> +			size = xstate_sizes[i];
> +
> +			if (kbuf)
> +				memcpy(dst, kbuf + offset, size);
> +			else if (__copy_from_user(dst, ubuf + offset, size))
> +				return -EFAULT;
> +		}
> +	}

If a caller tries to pass a non-enabled xfeature in, we appear to just
silently drop it and return success.  Is that really what we want to do
or do we want to error out?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 08/10] x86/xsaves: Fix XSTATE component offset print out
  2016-03-04 18:12 ` [PATCH v4 08/10] x86/xsaves: Fix XSTATE component offset print out Yu-cheng Yu
@ 2016-04-29 20:26   ` Dave Hansen
  0 siblings, 0 replies; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:26 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> Component offset print out was incorrect for XSAVES. Correct it and move
> to a separate function.

Reviewed-by: Dave Hansen <dave.hansen@intel.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components
  2016-03-04 18:12 ` [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components Yu-cheng Yu
@ 2016-04-29 20:28   ` Dave Hansen
  2016-04-29 22:07     ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:28 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> The arrays xstate_offsets[] and xstate_sizes[] record XSAVE area 
> offsets and sizes. Values for legacy components i387 and XMMs were
> not initialized. Fix it.

Is this just a completeness thing or does it actually break something?

In any case:

Reviewed-by: Dave Hansen <dave.hansen@intel.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-03-04 18:12 ` [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES Yu-cheng Yu
@ 2016-04-29 20:32   ` Dave Hansen
  2016-04-29 23:12     ` Yu-cheng Yu
  2016-05-04 22:41   ` Dave Hansen
  1 sibling, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:32 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> We did not handle XSAVES* instructions correctly. There were issues in
> converting between standard and compacted format when interfacing with
> user-space. These issues have been corrected.
> 
> Add a WARN_ONCE() to make it clear that XSAVES supervisor states are not
> yet implemented.

The reason I haven't acked this patch is that I want to be _sure_ that
we've audited all of the call paths that access the XSAVE buffer to
ensure that they can all either handle the XSAVES format *or* don't care
for whatever reason.

Could you share the steps that you've taken to assure yourself that all
of the call paths are handled and we don't have more bugs?

FWIW, this was the single biggest lesson I learned from the failure the
last time this got turned on: we simply didn't go look for all the
places that the new format had to be handled.  Let's be sure we don't
repeat that.

If we get this *wrong* in another user/kernel interface like we did for
ptrace and the signal save/restore and we ever enable a supervisor state
we've got an almost certain immediate root hole of some kind.  I think
we need to exercise some serious caution here.  Thank $DEITY we don't
have any supported supervisor states at the moment.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 20:07       ` Yu-cheng Yu
  2016-04-29 20:25         ` Andy Lutomirski
@ 2016-04-29 20:37         ` Dave Hansen
  1 sibling, 0 replies; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:37 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 04/29/2016 01:07 PM, Yu-cheng Yu wrote:
> On Fri, Apr 29, 2016 at 01:03:43PM -0700, Dave Hansen wrote:
>>> That's not feasible.  Think of dynamic libraries or just-in-time 
>>> compilers.  What instruction set does /usr/bin/java use, for
>>> instance? :)
> The java argument is true. In that case or when the bitmask is
> missing, we can allocate for all supported features.

Remember, execve() doesn't replace the task_struct.  How do we resize
the task_struct at execve() time?  If /bin/bash doesn't use AVX, then
fork()s and execve()s an AVX-using program, what do we do?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-04-29 20:16   ` Dave Hansen
@ 2016-04-29 20:38     ` Borislav Petkov
  2016-04-29 20:40       ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Borislav Petkov @ 2016-04-29 20:38 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel, Andy Lutomirski, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 01:16:59PM -0700, Dave Hansen wrote:
> We probably want a software X86_FEATURE_OS_XSAVES or something.

... or a simple variable.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 20:25         ` Andy Lutomirski
@ 2016-04-29 20:40           ` Dave Hansen
  2016-04-29 20:49             ` Andy Lutomirski
  2016-04-30  7:53             ` Ingo Molnar
  0 siblings, 2 replies; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:40 UTC (permalink / raw)
  To: Andy Lutomirski, Yu-cheng Yu
  Cc: X86 ML, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu

On 04/29/2016 01:25 PM, Andy Lutomirski wrote:
> On Fri, Apr 29, 2016 at 1:07 PM, Yu-cheng Yu <yu-cheng.yu@intel.com> wrote:
>> On Fri, Apr 29, 2016 at 01:03:43PM -0700, Dave Hansen wrote:
>>> That's not feasible.  Think of dynamic libraries or just-in-time
>>> compilers.  What instruction set does /usr/bin/java use, for instance? :)
>>
>> The java argument is true. In that case or when the bitmask is
>> missing, we can allocate for all supported features.
> 
> I actually want to see us moving in the direction of unconditionally
> allocating everything on process startup.  If we can stop using CR0.TS
> entirely, I think everything will be better.

We can absolutely allocate the worst-case XSAVE buffer at task startup
for folks that never want to see a latency spike in the life of the app
no matter what.

But I also think it would be pretty nice if 'ls' didn't pay the 2k cost
to have AVX-512 state if it's not using AVX-512.  We also don't have to
do this with CR0.TS.  We'd actually use a combination of out-of-line
(not appended to task_struct) XSAVE buffers and XGETBV1 to check the
size of our XSAVE buffer before we call XSAVE* and resize it when needed.

Maybe nobody will ever care enough about 2kbytes/thread, though.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-04-29 20:38     ` Borislav Petkov
@ 2016-04-29 20:40       ` Dave Hansen
  2016-04-29 20:46         ` Borislav Petkov
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 20:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel, Andy Lutomirski, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 04/29/2016 01:38 PM, Borislav Petkov wrote:
> On Fri, Apr 29, 2016 at 01:16:59PM -0700, Dave Hansen wrote:
>> > We probably want a software X86_FEATURE_OS_XSAVES or something.
> ... or a simple variable.

I think we do some instruction patching based on it, so I was just
suggesting the software X86_FEATURE because it would plug in to the
existing scheme easier.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-04-29 20:40       ` Dave Hansen
@ 2016-04-29 20:46         ` Borislav Petkov
  0 siblings, 0 replies; 61+ messages in thread
From: Borislav Petkov @ 2016-04-29 20:46 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel, Andy Lutomirski, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 01:40:51PM -0700, Dave Hansen wrote:
> I think we do some instruction patching based on it, so I was just
> suggesting the software X86_FEATURE because it would plug in to the
> existing scheme easier.

Ah ok, that's fine then.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 20:40           ` Dave Hansen
@ 2016-04-29 20:49             ` Andy Lutomirski
  2016-04-29 22:42               ` Dave Hansen
  2016-04-30  7:53             ` Ingo Molnar
  1 sibling, 1 reply; 61+ messages in thread
From: Andy Lutomirski @ 2016-04-29 20:49 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Yu-cheng Yu, X86 ML, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, linux-kernel, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 1:40 PM, Dave Hansen
<dave.hansen@linux.intel.com> wrote:
> On 04/29/2016 01:25 PM, Andy Lutomirski wrote:
>> On Fri, Apr 29, 2016 at 1:07 PM, Yu-cheng Yu <yu-cheng.yu@intel.com> wrote:
>>> On Fri, Apr 29, 2016 at 01:03:43PM -0700, Dave Hansen wrote:
>>>> That's not feasible.  Think of dynamic libraries or just-in-time
>>>> compilers.  What instruction set does /usr/bin/java use, for instance? :)
>>>
>>> The java argument is true. In that case or when the bitmask is
>>> missing, we can allocate for all supported features.
>>
>> I actually want to see us moving in the direction of unconditionally
>> allocating everything on process startup.  If we can stop using CR0.TS
>> entirely, I think everything will be better.
>
> We can absolutely allocate the worst-case XSAVE buffer at task startup
> for folks that never want to see a latency spike in the life of the app
> no matter what.
>
> But I also think it would be pretty nice if 'ls' didn't pay the 2k cost
> to have AVX-512 state if it's not using AVX-512.  We also don't have to
> do this with CR0.TS.  We'd actually use a combination of out-of-line
> (not appended to task_struct) XSAVE buffers and XGETBV1 to check the
> size of our XSAVE buffer before we call XSAVE* and resize it when needed.
>
> Maybe nobody will ever care enough about 2kbytes/thread, though.

I suspect we're so far about 2k/thread that no one cares.

That being said, when I wrote this email, I wasn't thinking about
compacted form at all.  I think we should allocate a viable xstate
area of some sort on startup and use saves/xrstors/xsaveopt/whatever
without fiddling with TS and eagerly save and restore even if no
extended state whatsoever has been used.  I'm certainly okay in
principle with reallocating.

However, what do we do if we run out when memory when trying to reallocate?

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components
  2016-04-29 20:28   ` Dave Hansen
@ 2016-04-29 22:07     ` Yu-cheng Yu
  2016-04-29 22:13       ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-04-29 22:07 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 01:28:31PM -0700, Dave Hansen wrote:
> On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> > The arrays xstate_offsets[] and xstate_sizes[] record XSAVE area 
> > offsets and sizes. Values for legacy components i387 and XMMs were
> > not initialized. Fix it.
> 
> Is this just a completeness thing or does it actually break something?

PTRACE format conversion needs them.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components
  2016-04-29 22:07     ` Yu-cheng Yu
@ 2016-04-29 22:13       ` Dave Hansen
  2016-04-29 22:15         ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 22:13 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 04/29/2016 03:07 PM, Yu-cheng Yu wrote:
> On Fri, Apr 29, 2016 at 01:28:31PM -0700, Dave Hansen wrote:
>> On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
>>> The arrays xstate_offsets[] and xstate_sizes[] record XSAVE area 
>>> offsets and sizes. Values for legacy components i387 and XMMs were
>>> not initialized. Fix it.
>>
>> Is this just a completeness thing or does it actually break something?
> 
> PTRACE format conversion needs them.

If you respin these, can you please note that in the changelog and/or
comments?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components
  2016-04-29 22:13       ` Dave Hansen
@ 2016-04-29 22:15         ` Yu-cheng Yu
  0 siblings, 0 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-04-29 22:15 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 03:13:33PM -0700, Dave Hansen wrote:
> On 04/29/2016 03:07 PM, Yu-cheng Yu wrote:
> > On Fri, Apr 29, 2016 at 01:28:31PM -0700, Dave Hansen wrote:
> >> On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> >>> The arrays xstate_offsets[] and xstate_sizes[] record XSAVE area 
> >>> offsets and sizes. Values for legacy components i387 and XMMs were
> >>> not initialized. Fix it.
> >>
> >> Is this just a completeness thing or does it actually break something?
> > 
> > PTRACE format conversion needs them.
> 
> If you respin these, can you please note that in the changelog and/or
> comments?

Do you mean when it is rebased?  I will do that.
  

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-04-29 20:25   ` Dave Hansen
@ 2016-04-29 22:30     ` Yu-cheng Yu
  2016-04-29 22:36       ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-04-29 22:30 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 01:25:34PM -0700, Dave Hansen wrote:
> On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> > +	for (i = 0; i < XFEATURE_MAX; i++) {
> > +		/*
> > +		 * Copy only in-use xstates.
> > +		 */
> > +		if (((header.xfeatures >> i) & 1) && xfeature_enabled(i)) {
> > +			void *src = get_xsave_addr_no_check(xsave, i);
> 
> How could a bit in header.xfeatures get set if it is not set in
> xfeature_enabled() aka xfeatures_mask aka XCR0?

Do you mean, we should test xfeature_enabled(i) first, like,

	if (xfeature_enabled(i) && ((header.xfeatures >> i) & 1)) ?

The result will be the same, like you said, if XCR0[i] is not set,
hader.xfeatures[i] cannot be set.  But if XCR0[i] is set,
header.xfeatures[i] can be cleared. 

> 
> If a caller tries to pass a non-enabled xfeature in, we appear to just
> silently drop it and return success.  Is that really what we want to do
> or do we want to error out?

Let it fail.  I will chage it.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-04-29 22:30     ` Yu-cheng Yu
@ 2016-04-29 22:36       ` Dave Hansen
  2016-04-29 22:38         ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 22:36 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 04/29/2016 03:30 PM, Yu-cheng Yu wrote:
> On Fri, Apr 29, 2016 at 01:25:34PM -0700, Dave Hansen wrote:
>> On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
>>> +	for (i = 0; i < XFEATURE_MAX; i++) {
>>> +		/*
>>> +		 * Copy only in-use xstates.
>>> +		 */
>>> +		if (((header.xfeatures >> i) & 1) && xfeature_enabled(i)) {
>>> +			void *src = get_xsave_addr_no_check(xsave, i);
>>
>> How could a bit in header.xfeatures get set if it is not set in
>> xfeature_enabled() aka xfeatures_mask aka XCR0?
> 
> Do you mean, we should test xfeature_enabled(i) first, like,
> 
> 	if (xfeature_enabled(i) && ((header.xfeatures >> i) & 1)) ?
> 
> The result will be the same, like you said, if XCR0[i] is not set,
> hader.xfeatures[i] cannot be set.  But if XCR0[i] is set,
> header.xfeatures[i] can be cleared. 

I think the xfeature_enabled(i) is probably redundant.  Does it serve
any actual purpose?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES
  2016-04-29 22:36       ` Dave Hansen
@ 2016-04-29 22:38         ` Yu-cheng Yu
  0 siblings, 0 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-04-29 22:38 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 03:36:12PM -0700, Dave Hansen wrote:
> On 04/29/2016 03:30 PM, Yu-cheng Yu wrote:
> > On Fri, Apr 29, 2016 at 01:25:34PM -0700, Dave Hansen wrote:
> >> On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> >>> +	for (i = 0; i < XFEATURE_MAX; i++) {
> >>> +		/*
> >>> +		 * Copy only in-use xstates.
> >>> +		 */
> >>> +		if (((header.xfeatures >> i) & 1) && xfeature_enabled(i)) {
> >>> +			void *src = get_xsave_addr_no_check(xsave, i);
> >>
> >> How could a bit in header.xfeatures get set if it is not set in
> >> xfeature_enabled() aka xfeatures_mask aka XCR0?
> > 
> > Do you mean, we should test xfeature_enabled(i) first, like,
> > 
> > 	if (xfeature_enabled(i) && ((header.xfeatures >> i) & 1)) ?
> > 
> > The result will be the same, like you said, if XCR0[i] is not set,
> > hader.xfeatures[i] cannot be set.  But if XCR0[i] is set,
> > header.xfeatures[i] can be cleared. 
> 
> I think the xfeature_enabled(i) is probably redundant.  Does it serve
> any actual purpose?

Got it.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 20:49             ` Andy Lutomirski
@ 2016-04-29 22:42               ` Dave Hansen
  0 siblings, 0 replies; 61+ messages in thread
From: Dave Hansen @ 2016-04-29 22:42 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Yu-cheng Yu, X86 ML, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, linux-kernel, Andy Lutomirski, Borislav Petkov,
	Sai Praneeth Prakhya, Ravi V. Shankar, Fenghua Yu

On 04/29/2016 01:49 PM, Andy Lutomirski wrote:
>> >
>> > But I also think it would be pretty nice if 'ls' didn't pay the 2k cost
>> > to have AVX-512 state if it's not using AVX-512.  We also don't have to
>> > do this with CR0.TS.  We'd actually use a combination of out-of-line
>> > (not appended to task_struct) XSAVE buffers and XGETBV1 to check the
>> > size of our XSAVE buffer before we call XSAVE* and resize it when needed.
>> >
>> > Maybe nobody will ever care enough about 2kbytes/thread, though.
> I suspect we're so far about 2k/thread that no one cares.
> 
...
> However, what do we do if we run out when memory when trying to reallocate?

The thread has to die a horrible death.  We can switch away from it, but
can never switch back to it.  Well we can switch to it, we just can't
return to userspace.

Actually, though...  On my laptop 1/3 of the task_struct is XSAVE state
(it's ~3k).  With AVX-512, ~3/5 of it will be XSAVE, and it will be ~5k
(>1 page).  Breaking it in to two pieces makes it overall less likely
that we'd have to fail an allocation.

We'd be in a situation where we probably can't fork *anyway* if that
happened.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-04-29 20:09   ` Dave Hansen
@ 2016-04-29 22:43     ` Yu-cheng Yu
  2016-04-30  0:36       ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-04-29 22:43 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 01:09:07PM -0700, Dave Hansen wrote:
> On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
> > diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> > index 0fbf60c..09945f1 100644
> > --- a/arch/x86/kernel/fpu/signal.c
> > +++ b/arch/x86/kernel/fpu/signal.c
> > @@ -130,6 +130,45 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
> >  	return err;
> >  }
> >  
> > +static int may_copy_fpregs_to_sigframe(void)
> > +{
> > +	/*
> > +	 * In signal handling path, the kernel already checks if
> > +	 * FPU instructions have been used before it calls
> > +	 * copy_fpstate_to_sigframe(). We check this here again
> > +	 * to detect any potential mis-use and saving invalid
> > +	 * register values directly to a signal frame.
> > +	 */
> > +	WARN_ONCE(!current->thread.fpu.fpstate_active,
> > +		  "direct FPU save with no math use\n");
> 
> This is probably an OK check for this _particular_ context (since this
> context is all ready to copy_to_user() the fpu state).  But is it good
> generally?  Why couldn't you have a !fpstate_active thread that _was_
> fpregs_active?
> 
> Such a thread _could_ do a direct XSAVE with no issues.

But it won't come to this function unless fpstate_active is ture?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-04-29 20:32   ` Dave Hansen
@ 2016-04-29 23:12     ` Yu-cheng Yu
  2016-04-30  0:40       ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-04-29 23:12 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 01:32:15PM -0700, Dave Hansen wrote:
> The reason I haven't acked this patch is that I want to be _sure_ that
> we've audited all of the call paths that access the XSAVE buffer to
> ensure that they can all either handle the XSAVES format *or* don't care
> for whatever reason.
> 
> Could you share the steps that you've taken to assure yourself that all
> of the call paths are handled and we don't have more bugs?
> 

We tested for signal, ptrace, context switch, avx, and mpx.  We also run
these tests with your audit patch to detect any format mis-match.
That said, I cannot be sure there are no more bugs.  As you said, we want
to get this feature tested in the field and find potential issues early. 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-04-29 22:43     ` Yu-cheng Yu
@ 2016-04-30  0:36       ` Dave Hansen
  2016-05-02 15:57         ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-30  0:36 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 04/29/2016 03:43 PM, Yu-cheng Yu wrote:
> On Fri, Apr 29, 2016 at 01:09:07PM -0700, Dave Hansen wrote:
>> On 03/04/2016 10:12 AM, Yu-cheng Yu wrote:
>>> +static int may_copy_fpregs_to_sigframe(void)
>>> +{
>>> +	/*
>>> +	 * In signal handling path, the kernel already checks if
>>> +	 * FPU instructions have been used before it calls
>>> +	 * copy_fpstate_to_sigframe(). We check this here again
>>> +	 * to detect any potential mis-use and saving invalid
>>> +	 * register values directly to a signal frame.
>>> +	 */
>>> +	WARN_ONCE(!current->thread.fpu.fpstate_active,
>>> +		  "direct FPU save with no math use\n");
>>
>> This is probably an OK check for this _particular_ context (since this
>> context is all ready to copy_to_user() the fpu state).  But is it good
>> generally?  Why couldn't you have a !fpstate_active thread that _was_
>> fpregs_active?
>>
>> Such a thread _could_ do a direct XSAVE with no issues.
> 
> But it won't come to this function unless fpstate_active is ture?

If may_copy_fpregs_to_sigframe() were called from a slightly different
context, or if we change the call-site, what breaks?

In other words. if we can still "may_copy_fpregs_to_sigframe()" no
matter the state of fpu.fpstate_active, then I don't think we should be
checking it in may_copy_fpregs_to_sigframe().

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-04-29 23:12     ` Yu-cheng Yu
@ 2016-04-30  0:40       ` Dave Hansen
  2016-05-02 16:11         ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-04-30  0:40 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 04/29/2016 04:12 PM, Yu-cheng Yu wrote:
> On Fri, Apr 29, 2016 at 01:32:15PM -0700, Dave Hansen wrote:
>> The reason I haven't acked this patch is that I want to be _sure_ that
>> we've audited all of the call paths that access the XSAVE buffer to
>> ensure that they can all either handle the XSAVES format *or* don't care
>> for whatever reason.
>>
>> Could you share the steps that you've taken to assure yourself that all
>> of the call paths are handled and we don't have more bugs?
> 
> We tested for signal, ptrace, context switch, avx, and mpx.  We also run
> these tests with your audit patch to detect any format mis-match.
> That said, I cannot be sure there are no more bugs.  As you said, we want
> to get this feature tested in the field and find potential issues early. 

That's better than what we had before, but it relies entirely on testing
coverage and runtime checks.

Is it too much to ask that you also take a look and audit all the places
the XSAVE buffer is accessed in the kernel and ensure that they either
have code to handle standard vs. compacted/supervisor or don't care for
some reason?

I did such an audit once upon a time, but I think it would be a good
exercise to repeat both by a second set of eyes and because some time
has passed.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-29 20:40           ` Dave Hansen
  2016-04-29 20:49             ` Andy Lutomirski
@ 2016-04-30  7:53             ` Ingo Molnar
  2016-05-02 16:28               ` Dave Hansen
  1 sibling, 1 reply; 61+ messages in thread
From: Ingo Molnar @ 2016-04-30  7:53 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andy Lutomirski, Yu-cheng Yu, X86 ML, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, linux-kernel, Andy Lutomirski,
	Borislav Petkov, Sai Praneeth Prakhya, Ravi V. Shankar,
	Fenghua Yu


* Dave Hansen <dave.hansen@linux.intel.com> wrote:

> But I also think it would be pretty nice if 'ls' didn't pay the 2k cost to have 
> AVX-512 state if it's not using AVX-512. [...]

A C library might decide to use AVX-512 memset(). RAM is cheap, while allocation 
complexity, especially in the kernel, has various other costs.

I mean, we should not worry about per thread allocation sizes that can be compared 
to the kernel stack size.

We can still use the compacted area handling instructions, because presumably 
those are the fastest and are also the most optimized ones? But I wouldn't use 
them to do dynamic allocation: just allocate the maximum possible FPU save area at 
task creation time and never again worry about that detail.

Ok?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-04-30  0:36       ` Dave Hansen
@ 2016-05-02 15:57         ` Yu-cheng Yu
  2016-05-02 16:06           ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-02 15:57 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 05:36:48PM -0700, Dave Hansen wrote:
> If may_copy_fpregs_to_sigframe() were called from a slightly different
> context, or if we change the call-site, what breaks?
> 
> In other words. if we can still "may_copy_fpregs_to_sigframe()" no
> matter the state of fpu.fpstate_active, then I don't think we should be
> checking it in may_copy_fpregs_to_sigframe().

Do you mean, don't check fpu.fpstate_active here?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 15:57         ` Yu-cheng Yu
@ 2016-05-02 16:06           ` Dave Hansen
  2016-05-02 16:34             ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-05-02 16:06 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 05/02/2016 08:57 AM, Yu-cheng Yu wrote:
> On Fri, Apr 29, 2016 at 05:36:48PM -0700, Dave Hansen wrote:
>> If may_copy_fpregs_to_sigframe() were called from a slightly different
>> context, or if we change the call-site, what breaks?
>>
>> In other words. if we can still "may_copy_fpregs_to_sigframe()" no
>> matter the state of fpu.fpstate_active, then I don't think we should be
>> checking it in may_copy_fpregs_to_sigframe().
> 
> Do you mean, don't check fpu.fpstate_active here?

Not really.  I'm asking *why* the check is there.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-04-30  0:40       ` Dave Hansen
@ 2016-05-02 16:11         ` Yu-cheng Yu
  2016-05-04 22:15           ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-02 16:11 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Fri, Apr 29, 2016 at 05:40:44PM -0700, Dave Hansen wrote:
> That's better than what we had before, but it relies entirely on testing
> coverage and runtime checks.
> 
> Is it too much to ask that you also take a look and audit all the places
> the XSAVE buffer is accessed in the kernel and ensure that they either
> have code to handle standard vs. compacted/supervisor or don't care for
> some reason?
> 
> I did such an audit once upon a time, but I think it would be a good
> exercise to repeat both by a second set of eyes and because some time
> has passed.

I think there are 12 files that can be directly impacted by XSAVES. 

	arch/x86/include/asm/fpu/internal.h
	arch/x86/include/asm/fpu/types.h
	arch/x86/include/asm/fpu/xstate.h
	arch/x86/include/uapi/asm/sigcontext.h
	arch/x86/kernel/traps.c
	arch/x86/kernel/fpu/core.c
	arch/x86/kernel/fpu/init.c
	arch/x86/kernel/fpu/regset.c
	arch/x86/kernel/fpu/signal.c
	arch/x86/kernel/fpu/xstate.c
	arch/x86/kvm/x86.c
	arch/x86/mm/mpx.c

They have been reviewed from the perspective of the compacted format.
Please let me know anything else.
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-04-30  7:53             ` Ingo Molnar
@ 2016-05-02 16:28               ` Dave Hansen
  2016-05-02 18:32                 ` Ingo Molnar
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-05-02 16:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andy Lutomirski, Yu-cheng Yu, X86 ML, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, linux-kernel, Andy Lutomirski,
	Borislav Petkov, Sai Praneeth Prakhya, Ravi V. Shankar,
	Fenghua Yu

On 04/30/2016 12:53 AM, Ingo Molnar wrote:
> We can still use the compacted area handling instructions, because presumably 
> those are the fastest and are also the most optimized ones? But I wouldn't use 
> them to do dynamic allocation: just allocate the maximum possible FPU save area at 
> task creation time and never again worry about that detail.
> 
> Ok?

Sounds sane to me.

BTW, I hacked up your "fpu performance" to compare XSAVE vs. XSAVES:

> [    0.048347] x86/fpu: Cost of: XSAVE                       insn          :   127 cycles
> [    0.049134] x86/fpu: Cost of: XSAVES                      insn          :   113 cycles
> [    0.048492] x86/fpu: Cost of: XRSTOR                      insn          :   120 cycles
> [    0.049267] x86/fpu: Cost of: XRSTORS                     insn          :   102 cycles

So I guess we can add that to the list of things that XSAVES is good
for.  Granted, the real-world benefit is probably hard to measure
because the cache residency of the XSAVE buffer isn't as good when
_actually_ context switching, but this at least shows a small
theoretical advantage for XSAVES.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 16:06           ` Dave Hansen
@ 2016-05-02 16:34             ` Yu-cheng Yu
  2016-05-02 16:43               ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-02 16:34 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Mon, May 02, 2016 at 09:06:41AM -0700, Dave Hansen wrote:
> On 05/02/2016 08:57 AM, Yu-cheng Yu wrote:
> > On Fri, Apr 29, 2016 at 05:36:48PM -0700, Dave Hansen wrote:
> >> If may_copy_fpregs_to_sigframe() were called from a slightly different
> >> context, or if we change the call-site, what breaks?
> >>
> >> In other words. if we can still "may_copy_fpregs_to_sigframe()" no
> >> matter the state of fpu.fpstate_active, then I don't think we should be
> >> checking it in may_copy_fpregs_to_sigframe().
> > 
> > Do you mean, don't check fpu.fpstate_active here?
> 
> Not really.  I'm asking *why* the check is there.

If (fpu.fpstate_active == 0), then the task does not use FPU; we don't
want to save these registers, right?  

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 16:34             ` Yu-cheng Yu
@ 2016-05-02 16:43               ` Dave Hansen
  2016-05-02 17:19                 ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-05-02 16:43 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 05/02/2016 09:34 AM, Yu-cheng Yu wrote:
> On Mon, May 02, 2016 at 09:06:41AM -0700, Dave Hansen wrote:
>> On 05/02/2016 08:57 AM, Yu-cheng Yu wrote:
>>> On Fri, Apr 29, 2016 at 05:36:48PM -0700, Dave Hansen wrote:
>>>> If may_copy_fpregs_to_sigframe() were called from a slightly different
>>>> context, or if we change the call-site, what breaks?
>>>>
>>>> In other words. if we can still "may_copy_fpregs_to_sigframe()" no
>>>> matter the state of fpu.fpstate_active, then I don't think we should be
>>>> checking it in may_copy_fpregs_to_sigframe().
>>>
>>> Do you mean, don't check fpu.fpstate_active here?
>>
>> Not really.  I'm asking *why* the check is there.
> 
> If (fpu.fpstate_active == 0), then the task does not use FPU; we don't
> want to save these registers, right?  

No.  It's possible to have fpstate_active=0 while fpregs_active=1.  Such
a task uses the FPU, but just hasn't done an XSAVE* to save the register
content to the fpstate buffer.

Note, this is just theoretical, and does not happen in this particular
call path today.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 16:43               ` Dave Hansen
@ 2016-05-02 17:19                 ` Yu-cheng Yu
  2016-05-02 17:33                   ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-02 17:19 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Mon, May 02, 2016 at 09:43:47AM -0700, Dave Hansen wrote:
> > If (fpu.fpstate_active == 0), then the task does not use FPU; we don't
> > want to save these registers, right?  
> 
> No.  It's possible to have fpstate_active=0 while fpregs_active=1.  Such
> a task uses the FPU, but just hasn't done an XSAVE* to save the register
> content to the fpstate buffer.
> 
> Note, this is just theoretical, and does not happen in this particular
> call path today.

What about...

static int may_copy_fpregs_to_sigframe(void)
{
	if (fpregs_active())
		return 1;


	WARN_ONCE(!current->thread.fpu.fpstate_active,
		  "direct FPU save with no math use\n");

	if (boot_cpu_has(X86_FEATURE_XSAVES))
		return 1;

	return 0;
} 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 17:19                 ` Yu-cheng Yu
@ 2016-05-02 17:33                   ` Dave Hansen
  2016-05-02 21:18                     ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-05-02 17:33 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 05/02/2016 10:19 AM, Yu-cheng Yu wrote:
> On Mon, May 02, 2016 at 09:43:47AM -0700, Dave Hansen wrote:
>>> If (fpu.fpstate_active == 0), then the task does not use FPU; we don't
>>> want to save these registers, right?  
>>
>> No.  It's possible to have fpstate_active=0 while fpregs_active=1.  Such
>> a task uses the FPU, but just hasn't done an XSAVE* to save the register
>> content to the fpstate buffer.
>>
>> Note, this is just theoretical, and does not happen in this particular
>> call path today.
> 
> What about...
> 
> static int may_copy_fpregs_to_sigframe(void)
> {
> 	if (fpregs_active())
> 		return 1;
> 
> 	WARN_ONCE(!current->thread.fpu.fpstate_active,
> 		  "direct FPU save with no math use\n");
> 
> 	if (boot_cpu_has(X86_FEATURE_XSAVES))
> 		return 1;
> 
> 	return 0;
> } 

I don't think that changes anything.  We still have a check in there
that has no purpose.  You've changed the ordering so that the specific
example that I pointed out no longer triggers it.  But, the underlying
issue remains.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues
  2016-05-02 16:28               ` Dave Hansen
@ 2016-05-02 18:32                 ` Ingo Molnar
  0 siblings, 0 replies; 61+ messages in thread
From: Ingo Molnar @ 2016-05-02 18:32 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Andy Lutomirski, Yu-cheng Yu, X86 ML, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, linux-kernel, Andy Lutomirski,
	Borislav Petkov, Sai Praneeth Prakhya, Ravi V. Shankar,
	Fenghua Yu


* Dave Hansen <dave.hansen@linux.intel.com> wrote:

> On 04/30/2016 12:53 AM, Ingo Molnar wrote:
> > We can still use the compacted area handling instructions, because presumably 
> > those are the fastest and are also the most optimized ones? But I wouldn't use 
> > them to do dynamic allocation: just allocate the maximum possible FPU save area at 
> > task creation time and never again worry about that detail.
> > 
> > Ok?
> 
> Sounds sane to me.
> 
> BTW, I hacked up your "fpu performance" to compare XSAVE vs. XSAVES:
> 
> > [    0.048347] x86/fpu: Cost of: XSAVE                       insn          :   127 cycles
> > [    0.049134] x86/fpu: Cost of: XSAVES                      insn          :   113 cycles
> > [    0.048492] x86/fpu: Cost of: XRSTOR                      insn          :   120 cycles
> > [    0.049267] x86/fpu: Cost of: XRSTORS                     insn          :   102 cycles
> 
> So I guess we can add that to the list of things that XSAVES is good for.

Absolutely!

> [...]  Granted, the real-world benefit is probably hard to measure because the 
> cache residency of the XSAVE buffer isn't as good when _actually_ context 
> switching, but this at least shows a small theoretical advantage for XSAVES.

Yeah, and anything that was measured for real is far from being theoretical. It's 
simply a best-case microbenchmark figure, but it's still a nice 10+ cycles 
improvement overall - which might become bigger in future CPU generations.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 17:33                   ` Dave Hansen
@ 2016-05-02 21:18                     ` Yu-cheng Yu
  2016-05-02 21:24                       ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-02 21:18 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Mon, May 02, 2016 at 10:33:10AM -0700, Dave Hansen wrote:
> On 05/02/2016 10:19 AM, Yu-cheng Yu wrote:
> > On Mon, May 02, 2016 at 09:43:47AM -0700, Dave Hansen wrote:
> >>> If (fpu.fpstate_active == 0), then the task does not use FPU; we don't
> >>> want to save these registers, right?  
> >>
> >> No.  It's possible to have fpstate_active=0 while fpregs_active=1.  Such
> >> a task uses the FPU, but just hasn't done an XSAVE* to save the register
> >> content to the fpstate buffer.
> >>
> >> Note, this is just theoretical, and does not happen in this particular
> >> call path today.
> > 
> > What about...
> > 
> > static int may_copy_fpregs_to_sigframe(void)
> > {
> > 	if (fpregs_active())
> > 		return 1;
> > 
> > 	WARN_ONCE(!current->thread.fpu.fpstate_active,
> > 		  "direct FPU save with no math use\n");
> > 
> > 	if (boot_cpu_has(X86_FEATURE_XSAVES))
> > 		return 1;
> > 
> > 	return 0;
> > } 
> 
> I don't think that changes anything.  We still have a check in there
> that has no purpose.  You've changed the ordering so that the specific
> example that I pointed out no longer triggers it.  But, the underlying
> issue remains.

Before Linux gets into copy_fpstate_to_sigframe(),
current->thread.fpu.fpstate_active must be true.
For eagerfpu, fpregs_active() must also be true.
For lazyfpu, once we try to do FSAVE/FXSAVE/XSAVE,
fpregs_active() will become true as well.

We should have not based on boot_cpu_has(X86_FEATURE_XSAVES)
at all. 

Why don't we make it simple and always copy_fpregs_to_signal_frame()?
Or, only for the lazy case, i.e. !fpregs_active(), we do __copy_to_user().

Anyway, I think we can just replace may_copy_fpregs_to_sigframe() with
!fpregs_active().

Comments? 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 21:18                     ` Yu-cheng Yu
@ 2016-05-02 21:24                       ` Yu-cheng Yu
  2016-05-02 21:32                         ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-02 21:24 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Mon, May 02, 2016 at 02:18:17PM -0700, Yu-cheng Yu wrote:
> Before Linux gets into copy_fpstate_to_sigframe(),
> current->thread.fpu.fpstate_active must be true.
> For eagerfpu, fpregs_active() must also be true.
> For lazyfpu, once we try to do FSAVE/FXSAVE/XSAVE,
> fpregs_active() will become true as well.
> 
> We should have not based on boot_cpu_has(X86_FEATURE_XSAVES)
> at all. 
> 
> Why don't we make it simple and always copy_fpregs_to_signal_frame()?
> Or, only for the lazy case, i.e. !fpregs_active(), we do __copy_to_user().

For (lazy && not XSAVES) actually!

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 21:24                       ` Yu-cheng Yu
@ 2016-05-02 21:32                         ` Dave Hansen
  2016-05-02 22:17                           ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-05-02 21:32 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 05/02/2016 02:24 PM, Yu-cheng Yu wrote:
> On Mon, May 02, 2016 at 02:18:17PM -0700, Yu-cheng Yu wrote:
>> > Before Linux gets into copy_fpstate_to_sigframe(),
>> > current->thread.fpu.fpstate_active must be true.
>> > For eagerfpu, fpregs_active() must also be true.
>> > For lazyfpu, once we try to do FSAVE/FXSAVE/XSAVE,
>> > fpregs_active() will become true as well.
>> > 
>> > We should have not based on boot_cpu_has(X86_FEATURE_XSAVES)
>> > at all. 
>> > 
>> > Why don't we make it simple and always copy_fpregs_to_signal_frame()?
>> > Or, only for the lazy case, i.e. !fpregs_active(), we do __copy_to_user().
> For (lazy && not XSAVES) actually!

I think we're off in the weeds here.

Please just send an updated patch for what you want to do here.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 21:32                         ` Dave Hansen
@ 2016-05-02 22:17                           ` Yu-cheng Yu
  2016-05-02 22:37                             ` Dave Hansen
  0 siblings, 1 reply; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-02 22:17 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Mon, May 02, 2016 at 02:32:14PM -0700, Dave Hansen wrote:
> 
> I think we're off in the weeds here.
> 
> Please just send an updated patch for what you want to do here.

>From 43134a773d23ae8bab9f158d143c5cfb76bc0e9c Mon Sep 17 00:00:00 2001
From: Yu-cheng Yu <yu-cheng.yu@intel.com>
Date: Sat, 14 Nov 2015 16:59:45 -0800
Subject: [PATCH] x86/xsaves: Introduce a new check that allows correct xstates
 copy from kernel to user directly

XSAVES is a kernel instruction and uses a compacted format. When working with user space, the kernel should provide
standard-format, non-supervisor state data. We cannot do __copy_to_user() from a compacted- format kernel xstate area to a
signal frame.

Dave Hansen proposes this method to simplify copy xstate directly to user.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off by: Yu-cheng Yu <yu-cheng.yu@intel.com>
---
 arch/x86/include/asm/fpu/xstate.h | 1 +
 arch/x86/kernel/fpu/signal.c      | 3 ++-
 arch/x86/kernel/fpu/xstate.c      | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 16df2c4..d812cf3 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -47,5 +47,6 @@ extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
 void fpu__xstate_clear_all_cpu_caps(void);
 void *get_xsave_addr(struct xregs_state *xsave, int xstate);
 const void *get_xsave_field_ptr(int xstate_field);
+int using_compacted_format(void);
 
 #endif
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 0fbf60c..d7fdd8c 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -8,6 +8,7 @@
 #include <asm/fpu/internal.h>
 #include <asm/fpu/signal.h>
 #include <asm/fpu/regset.h>
+#include <asm/fpu/xstate.h>
 
 #include <asm/sigframe.h>
 
@@ -167,7 +168,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	if (fpregs_active()) {
+	if (fpregs_active() || using_compacted_format()) {
 		/* Save the live register state to the user directly. */
 		if (copy_fpregs_to_sigframe(buf_fx))
 			return -1;
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 170c164..2b59bd7 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -415,7 +415,7 @@ static int xfeature_size(int xfeature_nr)
  * that it is obvious which aspect of 'XSAVES' is being handled
  * by the calling code.
  */
-static int using_compacted_format(void)
+int using_compacted_format(void)
 {
 	return cpu_has_xsaves;
 }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly
  2016-05-02 22:17                           ` Yu-cheng Yu
@ 2016-05-02 22:37                             ` Dave Hansen
  0 siblings, 0 replies; 61+ messages in thread
From: Dave Hansen @ 2016-05-02 22:37 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 05/02/2016 03:17 PM, Yu-cheng Yu wrote:
> @@ -167,7 +168,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
>  			sizeof(struct user_i387_ia32_struct), NULL,
>  			(struct _fpstate_32 __user *) buf) ? -1 : 1;
>  
> -	if (fpregs_active()) {
> +	if (fpregs_active() || using_compacted_format()) {
>  		/* Save the live register state to the user directly. */
>  		if (copy_fpregs_to_sigframe(buf_fx))
>  			return -1;

So, compared to the first patch, you move the fpregs_active() check out
to the caller of may_copy_fpregs_to_sigframe() (good) and removed a
bunch of comments explaining what was going on (bad).

Do we really want all those comments to die?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-05-02 16:11         ` Yu-cheng Yu
@ 2016-05-04 22:15           ` Dave Hansen
  2016-05-04 22:21             ` Yu-cheng Yu
  0 siblings, 1 reply; 61+ messages in thread
From: Dave Hansen @ 2016-05-04 22:15 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 05/02/2016 09:11 AM, Yu-cheng Yu wrote:
> On Fri, Apr 29, 2016 at 05:40:44PM -0700, Dave Hansen wrote:
>> That's better than what we had before, but it relies entirely on testing
>> coverage and runtime checks.
>>
>> Is it too much to ask that you also take a look and audit all the places
>> the XSAVE buffer is accessed in the kernel and ensure that they either
>> have code to handle standard vs. compacted/supervisor or don't care for
>> some reason?
>>
>> I did such an audit once upon a time, but I think it would be a good
>> exercise to repeat both by a second set of eyes and because some time
>> has passed.
> 
> I think there are 12 files that can be directly impacted by XSAVES. 
> 
> 	arch/x86/include/asm/fpu/internal.h
> 	arch/x86/include/asm/fpu/types.h
> 	arch/x86/include/asm/fpu/xstate.h
> 	arch/x86/include/uapi/asm/sigcontext.h
> 	arch/x86/kernel/traps.c
> 	arch/x86/kernel/fpu/core.c
> 	arch/x86/kernel/fpu/init.c
> 	arch/x86/kernel/fpu/regset.c
> 	arch/x86/kernel/fpu/signal.c
> 	arch/x86/kernel/fpu/xstate.c
> 	arch/x86/kvm/x86.c
> 	arch/x86/mm/mpx.c
> 
> They have been reviewed from the perspective of the compacted format.
> Please let me know anything else.

Can you double-check that nothing has changed in mainline (or tip for
that matter) since you first did these checks?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-05-04 22:15           ` Dave Hansen
@ 2016-05-04 22:21             ` Yu-cheng Yu
  0 siblings, 0 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-04 22:21 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Wed, May 04, 2016 at 03:15:38PM -0700, Dave Hansen wrote:
> On 05/02/2016 09:11 AM, Yu-cheng Yu wrote:
> > On Fri, Apr 29, 2016 at 05:40:44PM -0700, Dave Hansen wrote:
> >> That's better than what we had before, but it relies entirely on testing
> >> coverage and runtime checks.
> >>
> >> Is it too much to ask that you also take a look and audit all the places
> >> the XSAVE buffer is accessed in the kernel and ensure that they either
> >> have code to handle standard vs. compacted/supervisor or don't care for
> >> some reason?
> >>
> >> I did such an audit once upon a time, but I think it would be a good
> >> exercise to repeat both by a second set of eyes and because some time
> >> has passed.
> > 
> > I think there are 12 files that can be directly impacted by XSAVES. 
> > 
> > 	arch/x86/include/asm/fpu/internal.h
> > 	arch/x86/include/asm/fpu/types.h
> > 	arch/x86/include/asm/fpu/xstate.h
> > 	arch/x86/include/uapi/asm/sigcontext.h
> > 	arch/x86/kernel/traps.c
> > 	arch/x86/kernel/fpu/core.c
> > 	arch/x86/kernel/fpu/init.c
> > 	arch/x86/kernel/fpu/regset.c
> > 	arch/x86/kernel/fpu/signal.c
> > 	arch/x86/kernel/fpu/xstate.c
> > 	arch/x86/kvm/x86.c
> > 	arch/x86/mm/mpx.c
> > 
> > They have been reviewed from the perspective of the compacted format.
> > Please let me know anything else.
> 
> Can you double-check that nothing has changed in mainline (or tip for
> that matter) since you first did these checks?

Yes.  I am also doing some tests now.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-03-04 18:12 ` [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES Yu-cheng Yu
  2016-04-29 20:32   ` Dave Hansen
@ 2016-05-04 22:41   ` Dave Hansen
  2016-05-04 22:46     ` Yu-cheng Yu
  2016-05-04 22:46     ` Dave Hansen
  1 sibling, 2 replies; 61+ messages in thread
From: Dave Hansen @ 2016-05-04 22:41 UTC (permalink / raw)
  To: Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar,
	linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

It's my fault, but you also need to go update
	
	fpu__xfeature_set_state()
and
	__raw_xsave_addr()

The theoretical problem is that you might ask for a __raw_xsave_addr()
of a component which has been compacted out of an XSAVES buffer and thus
has no address.  We could work around this by doing a memmove() and
moving the components "up" after the one we are trying to set in order
to make space.

But, since we *always* call XSAVES with an instruction mask of -1 and
end up with a requested feature bitmap (RFBM) equal to XCR0, I think we
can do a shortcut because we'll practically *always* have an
xcomp_bv==RFBM==XCR0, which means that all (present) components will
always have an address.

So, the alternative to doing the memmove() is to add some WARN_ON_FPU()
checks to enforce xcomp_bv==RFBM==XCR0 in places where we call
XSAVES/XRSTORS and __raw_xsave_addr(), maybe more.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-05-04 22:41   ` Dave Hansen
@ 2016-05-04 22:46     ` Yu-cheng Yu
  2016-05-04 22:46     ` Dave Hansen
  1 sibling, 0 replies; 61+ messages in thread
From: Yu-cheng Yu @ 2016-05-04 22:46 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, linux-kernel,
	Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On Wed, May 04, 2016 at 03:41:49PM -0700, Dave Hansen wrote:
> It's my fault, but you also need to go update
> 	
> 	fpu__xfeature_set_state()
> and
> 	__raw_xsave_addr()
> 
> The theoretical problem is that you might ask for a __raw_xsave_addr()
> of a component which has been compacted out of an XSAVES buffer and thus
> has no address.  We could work around this by doing a memmove() and
> moving the components "up" after the one we are trying to set in order
> to make space.
> 
> But, since we *always* call XSAVES with an instruction mask of -1 and
> end up with a requested feature bitmap (RFBM) equal to XCR0, I think we
> can do a shortcut because we'll practically *always* have an
> xcomp_bv==RFBM==XCR0, which means that all (present) components will
> always have an address.
> 
> So, the alternative to doing the memmove() is to add some WARN_ON_FPU()
> checks to enforce xcomp_bv==RFBM==XCR0 in places where we call
> XSAVES/XRSTORS and __raw_xsave_addr(), maybe more.

In the coming version 5 patches, we are going to have one additional 
patch for fixing __fpu_restore_sig() for the compacted format.  I changed
my existing patch a little and run into some problems. Fixing it now.

Our ptrace tests went OK before, but are failing now. It might be relating
to what you are saying? I will check it.
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES
  2016-05-04 22:41   ` Dave Hansen
  2016-05-04 22:46     ` Yu-cheng Yu
@ 2016-05-04 22:46     ` Dave Hansen
  1 sibling, 0 replies; 61+ messages in thread
From: Dave Hansen @ 2016-05-04 22:46 UTC (permalink / raw)
  To: Dave Hansen, Yu-cheng Yu, x86, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, linux-kernel
  Cc: Andy Lutomirski, Borislav Petkov, Sai Praneeth Prakhya,
	Ravi V. Shankar, Fenghua Yu

On 05/04/2016 03:41 PM, Dave Hansen wrote:
> But, since we *always* call XSAVES with an instruction mask of -1 and
> end up with a requested feature bitmap (RFBM) equal to XCR0, I think we
> can do a shortcut because we'll practically *always* have an
> xcomp_bv==RFBM==XCR0, which means that all (present) components will
> always have an address.

Actually, we depend on that anyway.  xstate_comp_offsets[] is completely
bogus if we don't have xcomp_bv==RFBM==XCR0.

Perhaps we need to add those checks anyway.

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2016-05-04 22:50 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-04 18:12 [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Yu-cheng Yu
2016-03-04 18:12 ` [PATCH v4 01/10] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Yu-cheng Yu
2016-03-04 18:12 ` [PATCH v4 02/10] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Yu-cheng Yu
2016-03-04 18:12 ` [PATCH v4 03/10] x86/xsaves: Keep init_fpstate.xsave.header.xfeatures as zero for init optimization Yu-cheng Yu
2016-03-04 18:12 ` [PATCH v4 04/10] x86/xsaves: Introduce a new check that allows correct xstates copy from kernel to user directly Yu-cheng Yu
2016-04-29 20:09   ` Dave Hansen
2016-04-29 22:43     ` Yu-cheng Yu
2016-04-30  0:36       ` Dave Hansen
2016-05-02 15:57         ` Yu-cheng Yu
2016-05-02 16:06           ` Dave Hansen
2016-05-02 16:34             ` Yu-cheng Yu
2016-05-02 16:43               ` Dave Hansen
2016-05-02 17:19                 ` Yu-cheng Yu
2016-05-02 17:33                   ` Dave Hansen
2016-05-02 21:18                     ` Yu-cheng Yu
2016-05-02 21:24                       ` Yu-cheng Yu
2016-05-02 21:32                         ` Dave Hansen
2016-05-02 22:17                           ` Yu-cheng Yu
2016-05-02 22:37                             ` Dave Hansen
2016-03-04 18:12 ` [PATCH v4 05/10] x86/xsaves: Align xstate components according to CPUID Yu-cheng Yu
2016-04-29 20:11   ` Dave Hansen
2016-03-04 18:12 ` [PATCH v4 06/10] x86/xsaves: Supervisor state component offset Yu-cheng Yu
2016-03-04 18:12 ` [PATCH v4 07/10] x86/xsaves: Fix PTRACE frames for XSAVES Yu-cheng Yu
2016-04-29 20:16   ` Dave Hansen
2016-04-29 20:38     ` Borislav Petkov
2016-04-29 20:40       ` Dave Hansen
2016-04-29 20:46         ` Borislav Petkov
2016-04-29 20:25   ` Dave Hansen
2016-04-29 22:30     ` Yu-cheng Yu
2016-04-29 22:36       ` Dave Hansen
2016-04-29 22:38         ` Yu-cheng Yu
2016-03-04 18:12 ` [PATCH v4 08/10] x86/xsaves: Fix XSTATE component offset print out Yu-cheng Yu
2016-04-29 20:26   ` Dave Hansen
2016-03-04 18:12 ` [PATCH v4 09/10] x86/xsaves: Fix xstate_offsets, xstate_sizes for legacy components Yu-cheng Yu
2016-04-29 20:28   ` Dave Hansen
2016-04-29 22:07     ` Yu-cheng Yu
2016-04-29 22:13       ` Dave Hansen
2016-04-29 22:15         ` Yu-cheng Yu
2016-03-04 18:12 ` [PATCH v4 10/10] x86/xsaves: Re-enable XSAVES Yu-cheng Yu
2016-04-29 20:32   ` Dave Hansen
2016-04-29 23:12     ` Yu-cheng Yu
2016-04-30  0:40       ` Dave Hansen
2016-05-02 16:11         ` Yu-cheng Yu
2016-05-04 22:15           ` Dave Hansen
2016-05-04 22:21             ` Yu-cheng Yu
2016-05-04 22:41   ` Dave Hansen
2016-05-04 22:46     ` Yu-cheng Yu
2016-05-04 22:46     ` Dave Hansen
2016-04-29 18:09 ` [PATCH v4 0/10] x86/xsaves: Fix XSAVES known issues Dave Hansen
2016-04-29 19:43   ` Ingo Molnar
2016-04-29 19:57   ` Yu-cheng Yu
2016-04-29 20:03     ` Dave Hansen
2016-04-29 20:07       ` Yu-cheng Yu
2016-04-29 20:25         ` Andy Lutomirski
2016-04-29 20:40           ` Dave Hansen
2016-04-29 20:49             ` Andy Lutomirski
2016-04-29 22:42               ` Dave Hansen
2016-04-30  7:53             ` Ingo Molnar
2016-05-02 16:28               ` Dave Hansen
2016-05-02 18:32                 ` Ingo Molnar
2016-04-29 20:37         ` Dave Hansen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).