linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH Bugfix v2 0/4] x86/xsave/xsaves: Fix a few xsave/xsaves related bugs
@ 2015-04-22  4:51 Fenghua Yu
  2015-04-22  4:51 ` [PATCH Bugfix v2 1/4] x86/xsave.c: Fix xstate offsets and sizes enumeration Fenghua Yu
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Fenghua Yu @ 2015-04-22  4:51 UTC (permalink / raw)
  To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Asit K Mallick,
	Dave Hansen, Glenn Williamson
  Cc: linux-kernel, x86, Fenghua Yu

From: Fenghua Yu <fenghua.yu@intel.com>

This patchset is supposed to fix some xsave/xsaves related issues.

The patch 1/4 fixes an xstate offsets and sizes enumeration issue. During
enumerating offsets and sizes starting from 2 to the last enabled feature,
if one xstate's size is 0, current code thinks there is no other xstate
after this xstate and breaks from enumeration. This is not true because
architecturally it's possible to have a few xstates disabled between
xstate 2 and the last enabled xstate. The offsets and sizes of
the xstates that are not enumerated after the disabled xstate will be
consumed and cause issues in runtime.

The patch 2/4 introduces a new global variable "user_xstate_size". This
variable is used for standard formatted xsave area size in signal frame.
Current code incorrectly uses the smaller compacted formatted xsave area
size for signal frame and will cause issues in xstate access in signal
frame.

The patch 3/4 is not fixing a bug. But it renames "xstate_size" to
"kernel_xstate_size" to explicitly distinguish between xstate size in
kernel space and the one in user space. It just makes kernel code more
clear.

The patch 4/4 claims that the structure of xsave_struct is
non-architectural and fields/xstates in the structure is not defined
in compilation time. No new states should be added in xsave_struct.
The xsave area should be constructed during kernel booting time.

We may hit the issues on either existing platforms or upcoming platforms.
We had better to have the patches in upstream and backport them to stable
kernel and distros.

Fenghua Yu (4):
  x86/xsave.c: Fix xstate offsets and sizes enumeration
  x86/xsaves: Define and use user_xstate_size for xstate size in signal
    context
  x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly
    distinguish xstate size in kernel from user space
  x86/xsave: Don't add new states in xsave_struct

 arch/x86/include/asm/fpu-internal.h |   7 ++-
 arch/x86/include/asm/processor.h    |  23 +++----
 arch/x86/include/asm/xsave.h        |   1 -
 arch/x86/kernel/i387.c              |  18 +++---
 arch/x86/kernel/process.c           |   2 +-
 arch/x86/kernel/xsave.c             | 120 +++++++++++++++++++++++++++++-------
 6 files changed, 118 insertions(+), 53 deletions(-)

-- 
1.8.1.2


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH Bugfix v2 1/4] x86/xsave.c: Fix xstate offsets and sizes enumeration
  2015-04-22  4:51 [PATCH Bugfix v2 0/4] x86/xsave/xsaves: Fix a few xsave/xsaves related bugs Fenghua Yu
@ 2015-04-22  4:51 ` Fenghua Yu
  2015-04-22  4:51 ` [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 18+ messages in thread
From: Fenghua Yu @ 2015-04-22  4:51 UTC (permalink / raw)
  To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Asit K Mallick,
	Dave Hansen, Glenn Williamson
  Cc: linux-kernel, x86, Fenghua Yu

From: Fenghua Yu <fenghua.yu@intel.com>

When enumerating xstate offsets and sizes from cpuid (eax=0x0d, ecx>=2),
it's possible that state m is not implemented while state n (n>m)
is implemented. So enumeration shouldn't stop at state m.

There is no platform configured like above yet. But this could be a problem
in the future. For example, suppose XCR0=0xe7, that means FP, SSE, AVX, and
AVX-512 states are enabled and MPX states (bit 3 and 4) are not enabled.
Then in setup_xstate_features(), after finding BNDREGS size is 0 (i.e. eax
from CPUID xstate subleaf 3, break from the for loop. That stops finding
xstate_offsets and xstate_sizes for AVX-512. Later on incorrect
xstate_offsets and xstate_sizes for AVX-512 will be used in a few places
and will causes issues.

This patch enumerates xstate offsets and sizes for all kernel supported
xstates. If a state is not implemented in hardware or not enabled in XCR0,
its size is set as zero and its offset is read from cpuid.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
 arch/x86/kernel/xsave.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 87a815b..3c0a9d1 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -465,23 +465,18 @@ static inline void xstate_enable(void)
  */
 static void __init setup_xstate_features(void)
 {
-	int eax, ebx, ecx, edx, leaf = 0x2;
+	int eax, ebx, ecx, edx, leaf;
 
 	xstate_features = fls64(pcntxt_mask);
 	xstate_offsets = alloc_bootmem(xstate_features * sizeof(int));
 	xstate_sizes = alloc_bootmem(xstate_features * sizeof(int));
 
-	do {
+	for (leaf = 2; leaf < xstate_features; leaf++) {
 		cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
 
-		if (eax == 0)
-			break;
-
 		xstate_offsets[leaf] = ebx;
 		xstate_sizes[leaf] = eax;
-
-		leaf++;
-	} while (1);
+	}
 }
 
 /*
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-22  4:51 [PATCH Bugfix v2 0/4] x86/xsave/xsaves: Fix a few xsave/xsaves related bugs Fenghua Yu
  2015-04-22  4:51 ` [PATCH Bugfix v2 1/4] x86/xsave.c: Fix xstate offsets and sizes enumeration Fenghua Yu
@ 2015-04-22  4:51 ` Fenghua Yu
  2015-04-22 18:45   ` Dave Hansen
                     ` (3 more replies)
  2015-04-22  4:51 ` [PATCH Bugfix v2 3/4] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Fenghua Yu
  2015-04-22  4:51 ` [PATCH Bugfix v2 4/4] x86/xsave: Don't add new states in xsave_struct Fenghua Yu
  3 siblings, 4 replies; 18+ messages in thread
From: Fenghua Yu @ 2015-04-22  4:51 UTC (permalink / raw)
  To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Asit K Mallick,
	Dave Hansen, Glenn Williamson
  Cc: linux-kernel, x86, Fenghua Yu

From: Fenghua Yu <fenghua.yu@intel.com>

If "xsaves" is enabled, kernel always uses compact format of xsave area.
But user space still uses standard format of xsave area. Thus, xstate size
in kernel's xsave area is smaller than xstate size in user's xsave area.
xstate in signal frame should be in standard format for user's signal
handler to access.

In no "xsaves" case, xsave area in both user space and kernel space are in
standard format. Therefore, user's and kernel's xstate sizes are equal.

In "xsaves" case, xsave area in user space is in standard format while
xsave area in kernel space is in compact format. Therefore, kernel's
xstate size is less than user's xstate size.

So here is the problem: currently kernel uses the kernel's xstate size
for xstate size in signal frame. This is not a problem in no "xsaves" case.
But it is an issue in "xsaves" case because kernel's xstate size is smaller
than user's xstate size. When setting up signal math frame in
alloc_ mathframe(), the fpstate is in standard format; but a smaller size
of fpstate buffer is allocated in signal frame for standard format
xstate. Then kernel saves only part of xstate registers into this smaller
user's fpstate buffer and user will see part of the xstate registers in
signal context. Similar issue happens after returning from signal handler:
kernel will only restore part of xstate registers from user's fpstate
buffer in signal frame.

This patch defines and uses user_xstate_size for xstate size in signal
frame. It's read from returned value in ebx from CPUID leaf 0x0D subleaf
0x0. This is maximum size required by enabled states in XCR0 and may be
different from ecx when states at the end of the xsave area are not
enabled. This value indicates the size required for XSAVE to save all
supported user states in legacy/standard format.

And in order to copy kernel's xsave area in compact format to user xsave
area in standard format, we use copy_to_user_xstate().

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
 arch/x86/include/asm/fpu-internal.h |  3 +-
 arch/x86/include/asm/processor.h    |  1 +
 arch/x86/include/asm/xsave.h        |  1 -
 arch/x86/kernel/xsave.c             | 95 +++++++++++++++++++++++++++++++++----
 4 files changed, 90 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index da5e967..c00c769 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -496,7 +496,8 @@ extern int __restore_xstate_sig(void __user *buf, void __user *fx, int size);
 
 static inline int xstate_sigframe_size(void)
 {
-	return use_xsave() ? xstate_size + FP_XSTATE_MAGIC2_SIZE : xstate_size;
+	return use_xsave() ? user_xstate_size + FP_XSTATE_MAGIC2_SIZE :
+		user_xstate_size;
 }
 
 static inline int restore_xstate_sig(void __user *buf, int ia32_frame)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 23ba676..576ff8c 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -483,6 +483,7 @@ DECLARE_PER_CPU(struct irq_stack *, softirq_stack);
 #endif	/* X86_64 */
 
 extern unsigned int xstate_size;
+extern unsigned int user_xstate_size;
 extern void free_thread_xstate(struct task_struct *);
 extern struct kmem_cache *task_xstate_cachep;
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index c9a6d68..7799d18 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -44,7 +44,6 @@
 #define REX_PREFIX
 #endif
 
-extern unsigned int xstate_size;
 extern u64 pcntxt_mask;
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 extern struct xsave_struct *init_xstate_buf;
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index 3c0a9d1..b8373a0 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -29,6 +29,7 @@ static struct _fpx_sw_bytes fx_sw_reserved, fx_sw_reserved_ia32;
 static unsigned int *xstate_offsets, *xstate_sizes;
 static unsigned int xstate_comp_offsets[sizeof(pcntxt_mask)*8];
 static unsigned int xstate_features;
+unsigned int user_xstate_size;
 
 /*
  * If a processor implementation discern that a processor state component is
@@ -85,7 +86,7 @@ void __sanitize_i387_state(struct task_struct *tsk)
 	 */
 	while (xstate_bv) {
 		if (xstate_bv & 0x1) {
-			int offset = xstate_offsets[feature_bit];
+			int offset = xstate_comp_offsets[feature_bit];
 			int size = xstate_sizes[feature_bit];
 
 			memcpy(((void *) fx) + offset,
@@ -116,7 +117,7 @@ static inline int check_for_xstate(struct i387_fxsave_struct __user *buf,
 	/* Check for the first magic field and other error scenarios. */
 	if (fx_sw->magic1 != FP_XSTATE_MAGIC1 ||
 	    fx_sw->xstate_size < min_xstate_size ||
-	    fx_sw->xstate_size > xstate_size ||
+	    fx_sw->xstate_size > user_xstate_size ||
 	    fx_sw->xstate_size > fx_sw->extended_size)
 		return -1;
 
@@ -173,7 +174,8 @@ static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
 	if (!use_xsave())
 		return err;
 
-	err |= __put_user(FP_XSTATE_MAGIC2, (__u32 *)(buf + xstate_size));
+	err |= __put_user(FP_XSTATE_MAGIC2,
+			  (__u32 *)(buf + user_xstate_size));
 
 	/*
 	 * Read the xstate_bv which we copied (directly from the cpu or
@@ -210,12 +212,85 @@ static inline int save_user_xstate(struct xsave_struct __user *buf)
 	else
 		err = fsave_user((struct i387_fsave_struct __user *) buf);
 
-	if (unlikely(err) && __clear_user(buf, xstate_size))
+	if (unlikely(err) && __clear_user(buf, user_xstate_size))
 		err = -EFAULT;
 	return err;
 }
 
 /*
+ * Copy xsave area from kernel to user.
+ */
+static int copy_to_user_xstate(void __user *buf_fx, struct xsave_struct *xsave)
+{
+	int i;
+	u64 xcomp_bv;
+	struct xsave_struct __user *user_xsave;
+
+	/*
+	 * If kernel's xsave area is in standard format, copy the xsave
+	 * area directly from kernel to user buffer because both of the
+	 * xsave areas are in the same format.
+	 */
+	if (!cpu_has_xsaves) {
+		if (__copy_to_user(buf_fx, xsave, user_xstate_size))
+			return -1;
+	}
+
+	/*
+	 * If kernel's xsave area is in compact format, copy xstates one
+	 * by one from kernel to user buffer because user's xsave area
+	 * is in standard format which is different from kernel.
+	 *
+	 * This is inefficient. It makes a bunch of smaller
+	 * copy_to_user() calls when it might be possible to make fewer
+	 * large ones. But, this code does not get called on "eagerfpu"
+	 * (user_has_fpu() always true) systems which also happen to be
+	 * the ones with lots of xsave state.
+	 *
+	 * First, clear buf_fx.
+	 */
+	if (__clear_user(buf_fx, user_xstate_size))
+		return -1;
+
+	/*
+	 * copy FP, SSE, and header to user.
+	 */
+	if (__copy_to_user(buf_fx, xsave, FXSAVE_SIZE + XSAVE_HDR_SIZE))
+		return -1;
+
+	/*
+	 * Clear xcomp_bv[63] in user's xsave area header to indicate
+	 * buf_fx is in standard format.
+	 */
+	xcomp_bv = xsave->xsave_hdr.xcomp_bv;
+	user_xsave = buf_fx;
+	xcomp_bv &= ~((u64)1 << 63);
+
+	if (__put_user(xcomp_bv, &user_xsave->xsave_hdr.xcomp_bv))
+		return -1;
+
+	/*
+	 * Copy rest of xstates in compact format to user.
+	 */
+	for (i = 2; i < xstate_features; i++) {
+		if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
+			int user_offset, kernel_offset;
+			int size;
+
+			user_offset = xstate_offsets[i];
+			kernel_offset = xstate_comp_offsets[i];
+			size = xstate_sizes[i];
+
+			if (__copy_to_user(buf_fx + user_offset,
+				     xsave + kernel_offset, size))
+				return -1;
+		}
+	}
+
+	return 0;
+}
+
+/*
  * Save the fpu, extended register state to the user signal frame.
  *
  * 'buf_fx' is the 64-byte aligned pointer at which the [f|fx|x]save
@@ -261,7 +336,7 @@ int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 			fpu_fxsave(&tsk->thread.fpu);
 	} else {
 		sanitize_i387_state(tsk);
-		if (__copy_to_user(buf_fx, xsave, xstate_size))
+		if (copy_to_user_xstate(buf_fx, xsave))
 			return -1;
 	}
 
@@ -434,7 +509,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 static void prepare_fx_sw_frame(void)
 {
 	int fsave_header_size = sizeof(struct i387_fsave_struct);
-	int size = xstate_size + FP_XSTATE_MAGIC2_SIZE;
+	int size = user_xstate_size + FP_XSTATE_MAGIC2_SIZE;
 
 	if (config_enabled(CONFIG_X86_32))
 		size += fsave_header_size;
@@ -442,7 +517,7 @@ static void prepare_fx_sw_frame(void)
 	fx_sw_reserved.magic1 = FP_XSTATE_MAGIC1;
 	fx_sw_reserved.extended_size = size;
 	fx_sw_reserved.xstate_bv = pcntxt_mask;
-	fx_sw_reserved.xstate_size = xstate_size;
+	fx_sw_reserved.xstate_size = user_xstate_size;
 
 	if (config_enabled(CONFIG_IA32_EMULATION)) {
 		fx_sw_reserved_ia32 = fx_sw_reserved;
@@ -576,14 +651,18 @@ __setup("eagerfpu=", eager_fpu_setup);
 
 /*
  * Calculate total size of enabled xstates in XCR0/pcntxt_mask.
+ *
+ * XCR0/pcntxt_mask should not be changed after this.
  */
 static void __init init_xstate_size(void)
 {
 	unsigned int eax, ebx, ecx, edx;
 	int i;
 
+	cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
+	user_xstate_size = ebx;
+
 	if (!cpu_has_xsaves) {
-		cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
 		xstate_size = ebx;
 		return;
 	}
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH Bugfix v2 3/4] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space
  2015-04-22  4:51 [PATCH Bugfix v2 0/4] x86/xsave/xsaves: Fix a few xsave/xsaves related bugs Fenghua Yu
  2015-04-22  4:51 ` [PATCH Bugfix v2 1/4] x86/xsave.c: Fix xstate offsets and sizes enumeration Fenghua Yu
  2015-04-22  4:51 ` [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
@ 2015-04-22  4:51 ` Fenghua Yu
  2015-04-22  4:51 ` [PATCH Bugfix v2 4/4] x86/xsave: Don't add new states in xsave_struct Fenghua Yu
  3 siblings, 0 replies; 18+ messages in thread
From: Fenghua Yu @ 2015-04-22  4:51 UTC (permalink / raw)
  To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Asit K Mallick,
	Dave Hansen, Glenn Williamson
  Cc: linux-kernel, x86, Fenghua Yu

From: Fenghua Yu <fenghua.yu@intel.com>

User space uses standard format xsave area. fpstate in signal frame should
have standard format size.

To explicitly distinguish between xstate size in kernel space and the one
in user space, we rename xstate_size to kernel_xstate_size. This patch is
not fixing a bug. It just makes kernel code more clear.

So we define the xsave area sizes in two global variables:

kernel_xstate_size (previous xstate_size): the xsave area size used in
xsave area allocated in kernel
user_xstate_size: the xsave area size used in xsave area used by user.

In no "xsaves" case, xsave area in both user space and kernel space are in
standard format. Therefore, kernel_xstate_size and user_xstate_size are
equal.

In "xsaves" case, xsave area in user space is in standard format while
xsave area in kernel space is in compact format. Therefore, kernel's
xstate size is less than user's xstate size.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
 arch/x86/include/asm/fpu-internal.h |  4 ++--
 arch/x86/include/asm/processor.h    |  2 +-
 arch/x86/kernel/i387.c              | 18 +++++++++---------
 arch/x86/kernel/process.c           |  2 +-
 arch/x86/kernel/xsave.c             | 14 +++++++-------
 5 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index c00c769..5d9ba0c 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -597,14 +597,14 @@ static inline void fpu_free(struct fpu *fpu)
 static inline void fpu_copy(struct task_struct *dst, struct task_struct *src)
 {
 	if (use_eager_fpu()) {
-		memset(&dst->thread.fpu.state->xsave, 0, xstate_size);
+		memset(&dst->thread.fpu.state->xsave, 0, kernel_xstate_size);
 		__save_fpu(dst);
 	} else {
 		struct fpu *dfpu = &dst->thread.fpu;
 		struct fpu *sfpu = &src->thread.fpu;
 
 		unlazy_fpu(src);
-		memcpy(dfpu->state, sfpu->state, xstate_size);
+		memcpy(dfpu->state, sfpu->state, kernel_xstate_size);
 	}
 }
 
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 576ff8c..f26051b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -482,7 +482,7 @@ DECLARE_PER_CPU(struct irq_stack *, hardirq_stack);
 DECLARE_PER_CPU(struct irq_stack *, softirq_stack);
 #endif	/* X86_64 */
 
-extern unsigned int xstate_size;
+extern unsigned int kernel_xstate_size;
 extern unsigned int user_xstate_size;
 extern void free_thread_xstate(struct task_struct *);
 extern struct kmem_cache *task_xstate_cachep;
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index 00918327..47759a5 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -133,8 +133,8 @@ void unlazy_fpu(struct task_struct *tsk)
 EXPORT_SYMBOL(unlazy_fpu);
 
 unsigned int mxcsr_feature_mask __read_mostly = 0xffffffffu;
-unsigned int xstate_size;
-EXPORT_SYMBOL_GPL(xstate_size);
+unsigned int kernel_xstate_size;
+EXPORT_SYMBOL_GPL(kernel_xstate_size);
 static struct i387_fxsave_struct fx_scratch;
 
 static void mxcsr_feature_mask_init(void)
@@ -154,7 +154,7 @@ static void mxcsr_feature_mask_init(void)
 static void init_thread_xstate(void)
 {
 	/*
-	 * Note that xstate_size might be overwriten later during
+	 * Note that kernel_xstate_size might be overwriten later during
 	 * xsave_init().
 	 */
 
@@ -165,14 +165,14 @@ static void init_thread_xstate(void)
 		 */
 		setup_clear_cpu_cap(X86_FEATURE_XSAVE);
 		setup_clear_cpu_cap(X86_FEATURE_XSAVEOPT);
-		xstate_size = sizeof(struct i387_soft_struct);
+		kernel_xstate_size = sizeof(struct i387_soft_struct);
 		return;
 	}
 
 	if (cpu_has_fxsr)
-		xstate_size = sizeof(struct i387_fxsave_struct);
+		kernel_xstate_size = sizeof(struct i387_fxsave_struct);
 	else
-		xstate_size = sizeof(struct i387_fsave_struct);
+		kernel_xstate_size = sizeof(struct i387_fsave_struct);
 }
 
 /*
@@ -208,9 +208,9 @@ void fpu_init(void)
 
 	/*
 	 * init_thread_xstate is only called once to avoid overriding
-	 * xstate_size during boot time or during CPU hotplug.
+	 * kernel_xstate_size during boot time or during CPU hotplug.
 	 */
-	if (xstate_size == 0)
+	if (kernel_xstate_size == 0)
 		init_thread_xstate();
 
 	mxcsr_feature_mask_init();
@@ -225,7 +225,7 @@ void fpu_finit(struct fpu *fpu)
 		return;
 	}
 
-	memset(fpu->state, 0, xstate_size);
+	memset(fpu->state, 0, kernel_xstate_size);
 
 	if (cpu_has_fxsr) {
 		fx_finit(&fpu->state->fxsave);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 8213da6..ded2c82 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -113,7 +113,7 @@ void arch_release_task_struct(struct task_struct *tsk)
 void arch_task_cache_init(void)
 {
         task_xstate_cachep =
-        	kmem_cache_create("task_xstate", xstate_size,
+		kmem_cache_create("task_xstate", kernel_xstate_size,
 				  __alignof__(union thread_xstate),
 				  SLAB_PANIC | SLAB_NOTRACK, NULL);
 	setup_xstate_comp();
diff --git a/arch/x86/kernel/xsave.c b/arch/x86/kernel/xsave.c
index b8373a0..98c236f 100644
--- a/arch/x86/kernel/xsave.c
+++ b/arch/x86/kernel/xsave.c
@@ -409,7 +409,7 @@ int __restore_xstate_sig(void __user *buf, void __user *buf_fx, int size)
 {
 	int ia32_fxstate = (buf != buf_fx);
 	struct task_struct *tsk = current;
-	int state_size = xstate_size;
+	int state_size = kernel_xstate_size;
 	u64 xstate_bv = 0;
 	int fx_only = 0;
 
@@ -609,7 +609,7 @@ static void __init setup_init_fpu_buf(void)
 	 * Setup init_xstate_buf to represent the init state of
 	 * all the features managed by the xsave
 	 */
-	init_xstate_buf = alloc_bootmem_align(xstate_size,
+	init_xstate_buf = alloc_bootmem_align(kernel_xstate_size,
 					      __alignof__(struct xsave_struct));
 	fx_finit(&init_xstate_buf->i387);
 
@@ -663,15 +663,15 @@ static void __init init_xstate_size(void)
 	user_xstate_size = ebx;
 
 	if (!cpu_has_xsaves) {
-		xstate_size = ebx;
+		kernel_xstate_size = ebx;
 		return;
 	}
 
-	xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+	kernel_xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
 	for (i = 2; i < 64; i++) {
 		if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
 			cpuid_count(XSTATE_CPUID, i, &eax, &ebx, &ecx, &edx);
-			xstate_size += eax;
+			kernel_xstate_size += eax;
 		}
 	}
 }
@@ -709,7 +709,7 @@ static void __init xstate_enable_boot_cpu(void)
 	 */
 	init_xstate_size();
 
-	update_regset_xstate_info(xstate_size, pcntxt_mask);
+	update_regset_xstate_info(kernel_xstate_size, pcntxt_mask);
 	prepare_fx_sw_frame();
 	setup_init_fpu_buf();
 
@@ -728,7 +728,7 @@ static void __init xstate_enable_boot_cpu(void)
 	}
 
 	pr_info("enabled xstate_bv 0x%llx, cntxt size 0x%x using %s\n",
-		pcntxt_mask, xstate_size,
+		pcntxt_mask, kernel_xstate_size,
 		cpu_has_xsaves ? "compacted form" : "standard form");
 }
 
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH Bugfix v2 4/4] x86/xsave: Don't add new states in xsave_struct
  2015-04-22  4:51 [PATCH Bugfix v2 0/4] x86/xsave/xsaves: Fix a few xsave/xsaves related bugs Fenghua Yu
                   ` (2 preceding siblings ...)
  2015-04-22  4:51 ` [PATCH Bugfix v2 3/4] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Fenghua Yu
@ 2015-04-22  4:51 ` Fenghua Yu
  3 siblings, 0 replies; 18+ messages in thread
From: Fenghua Yu @ 2015-04-22  4:51 UTC (permalink / raw)
  To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner, Asit K Mallick,
	Dave Hansen, Glenn Williamson
  Cc: linux-kernel, x86, Fenghua Yu

From: Fenghua Yu <fenghua.yu@intel.com>

The structure of xsave_struct is non-architectural. Some xstates could be
disabled and leave some holes in the xsave area. In compact format,
offsets of xstates in the xsave area are decided during booting time.

So the fields in xsave_struct are not static and fixed during compilation
time. The offsets and sizes of the fields in the structure should be
detected from cpuid during runtime.

Therefore, we don't add new states in xsave_struct except legacy fpu/sse
states and header fields whose offsets and sizes are defined
architecturally.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
---
 arch/x86/include/asm/processor.h | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f26051b..163defc 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -386,16 +386,6 @@ struct i387_soft_struct {
 	u32			entry_eip;
 };
 
-struct ymmh_struct {
-	/* 16 * 16 bytes for each YMMH-reg = 256 bytes */
-	u32 ymmh_space[64];
-};
-
-/* We don't support LWP yet: */
-struct lwp_struct {
-	u8 reserved[128];
-};
-
 struct bndreg {
 	u64 lower_bound;
 	u64 upper_bound;
@@ -415,11 +405,11 @@ struct xsave_hdr_struct {
 struct xsave_struct {
 	struct i387_fxsave_struct i387;
 	struct xsave_hdr_struct xsave_hdr;
-	struct ymmh_struct ymmh;
-	struct lwp_struct lwp;
-	struct bndreg bndreg[4];
-	struct bndcsr bndcsr;
-	/* new processor state extensions will go here */
+	/*
+	 * Please don't add more states here. They are non-architectural.
+	 * Offset and size of each state should be calculated during boot time.
+	 * So adding states here is meanless.
+	 */
 } __attribute__ ((packed, aligned (64)));
 
 union thread_xstate {
-- 
1.8.1.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-22  4:51 ` [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
@ 2015-04-22 18:45   ` Dave Hansen
  2015-04-22 19:05     ` Yu, Fenghua
  2015-04-28 14:28   ` Dave Hansen
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Dave Hansen @ 2015-04-22 18:45 UTC (permalink / raw)
  To: Fenghua Yu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Asit K Mallick, Glenn Williamson
  Cc: linux-kernel, x86

On 04/21/2015 09:51 PM, Fenghua Yu wrote:
> +	/*
> +	 * Copy rest of xstates in compact format to user.
> +	 */
> +	for (i = 2; i < xstate_features; i++) {
> +		if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
> +			int user_offset, kernel_offset;
> +			int size;
> +
> +			user_offset = xstate_offsets[i];
> +			kernel_offset = xstate_comp_offsets[i];
> +			size = xstate_sizes[i];
> +
> +			if (__copy_to_user(buf_fx + user_offset,
> +				     xsave + kernel_offset, size))
> +				return -1;
> +		}
> +	}

Is this checking the right bitmap?

The 'xsaves' documentation says: "If RFBM[i] = 1, XSTATE_BV[i] is set to
the value of XINUSE[i]".  Where "XINUSE denotes the state-component
bitmap corresponding to the init optimization".

So shouldn't this be checking xsave->xsave_hdr.xstate_bv instead of
pcntxt_mask?  The will be equal unless the "init optimization" is in play.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-22 18:45   ` Dave Hansen
@ 2015-04-22 19:05     ` Yu, Fenghua
  2015-04-22 19:34       ` Dave Hansen
  0 siblings, 1 reply; 18+ messages in thread
From: Yu, Fenghua @ 2015-04-22 19:05 UTC (permalink / raw)
  To: Hansen, Dave, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

> From: Hansen, Dave
> Sent: Wednesday, April 22, 2015 11:45 AM
> To: Yu, Fenghua; H. Peter Anvin; Ingo Molnar; Thomas Gleixner; Mallick, Asit
> K; Williamson, Glenn P
> Cc: linux-kernel; x86
> Subject: Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use
> user_xstate_size for xstate size in signal context
> 
> On 04/21/2015 09:51 PM, Fenghua Yu wrote:
> > +	/*
> > +	 * Copy rest of xstates in compact format to user.
> > +	 */
> > +	for (i = 2; i < xstate_features; i++) {
> > +		if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
> > +			int user_offset, kernel_offset;
> > +			int size;
> > +
> > +			user_offset = xstate_offsets[i];
> > +			kernel_offset = xstate_comp_offsets[i];
> > +			size = xstate_sizes[i];
> > +
> > +			if (__copy_to_user(buf_fx + user_offset,
> > +				     xsave + kernel_offset, size))
> > +				return -1;
> > +		}
> > +	}
> 
> Is this checking the right bitmap?
> 
> The 'xsaves' documentation says: "If RFBM[i] = 1, XSTATE_BV[i] is set to the
> value of XINUSE[i]".  Where "XINUSE denotes the state-component bitmap
> corresponding to the init optimization".
> 
> So shouldn't this be checking xsave->xsave_hdr.xstate_bv instead of
> pcntxt_mask?  The will be equal unless the "init optimization" is in play.

Xsave->xsave_hdr.xstate_bv is equal to pcntxt_mask (see setup_init_fpu_buf()).

And here XINUSE and RFBM are irrelevant because they are used in init or modified
optimization during xsave*/xrstore*. We only copy out the xstates from kernel
to user and XINUSE and RFBM are not in scope here.

So I would think pcntxt_mask is good to be used here.

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-22 19:05     ` Yu, Fenghua
@ 2015-04-22 19:34       ` Dave Hansen
  2015-04-23  0:06         ` Yu, Fenghua
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Hansen @ 2015-04-22 19:34 UTC (permalink / raw)
  To: Yu, Fenghua, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

On 04/22/2015 12:05 PM, Yu, Fenghua wrote:
>> From: Hansen, Dave
>> Sent: Wednesday, April 22, 2015 11:45 AM
>> To: Yu, Fenghua; H. Peter Anvin; Ingo Molnar; Thomas Gleixner; Mallick, Asit
>> K; Williamson, Glenn P
>> Cc: linux-kernel; x86
>> Subject: Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use
>> user_xstate_size for xstate size in signal context
>>
>> On 04/21/2015 09:51 PM, Fenghua Yu wrote:
>>> +	/*
>>> +	 * Copy rest of xstates in compact format to user.
>>> +	 */
>>> +	for (i = 2; i < xstate_features; i++) {
>>> +		if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
>>> +			int user_offset, kernel_offset;
>>> +			int size;
>>> +
>>> +			user_offset = xstate_offsets[i];
>>> +			kernel_offset = xstate_comp_offsets[i];
>>> +			size = xstate_sizes[i];
>>> +
>>> +			if (__copy_to_user(buf_fx + user_offset,
>>> +				     xsave + kernel_offset, size))
>>> +				return -1;
>>> +		}
>>> +	}
>>
>> Is this checking the right bitmap?
>>
>> The 'xsaves' documentation says: "If RFBM[i] = 1, XSTATE_BV[i] is set to the
>> value of XINUSE[i]".  Where "XINUSE denotes the state-component bitmap
>> corresponding to the init optimization".
>>
>> So shouldn't this be checking xsave->xsave_hdr.xstate_bv instead of
>> pcntxt_mask?  The will be equal unless the "init optimization" is in play.
> 
> Xsave->xsave_hdr.xstate_bv is equal to pcntxt_mask (see setup_init_fpu_buf()).

No.  I don't think it works this way.  xsaves *WRITES* to the
XSTATE_BV[] field in addition to reading it.  According to the SDM:
"execution of XSAVES writes the XSTATE_BV field of the XSAVE header (see
Section 13.4.2), setting XSTATE_BV[i] (0 ≤ i ≤ 63) as follows...
Execution of XSAVES performs the init optimization to reduce the amount
of data written to memory. If XINUSE[i] = 0, state component i is not
saved to the XSAVE area (even if RFBM[i] = 1)".

> And here XINUSE and RFBM are irrelevant because they are used in init or modified
> optimization during xsave*/xrstore*. We only copy out the xstates from kernel
> to user and XINUSE and RFBM are not in scope here.

I think they are in use.  The kernel created its buffer with 'xsaves' so
when it is reading the buffer it needs to take those optimizations in to
consideration.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-22 19:34       ` Dave Hansen
@ 2015-04-23  0:06         ` Yu, Fenghua
  2015-04-23  0:21           ` Dave Hansen
  2015-04-23  0:34           ` Dave Hansen
  0 siblings, 2 replies; 18+ messages in thread
From: Yu, Fenghua @ 2015-04-23  0:06 UTC (permalink / raw)
  To: Hansen, Dave, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 3646 bytes --]

> From: Hansen, Dave
> Sent: Wednesday, April 22, 2015 12:34 PM
> To: Yu, Fenghua; H. Peter Anvin; Ingo Molnar; Thomas Gleixner; Mallick, Asit
> K; Williamson, Glenn P
> Cc: linux-kernel; x86
> Subject: Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use
> user_xstate_size for xstate size in signal context
> 
> On 04/22/2015 12:05 PM, Yu, Fenghua wrote:
> >> From: Hansen, Dave
> >> Sent: Wednesday, April 22, 2015 11:45 AM
> >> To: Yu, Fenghua; H. Peter Anvin; Ingo Molnar; Thomas Gleixner;
> >> Mallick, Asit K; Williamson, Glenn P
> >> Cc: linux-kernel; x86
> >> Subject: Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use
> >> user_xstate_size for xstate size in signal context
> >>
> >> On 04/21/2015 09:51 PM, Fenghua Yu wrote:
> >>> +	/*
> >>> +	 * Copy rest of xstates in compact format to user.
> >>> +	 */
> >>> +	for (i = 2; i < xstate_features; i++) {
> >>> +		if (test_bit(i, (unsigned long *)&pcntxt_mask)) {
> >>> +			int user_offset, kernel_offset;
> >>> +			int size;
> >>> +
> >>> +			user_offset = xstate_offsets[i];
> >>> +			kernel_offset = xstate_comp_offsets[i];
> >>> +			size = xstate_sizes[i];
> >>> +
> >>> +			if (__copy_to_user(buf_fx + user_offset,
> >>> +				     xsave + kernel_offset, size))
> >>> +				return -1;
> >>> +		}
> >>> +	}
> >>
> >> Is this checking the right bitmap?
> >>
> >> The 'xsaves' documentation says: "If RFBM[i] = 1, XSTATE_BV[i] is set
> >> to the value of XINUSE[i]".  Where "XINUSE denotes the
> >> state-component bitmap corresponding to the init optimization".
> >>
> >> So shouldn't this be checking xsave->xsave_hdr.xstate_bv instead of
> >> pcntxt_mask?  The will be equal unless the "init optimization" is in play.
> >
> > Xsave->xsave_hdr.xstate_bv is equal to pcntxt_mask (see
> setup_init_fpu_buf()).
> 
> No.  I don't think it works this way.  xsaves *WRITES* to the XSTATE_BV[]
> field in addition to reading it.  According to the SDM:
> "execution of XSAVES writes the XSTATE_BV field of the XSAVE header (see
> Section 13.4.2), setting XSTATE_BV[i] (0 ≤ i ≤ 63) as follows...
> Execution of XSAVES performs the init optimization to reduce the amount of
> data written to memory. If XINUSE[i] = 0, state component i is not saved to
> the XSAVE area (even if RFBM[i] = 1)".
> 
> > And here XINUSE and RFBM are irrelevant because they are used in init
> > or modified optimization during xsave*/xrstore*. We only copy out the
> > xstates from kernel to user and XINUSE and RFBM are not in scope here.
> 
> I think they are in use.  The kernel created its buffer with 'xsaves' so when it
> is reading the buffer it needs to take those optimizations in to consideration.
> 

We need to copy ALL of supported xstates to user. Using xstate_bv only copies
partial xstates that are in non-init status.

Xstate_bv only has xstates that are in non-init status. The xstates that are in
init status will not be copied to user if using xstate_bv. In this case, the xstates
that are in init status will be all zeros in the user buffer (remember we clear
the user buffer before). That's incorrect.

If pcntxt_mask is used to copy xstates, all of supported xstates either in init or
non-init status will be copied correctly to the user buffer. All of other parts in
the user buffer will be zeros.

So there should be a problem to use pcntxt_mask here.

But I agree the init_xstate_buf was not initialized correctly earlier in the boot
time. But that's a separate issue to be fixed in a different patch.

Thanks.

-Fenghua
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-23  0:06         ` Yu, Fenghua
@ 2015-04-23  0:21           ` Dave Hansen
  2015-04-23  0:23             ` Yu, Fenghua
  2015-04-23  0:34           ` Dave Hansen
  1 sibling, 1 reply; 18+ messages in thread
From: Dave Hansen @ 2015-04-23  0:21 UTC (permalink / raw)
  To: Yu, Fenghua, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

On 04/22/2015 05:06 PM, Yu, Fenghua wrote:
> Xsave->xsave_hdr.xstate_bv is equal to pcntxt_mask (see
> setup_init_fpu_buf()).

This is going to be a long thread.  But, let's take it a step at a time.

Do you agree that this earlier statement is incorrect?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-23  0:21           ` Dave Hansen
@ 2015-04-23  0:23             ` Yu, Fenghua
  0 siblings, 0 replies; 18+ messages in thread
From: Yu, Fenghua @ 2015-04-23  0:23 UTC (permalink / raw)
  To: Hansen, Dave, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 814 bytes --]

> From: Hansen, Dave
> Sent: Wednesday, April 22, 2015 5:21 PM
> To: Yu, Fenghua; H. Peter Anvin; Ingo Molnar; Thomas Gleixner; Mallick, Asit
> K; Williamson, Glenn P
> Cc: linux-kernel; x86
> Subject: Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use
> user_xstate_size for xstate size in signal context
> 
> On 04/22/2015 05:06 PM, Yu, Fenghua wrote:
> > Xsave->xsave_hdr.xstate_bv is equal to pcntxt_mask (see
> > setup_init_fpu_buf()).
> 
> This is going to be a long thread.  But, let's take it a step at a time.
> 
> Do you agree that this earlier statement is incorrect?

The statement is incorrect. But the code is correct.

Thanks.

-Fenghua
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-23  0:06         ` Yu, Fenghua
  2015-04-23  0:21           ` Dave Hansen
@ 2015-04-23  0:34           ` Dave Hansen
  2015-04-23 17:09             ` Yu, Fenghua
  1 sibling, 1 reply; 18+ messages in thread
From: Dave Hansen @ 2015-04-23  0:34 UTC (permalink / raw)
  To: Yu, Fenghua, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

On 04/22/2015 05:06 PM, Yu, Fenghua wrote:
> We need to copy ALL of supported xstates to user. Using xstate_bv only copies
> partial xstates that are in non-init status.
> 
> Xstate_bv only has xstates that are in non-init status. The xstates that are in
> init status will not be copied to user if using xstate_bv. In this case, the xstates
> that are in init status will be all zeros in the user buffer (remember we clear
> the user buffer before). That's incorrect.

Do we *need* to fill every field in the user buffer?  I don't think
that's a given.  The xstate_bv that we copy out to userspace would
clearly spell out which fields we've populated.  Leaving the rest zeros
might be OK.

> So there should be a problem to use pcntxt_mask here.

I'm not convinced.  Let's step through this.

1. kernel does an xsaves
2. Due to the 'init optimization' xstate_bv != pcntxt_mask, and some
   fields of kernel xsave buffer are untouched
3. Kernel looks at pcntxt_mask, which means it copies untouched xsave
   buffer fields out to userspace

If those untouched fields of the xsave buffer are in the "init state",
then we've got no problem.  But, are those "untouched since (1)" fields
in the kernel xsave buf *GUARANTEED* to be in the init state?

What stops this sequence of events from happening?

1. task sets xsave fields to non-init state
2. kernel 'xsaves' xsave fields to fpu.state->xsave
3. userspace gets all xsave fields to init state again
4. kernel calls 'xsaves' again, but 'xsaves' declines to  fill the
   fpu.state->xsave fields because of the init optimization, leaves some
   xstate_bv bits = 0.
5. Signal handling code looks at pcntxt_mask, copies stale
   non-init-state fields from (2) out to userspace
6. Userspace sees wrong fields.  Boom.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-23  0:34           ` Dave Hansen
@ 2015-04-23 17:09             ` Yu, Fenghua
  2015-04-23 21:32               ` Dave Hansen
  0 siblings, 1 reply; 18+ messages in thread
From: Yu, Fenghua @ 2015-04-23 17:09 UTC (permalink / raw)
  To: Hansen, Dave, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1510 bytes --]

> From: Hansen, Dave
> Sent: Wednesday, April 22, 2015 5:35 PM
> On 04/22/2015 05:06 PM, Yu, Fenghua wrote:
> > We need to copy ALL of supported xstates to user. Using xstate_bv only
> > copies partial xstates that are in non-init status.
> >
> > Xstate_bv only has xstates that are in non-init status. The xstates
> > that are in init status will not be copied to user if using xstate_bv.
> > In this case, the xstates that are in init status will be all zeros in
> > the user buffer (remember we clear the user buffer before). That's
> incorrect.
> I'm not convinced.  Let's step through this.
> 
> 1. kernel does an xsaves
> 2. Due to the 'init optimization' xstate_bv != pcntxt_mask, and some
>    fields of kernel xsave buffer are untouched 3. Kernel looks at pcntxt_mask,
> which means it copies untouched xsave
>    buffer fields out to userspace
> 
> If those untouched fields of the xsave buffer are in the "init state", then
> we've got no problem.  But, are those "untouched since (1)" fields in the
> kernel xsave buf *GUARANTEED* to be in the init state?

In fact, those untouched fields in kernel xsave buf "IS GUARANTEED" to be in
the init state.

Please check __sanitize_i387_state() called just before copy_to_user_xstate().
That functions GUARANTEES the untouched fields in kernel xsave buf to be in
init state.

Thanks.

-Fenghua

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-23 17:09             ` Yu, Fenghua
@ 2015-04-23 21:32               ` Dave Hansen
  0 siblings, 0 replies; 18+ messages in thread
From: Dave Hansen @ 2015-04-23 21:32 UTC (permalink / raw)
  To: Yu, Fenghua, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

On 04/23/2015 10:09 AM, Yu, Fenghua wrote:
>> > If those untouched fields of the xsave buffer are in the "init state", then
>> > we've got no problem.  But, are those "untouched since (1)" fields in the
>> > kernel xsave buf *GUARANTEED* to be in the init state?
> In fact, those untouched fields in kernel xsave buf "IS GUARANTEED" to be in
> the init state.
> 
> Please check __sanitize_i387_state() called just before copy_to_user_xstate().
> That functions GUARANTEES the untouched fields in kernel xsave buf to be in
> init state.

OK, that makes sense.  Other than the horribly-named
sanitize_i387_state(). :)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-22  4:51 ` [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
  2015-04-22 18:45   ` Dave Hansen
@ 2015-04-28 14:28   ` Dave Hansen
  2015-04-28 22:09   ` Dave Hansen
  2015-04-29 13:53   ` Dave Hansen
  3 siblings, 0 replies; 18+ messages in thread
From: Dave Hansen @ 2015-04-28 14:28 UTC (permalink / raw)
  To: Fenghua Yu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Asit K Mallick, Glenn Williamson
  Cc: linux-kernel, x86, Fengguang Wu

On 04/21/2015 09:51 PM, Fenghua Yu wrote:
> +static int copy_to_user_xstate(void __user *buf_fx, struct xsave_struct *xsave)
> +{
...
> +			if (__copy_to_user(buf_fx + user_offset,
> +				     xsave + kernel_offset, size))
> +				return -1;
> +		}
> +	}
> +

0day found this, but I'm replying to the original thread.

> [   13.595511] BUG: KASan: out of bounds access in save_xstate_sig+0x1bf/0x470 at addr ffff880000184208
> [   13.596280] Read of size 8 by task init/1

copy_to_user_xstate() gets inlined in to save_xstate_sig().  The read
probably means that this was an access to 'xsave' since we are reading a
kernel structure out to userspace.

'xsave' is a 'struct xsave_struct *' and yet it's being incremented like
'buf_fx' which is a 'void *'.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-22  4:51 ` [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
  2015-04-22 18:45   ` Dave Hansen
  2015-04-28 14:28   ` Dave Hansen
@ 2015-04-28 22:09   ` Dave Hansen
  2015-04-28 22:11     ` Yu, Fenghua
  2015-04-29 13:53   ` Dave Hansen
  3 siblings, 1 reply; 18+ messages in thread
From: Dave Hansen @ 2015-04-28 22:09 UTC (permalink / raw)
  To: Fenghua Yu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Asit K Mallick, Glenn Williamson
  Cc: linux-kernel, x86

On 04/21/2015 09:51 PM, Fenghua Yu wrote:
> +	/*
> +	 * Clear xcomp_bv[63] in user's xsave area header to indicate
> +	 * buf_fx is in standard format.
> +	 */
> +	xcomp_bv = xsave->xsave_hdr.xcomp_bv;
> +	user_xsave = buf_fx;
> +	xcomp_bv &= ~((u64)1 << 63);

I ran in to another bug.  xrestor_user() is hitting a #GP with these values:

[    6.258743] xrestore_user() tmp: ffff88003f813000
[    6.261122] 	     XCR0: 000000000000001f
[    6.261868] 	xstate_bv: 0000000000000003
[    6.262613] 	 xcomp_bv: 000000000000001f

I think it is because bit 63 is clear in xcomp_bv, but there are other
bits set in there.

I think the above needs to just do:

	/*
	 * We are uncompacting the state for the user buffer.  We need
	 * to clear out the xcomp_bv field entirely.  The uncompacted
	 * form of xsave/xrstor treats this field as reserved.
	 */
	if (__put_user(0, &user_xsave->xsave_hdr.xcomp_bv))
		return -1;



^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-28 22:09   ` Dave Hansen
@ 2015-04-28 22:11     ` Yu, Fenghua
  0 siblings, 0 replies; 18+ messages in thread
From: Yu, Fenghua @ 2015-04-28 22:11 UTC (permalink / raw)
  To: Hansen, Dave, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Mallick, Asit K, Williamson, Glenn P
  Cc: linux-kernel, x86

> From: Hansen, Dave
> Sent: Tuesday, April 28, 2015 3:09 PM
> To: Yu, Fenghua; H. Peter Anvin; Ingo Molnar; Thomas Gleixner; Mallick, Asit
> K; Williamson, Glenn P
> Cc: linux-kernel; x86
> Subject: Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use
> user_xstate_size for xstate size in signal context
> 
> On 04/21/2015 09:51 PM, Fenghua Yu wrote:
> > +	/*
> > +	 * Clear xcomp_bv[63] in user's xsave area header to indicate
> > +	 * buf_fx is in standard format.
> > +	 */
> > +	xcomp_bv = xsave->xsave_hdr.xcomp_bv;
> > +	user_xsave = buf_fx;
> > +	xcomp_bv &= ~((u64)1 << 63);
> 
> I ran in to another bug.  xrestor_user() is hitting a #GP with these values:
> 
> [    6.258743] xrestore_user() tmp: ffff88003f813000
> [    6.261122] 	     XCR0: 000000000000001f
> [    6.261868] 	xstate_bv: 0000000000000003
> [    6.262613] 	 xcomp_bv: 000000000000001f
> 
> I think it is because bit 63 is clear in xcomp_bv, but there are other bits set in
> there.
> 
> I think the above needs to just do:
> 
> 	/*
> 	 * We are uncompacting the state for the user buffer.  We need
> 	 * to clear out the xcomp_bv field entirely.  The uncompacted
> 	 * form of xsave/xrstor treats this field as reserved.
> 	 */
> 	if (__put_user(0, &user_xsave->xsave_hdr.xcomp_bv))
> 		return -1;

Yes, that makes sense. I will add this in v3 patchset.

Thanks.

-Fenghua

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context
  2015-04-22  4:51 ` [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
                     ` (2 preceding siblings ...)
  2015-04-28 22:09   ` Dave Hansen
@ 2015-04-29 13:53   ` Dave Hansen
  3 siblings, 0 replies; 18+ messages in thread
From: Dave Hansen @ 2015-04-29 13:53 UTC (permalink / raw)
  To: Fenghua Yu, H. Peter Anvin, Ingo Molnar, Thomas Gleixner,
	Asit K Mallick, Glenn Williamson
  Cc: linux-kernel, x86

On 04/21/2015 09:51 PM, Fenghua Yu wrote:
> +static int copy_to_user_xstate(void __user *buf_fx, struct xsave_struct *xsave)

I was thinking about this a bit more.  I think uncompacting the xsave
state from the kernel buffer in to the user buffer in software is
probably a bad idea.

It would nice, eventually, to not need to memset() the entire xsave
state when we initialize it.  It could be a lot bigger some day, and we
might start to notice the memset() cost.  We can't do this unless we're
careful about copying uninitialized parts of the task xstate buffer out
to userspace.

The whole reason we've gone down this path is that we are
!user_has_fpu(), so there is no FPU state in the registers which we can
xsave directly.  Perhaps we should enable the FPU hardware, restore the
state to the registers (from the xstate buffer), and then use
xsave_user() to write the user xstate buffer.

This *might* even be faster.  But, it *is* a slow (and rare) path, so it
does not really matter what we do performance-wise.  What I've described
above should be a lot less code than what we've got now.  It makes the
hardware do the hard work.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-04-29 13:53 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-22  4:51 [PATCH Bugfix v2 0/4] x86/xsave/xsaves: Fix a few xsave/xsaves related bugs Fenghua Yu
2015-04-22  4:51 ` [PATCH Bugfix v2 1/4] x86/xsave.c: Fix xstate offsets and sizes enumeration Fenghua Yu
2015-04-22  4:51 ` [PATCH Bugfix v2 2/4] x86/xsaves: Define and use user_xstate_size for xstate size in signal context Fenghua Yu
2015-04-22 18:45   ` Dave Hansen
2015-04-22 19:05     ` Yu, Fenghua
2015-04-22 19:34       ` Dave Hansen
2015-04-23  0:06         ` Yu, Fenghua
2015-04-23  0:21           ` Dave Hansen
2015-04-23  0:23             ` Yu, Fenghua
2015-04-23  0:34           ` Dave Hansen
2015-04-23 17:09             ` Yu, Fenghua
2015-04-23 21:32               ` Dave Hansen
2015-04-28 14:28   ` Dave Hansen
2015-04-28 22:09   ` Dave Hansen
2015-04-28 22:11     ` Yu, Fenghua
2015-04-29 13:53   ` Dave Hansen
2015-04-22  4:51 ` [PATCH Bugfix v2 3/4] x86/xsaves: Rename xstate_size to kernel_xstate_size to explicitly distinguish xstate size in kernel from user space Fenghua Yu
2015-04-22  4:51 ` [PATCH Bugfix v2 4/4] x86/xsave: Don't add new states in xsave_struct Fenghua Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).