* [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage
@ 2021-06-08 14:36 Thomas Gleixner
  2021-06-08 14:36 ` [patch V3 1/6] x86/fpu: Prevent state corruption in __fpu__restore_sig() Thomas Gleixner
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 14:36 UTC (permalink / raw)
  To: LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov,
	Rik van Riel

syzbot reported a WARN_ON() for XRSTOR raising #GP:

  https://lore.kernel.org/r/0000000000004c453905c30f8334@google.com

with a syzkaller reproducer and a conclusive bisect result.  It took a
while to distill a simple C reproducer out of it, which allowed pinpointing
the root cause: The recent addition of supervisor XSTATEs broke the signal
restore path for the case where the signal handler wrecked the XSTATE on
the stack, because the restore path does not sanitize the XSTATE header,
which makes the subsequent XRSTOR fail with #GP.
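
The core of the reproducer boils down to a signal handler which scribbles
over the XSTATE header in its own signal frame, roughly like the minimal
sketch below (the full selftest is patch 6/6; the sketch assumes a
SA_SIGINFO handler and the 64-bit XSAVE signal frame layout):

  static void handler(int sig, siginfo_t *info, void *uc_void)
  {
	ucontext_t *uc = uc_void;
	uint8_t *fpstate = (uint8_t *)uc->uc_mcontext.fpregs;
	/* The XSTATE header follows the 512 byte legacy area */
	uint64_t *xfeatures = (uint64_t *)(fpstate + 512);

	*(xfeatures + 2) = ~0ULL;	/* clobber reserved header bytes */
  }

On sigreturn the kernel copies the wrecked frame back verbatim instead of
sanitizing the header, and the subsequent XRSTOR #GPs.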

The following series addresses the problem and fixes related issues
which were found while inspecting the related changes and testing
signal restore and PKRU.

V2 of this can be found here:

  https://lore.kernel.org/r/20210605234742.712464974@linutronix.de

Changes vs. V2:

  - Drop the cleanup patches for now, so the bug fixes make progress

  - Fix the PKRU context switch code

  - Address review comments

Thanks,

	tglx


* [patch V3 1/6] x86/fpu: Prevent state corruption in __fpu__restore_sig()
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
@ 2021-06-08 14:36 ` Thomas Gleixner
  2021-06-09 14:46   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2021-06-08 14:36 ` [patch V3 2/6] x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer Thomas Gleixner
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 14:36 UTC (permalink / raw)
  To: LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov,
	Rik van Riel

From: Thomas Gleixner <tglx@linutronix.de>

The non-compacted slowpath uses __copy_from_user() and copies the entire
user buffer into the kernel buffer, verbatim.  This means that the kernel
buffer may now contain entirely invalid state on which XRSTOR will #GP.
validate_user_xstate_header() can detect some of that corruption, but that
leaves the onus on callers to clear the buffer.

Prior to XSAVES support it was possible just to reinitialize the buffer,
completely, but with supervisor states that is no longer possible as the
buffer clearing code split got it backwards. Fixing that is possible, but
not corrupting the state in the first place is more robust.

Avoid corruption of the kernel XSAVE buffer by using copy_user_to_xstate()
which validates the XSAVE header contents before copying the actual states
to the kernel. copy_user_to_xstate() was previously only called for
compacted-format kernel buffers, but it works for both compacted and
non-compacted forms.

Using it for the non-compacted form is slower because of multiple
__copy_from_user() operations, but that cost is less important than robust
code in an already slow path.
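
For comparison, a rough sketch of what copy_user_to_xstate() does with the
user buffer (paraphrased; the relevant hunks are visible in patch 5/6):

  copy_user_to_xstate(&fpu->state.xsave, buf_fx)
    __copy_from_user(&hdr, ubuf + offsetof(struct xregs_state, header), sizeof(hdr));
    validate_user_xstate_header(&hdr);	/* reject bogus feature/reserved bits */
    for each feature bit i set in hdr.xfeatures:
      __copy_from_user(<component i>, ubuf + xstate_offsets[i], xstate_sizes[i]);
    /* plus the MXCSR quirk handling, then install the validated header */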

[ Changelog polished by Dave Hansen ]

Fixes: b860eb8dce59 ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates")
Reported-by: syzbot+2067e764dbcd10721e2e@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
---
V2: Removed the make validate_user_xstate_header() static hunks (Borislav)
---
 arch/x86/kernel/fpu/signal.c |    9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -405,14 +405,7 @@ static int __fpu__restore_sig(void __use
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
 
-		if (using_compacted_format()) {
-			ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
-		} else {
-			ret = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);
-
-			if (!ret && state_size > offsetof(struct xregs_state, header))
-				ret = validate_user_xstate_header(&fpu->state.xsave.header);
-		}
+		ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
 			goto err_out;
 



* [patch V3 2/6] x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
  2021-06-08 14:36 ` [patch V3 1/6] x86/fpu: Prevent state corruption in __fpu__restore_sig() Thomas Gleixner
@ 2021-06-08 14:36 ` Thomas Gleixner
  2021-06-09 14:46   ` [tip: x86/urgent] " tip-bot2 for Andy Lutomirski
  2021-06-08 14:36 ` [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads Thomas Gleixner
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 14:36 UTC (permalink / raw)
  To: LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov,
	Rik van Riel

From: Andy Lutomirski <luto@kernel.org>

Both Intel and AMD consider it to be architecturally valid for XRSTOR to
fail with #PF but nonetheless change the register state.  The actual
conditions under which this might occur are unclear [1], but it seems
plausible that this might be triggered if one sibling thread unmaps a page
and invalidates the shared TLB while another sibling thread is executing
XRSTOR on the page in question.

__fpu__restore_sig() can execute XRSTOR while the hardware registers are
preserved on behalf of a different victim task (using the
fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but
modify the registers.  If this happens, then there is a window in which
__fpu__restore_sig() could schedule out and the victim task could schedule
back in without reloading its own FPU registers.  This would result in part
of the FPU state that __fpu__restore_sig() was attempting to load leaking
into the victim task's user-visible state.

Invalidate preserved FPU registers on XRSTOR failure to prevent this
situation from corrupting any state.

[1] Frequent readers of the errata lists might imagine "complex
    microarchitectural conditions"
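
The invalidation relies on the existing fpu_fpregs_owner_ctx bookkeeping,
roughly (paraphrased sketch, not part of this patch):

  __cpu_invalidate_fpregs_state()
    __this_cpu_write(fpu_fpregs_owner_ctx, NULL);	/* nobody owns this CPU's FPU regs */

  fpregs_state_valid(fpu, cpu)
    return fpu == this_cpu_read(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;

so when the victim task is scheduled back in, fpregs_state_valid() fails
and it reloads its own registers instead of trusting the clobbered ones.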

Fixes: 1d731e731c4c ("x86/fpu: Add a fastpath to __fpu__restore_sig()")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
---
V3: Rework comment - Borislav
V2: Amend changelog - Borislav
---
 arch/x86/kernel/fpu/signal.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -369,6 +369,25 @@ static int __fpu__restore_sig(void __use
 			fpregs_unlock();
 			return 0;
 		}
+
+		/*
+		 * The above did an FPU restore operation, restricted to
+		 * the user portion of the registers, and failed, but the
+		 * microcode might have modified the FPU registers
+		 * nevertheless.
+		 *
+		 * If the FPU registers do not belong to current, then
+		 * invalidate the FPU register state otherwise the task might
+		 * preempt current and return to user space with corrupted
+		 * FPU registers.
+		 *
+		 * In case current owns the FPU registers then no further
+		 * action is required. The fixup below will handle it
+		 * correctly.
+		 */
+		if (test_thread_flag(TIF_NEED_FPU_LOAD))
+			__cpu_invalidate_fpregs_state();
+
 		fpregs_unlock();
 	} else {
 		/*



* [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
  2021-06-08 14:36 ` [patch V3 1/6] x86/fpu: Prevent state corruption in __fpu__restore_sig() Thomas Gleixner
  2021-06-08 14:36 ` [patch V3 2/6] x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer Thomas Gleixner
@ 2021-06-08 14:36 ` Thomas Gleixner
  2021-06-09 14:46   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2021-06-10 17:10   ` [patch V3 3/6] " Andy Lutomirski
  2021-06-08 14:36 ` [patch V3 4/6] x86/pkru: Make PKRU=0 actually work Thomas Gleixner
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 14:36 UTC (permalink / raw)
  To: LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel,
	Borislav Petkov

switch_fpu_finish() checks current->mm as indicator for kernel threads.
That's wrong because kernel threads can temporarily use a mm of a user
process via kthread_use_mm().

Check the task flags for PF_KTHREAD instead.
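
To illustrate (sketch, not actual code; vhost is just one example of a
kthread_use_mm() user):

  vhost_worker()
    kthread_use_mm(mm)
      current->mm = mm;		/* ->mm is now non-NULL, PF_KTHREAD still set */
    ...				/* a context switch here runs switch_fpu_finish(),
				   which with the current->mm check mistakes the
				   kernel thread for a user thread */
    kthread_unuse_mm(mm)
      current->mm = NULL;

current->flags & PF_KTHREAD is set for the whole lifetime of the thread,
which is what makes it the reliable check.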

Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -578,7 +578,7 @@ static inline void switch_fpu_finish(str
 	 * PKRU state is switched eagerly because it needs to be valid before we
 	 * return to userland e.g. for a copy_to_user() operation.
 	 */
-	if (current->mm) {
+	if (!(current->flags & PF_KTHREAD)) {
 		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
 		if (pk)
 			pkru_val = pk->pkru;



* [patch V3 4/6] x86/pkru: Make PKRU=0 actually work
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
                   ` (2 preceding siblings ...)
  2021-06-08 14:36 ` [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads Thomas Gleixner
@ 2021-06-08 14:36 ` Thomas Gleixner
  2021-06-08 15:40   ` Borislav Petkov
                     ` (3 more replies)
  2021-06-08 14:36 ` [patch V3 5/6] x86/fpu: Add address range checks to copy_user_to_xstate() Thomas Gleixner
                   ` (4 subsequent siblings)
  8 siblings, 4 replies; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 14:36 UTC (permalink / raw)
  To: LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel,
	Borislav Petkov

When user space brings PKRU into init state, then the kernel handling is
broken:

  T1 user space
     xsave(state)
     state.header.xfeatures &= ~XFEATURE_MASK_PKRU;
     xrstor(state)

  T1 -> kernel
     schedule()
       XSAVE(S) -> T1->xsave.header.xfeatures[PKRU] == 0
       T1->flags |= TIF_NEED_FPU_LOAD;

       wrpkru();

     schedule()
       ...
       pk = get_xsave_addr(&T1->fpu->state.xsave, XFEATURE_PKRU);
       if (pk)
	 wrpkru(pk->pkru);
       else
	 wrpkru(DEFAULT_PKRU);

Because the xfeatures bit is 0 and therefore the value in the xsave storage
is not valid, get_xsave_addr() returns NULL and switch_to() writes the
default PKRU. -> FAIL #1!

So that wreckages any copy_to/from_user() on the way back to user space
which hits memory which is protected by the default PKRU value.

Assumed that this does not fail (pure luck) then T1 goes back to user
space and because TIF_NEED_FPU_LOAD is set it ends up in

switch_fpu_return()
    __fpregs_load_activate()
      if (!fpregs_state_valid()) {
	 load_XSTATE_from_task();
      }

But if nothing touched the FPU between T1 scheduling out and in the
fpregs_state is valid so switch_fpu_return() does nothing and just clears
TIF_NEED_FPU_LOAD. Back to user space with DEFAULT_PKRU loaded. -> FAIL #2!

The fix is simple: if get_xsave_addr() returns NULL then set the PKRU value
to 0 instead of the restrictive default PKRU value.

Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
---
 arch/x86/include/asm/fpu/internal.h |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -579,9 +579,16 @@ static inline void switch_fpu_finish(str
 	 * return to userland e.g. for a copy_to_user() operation.
 	 */
 	if (!(current->flags & PF_KTHREAD)) {
+		/*
+		 * If the PKRU bit in xsave.header.xfeatures is not set,
+		 * then the PKRU compoment was in init state, which means
+		 * XRSTOR will set PKRU to 0. If the bit is not set then
+		 * get_xsave_addr() will return NULL because the PKRU value
+		 * in memory is not valid. This means pkru_val has to be
+		 * set to 0 and not to init_pkru_value.
+		 */
 		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
-		if (pk)
-			pkru_val = pk->pkru;
+		pkru_val = pk ? pk->pkru : 0;
 	}
 	__write_pkru(pkru_val);
 }



* [patch V3 5/6] x86/fpu: Add address range checks to copy_user_to_xstate()
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
                   ` (3 preceding siblings ...)
  2021-06-08 14:36 ` [patch V3 4/6] x86/pkru: Make PKRU=0 actually work Thomas Gleixner
@ 2021-06-08 14:36 ` Thomas Gleixner
  2021-06-09 12:56   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
  2021-06-08 14:36 ` [patch V3 6/6] selftests/x86: Test signal frame XSTATE header corruption handling Thomas Gleixner
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 14:36 UTC (permalink / raw)
  To: LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior,
	syzbot+2067e764dbcd10721e2e, Borislav Petkov, Rik van Riel

From: Andy Lutomirski <luto@kernel.org>

copy_user_to_xstate() uses __copy_from_user(), which provides a negligible
speedup.  Fortunately, both call sites are at least almost correct.
__fpu__restore_sig() checks access_ok() with a length of
xstate_sigframe_size() and ptrace regset access uses fpu_user_xstate_size.
These should be valid upper bounds on the length, so, at worst, this would
cause spurious failures and not accesses to kernel memory.

Nonetheless, this is far more fragile than necessary and none of these
callers are in a hotpath. 

Use copy_from_user() instead.
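
For context, the practical difference boils down to (simplified sketch of
the generic behaviour, not a quote from the sources):

  copy_from_user(dst, ubuf, len)
    if (!access_ok(ubuf, len))
      return len;			/* nothing copied */
    return __copy_from_user(dst, ubuf, len);

i.e. __copy_from_user() skips the access_ok() range check and relies on the
caller having done it with a correct length.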

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/fpu/xstate.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1190,7 +1190,7 @@ int copy_user_to_xstate(struct xregs_sta
 	offset = offsetof(struct xregs_state, header);
 	size = sizeof(hdr);
 
-	if (__copy_from_user(&hdr, ubuf + offset, size))
+	if (copy_from_user(&hdr, ubuf + offset, size))
 		return -EFAULT;
 
 	if (validate_user_xstate_header(&hdr))
@@ -1205,7 +1205,7 @@ int copy_user_to_xstate(struct xregs_sta
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
 
-			if (__copy_from_user(dst, ubuf + offset, size))
+			if (copy_from_user(dst, ubuf + offset, size))
 				return -EFAULT;
 		}
 	}
@@ -1213,7 +1213,7 @@ int copy_user_to_xstate(struct xregs_sta
 	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
 		offset = offsetof(struct fxregs_state, mxcsr);
 		size = MXCSR_AND_FLAGS_SIZE;
-		if (__copy_from_user(&xsave->i387.mxcsr, ubuf + offset, size))
+		if (copy_from_user(&xsave->i387.mxcsr, ubuf + offset, size))
 			return -EFAULT;
 	}
 



* [patch V3 6/6] selftests/x86: Test signal frame XSTATE header corruption handling
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
                   ` (4 preceding siblings ...)
  2021-06-08 14:36 ` [patch V3 5/6] x86/fpu: Add address range checks to copy_user_to_xstate() Thomas Gleixner
@ 2021-06-08 14:36 ` Thomas Gleixner
  2021-06-09  8:38   ` David Edmondson
  2021-06-09 12:56   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
  2021-06-08 16:08 ` [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Dave Hansen
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 14:36 UTC (permalink / raw)
  To: LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior,
	syzbot+2067e764dbcd10721e2e, Borislav Petkov, Rik van Riel

From: Andy Lutomirski <luto@kernel.org>

This is very heavily based on some code from Thomas Gleixner.  On a system
without XSAVES, it triggers the WARN_ON():

    Bad FPU state detected at copy_kernel_to_fpregs+0x2f/0x40, reinitializing FPU registers.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: main() -> main(void) - Borislav
---
 tools/testing/selftests/x86/Makefile                |    3 
 tools/testing/selftests/x86/corrupt_xstate_header.c |  114 ++++++++++++++++++++
 2 files changed, 116 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/corrupt_xstate_header.c

--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -17,7 +17,8 @@ TARGETS_C_BOTHBITS := single_step_syscal
 TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
-TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering
+TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering \
+			corrupt_xstate_header
 # Some selftests require 32bit support enabled also on 64bit systems
 TARGETS_C_32BIT_NEEDED := ldt_gdt ptrace_syscall
 
--- /dev/null
+++ b/tools/testing/selftests/x86/corrupt_xstate_header.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Corrupt the XSTATE header in a signal frame
+ *
+ * Based on analysis and a test case from Thomas Gleixner.
+ */
+
+#define _GNU_SOURCE
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <sched.h>
+#include <signal.h>
+#include <err.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <sys/wait.h>
+
+static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
+			   unsigned int *ecx, unsigned int *edx)
+{
+	asm volatile(
+		"cpuid;"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (*eax), "2" (*ecx));
+}
+
+static inline int xsave_enabled(void)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	eax = 0x1;
+	ecx = 0x0;
+	__cpuid(&eax, &ebx, &ecx, &edx);
+
+	/* Is CR4.OSXSAVE enabled ? */
+	return ecx & (1U << 27);
+}
+
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+		       int flags)
+{
+	struct sigaction sa;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handler;
+	sa.sa_flags = SA_SIGINFO | flags;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+static void sigusr1(int sig, siginfo_t *info, void *uc_void)
+{
+	ucontext_t *uc = uc_void;
+	uint8_t *fpstate = (uint8_t *)uc->uc_mcontext.fpregs;
+	uint64_t *xfeatures = (uint64_t *)(fpstate + 512);
+
+	printf("\tWreckage XSTATE header\n");
+	/* Wreckage the first reserved byte in the header */
+	*(xfeatures + 2) = 0xfffffff;
+}
+
+static void sigsegv(int sig, siginfo_t *info, void *uc_void)
+{
+	printf("\tGot SIGSEGV\n");
+}
+
+int main(void)
+{
+	cpu_set_t set;
+
+	sethandler(SIGUSR1, sigusr1, 0);
+	sethandler(SIGSEGV, sigsegv, 0);
+
+	if (!xsave_enabled()) {
+		printf("[SKIP] CR4.OSXSAVE disabled.\n");
+		return 0;
+	}
+
+	CPU_ZERO(&set);
+	CPU_SET(0, &set);
+
+	/*
+	 * Enforce that the child runs on the same CPU
+	 * which in turn forces a schedule.
+	 */
+	sched_setaffinity(getpid(), sizeof(set), &set);
+
+	printf("[RUN]\tSend ourselves a signal\n");
+	raise(SIGUSR1);
+
+	printf("[OK]\tBack from the signal.  Now schedule.\n");
+	pid_t child = fork();
+	if (child < 0)
+		err(1, "fork");
+	if (child == 0)
+		return 0;
+	if (child)
+		waitpid(child, NULL, 0);
+	printf("[OK]\tBack in the main thread.\n");
+
+	/*
+	 * We could try to confirm that extended state is still preserved
+	 * when we schedule.  For now, the only indication of failure is
+	 * a warning in the kernel logs.
+	 */
+
+	return 0;
+}
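
For reference, the layout the handler above relies on (standard x86-64
XSAVE signal frame, not spelled out in the patch itself):

  /*
   *   fpstate +   0 : 512 byte legacy FXSAVE area
   *   fpstate + 512 : XSTATE header
   *                     u64 xfeatures    ( *(xfeatures + 0) )
   *                     u64 xcomp_bv     ( *(xfeatures + 1) )
   *                     u64 reserved[6]  ( *(xfeatures + 2) ... )
   *
   * The write to *(xfeatures + 2) therefore lands in the reserved part of
   * the header, which validate_user_xstate_header() and XRSTOR(S) reject.
   */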



* Re: [patch V3 4/6] x86/pkru: Make PKRU=0 actually work
  2021-06-08 14:36 ` [patch V3 4/6] x86/pkru: Make PKRU=0 actually work Thomas Gleixner
@ 2021-06-08 15:40   ` Borislav Petkov
  2021-06-08 19:15     ` Thomas Gleixner
  2021-06-08 16:06   ` Dave Hansen
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 27+ messages in thread
From: Borislav Petkov @ 2021-06-08 15:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel

Just typos:

On Tue, Jun 08, 2021 at 04:36:21PM +0200, Thomas Gleixner wrote:
> So that wreckages any copy_to/from_user() on the way back to user space

wrecks

> which hits memory which is protected by the default PKRU value.
> 
> Assumed that this does not fail (pure luck) then T1 goes back to user
> space and because TIF_NEED_FPU_LOAD is set it ends up in
> 
> switch_fpu_return()
>     __fpregs_load_activate()
>       if (!fpregs_state_valid()) {
> 	 load_XSTATE_from_task();
>       }
> 
> But if nothing touched the FPU between T1 scheduling out and in the
							       ^^

							s/in/if/ it seems.

> fpregs_state is valid so switch_fpu_return() does nothing and just clears
> TIF_NEED_FPU_LOAD. Back to user space with DEFAULT_PKRU loaded. -> FAIL #2!
> 
> The fix is simple: if get_xsave_addr() returns NULL then set the PKRU value
> to 0 instead of the restrictive default PKRU value.
> 
> Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/include/asm/fpu/internal.h |   11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> --- a/arch/x86/include/asm/fpu/internal.h
> +++ b/arch/x86/include/asm/fpu/internal.h
> @@ -579,9 +579,16 @@ static inline void switch_fpu_finish(str
>  	 * return to userland e.g. for a copy_to_user() operation.
>  	 */
>  	if (!(current->flags & PF_KTHREAD)) {
> +		/*
> +		 * If the PKRU bit in xsave.header.xfeatures is not set,
> +		 * then the PKRU compoment was in init state, which means
                                 ^^^^^^^^^

component

> +		 * XRSTOR will set PKRU to 0. If the bit is not set then
> +		 * get_xsave_addr() will return NULL because the PKRU value
> +		 * in memory is not valid. This means pkru_val has to be
> +		 * set to 0 and not to init_pkru_value.
> +		 */
>  		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
> -		if (pk)
> -			pkru_val = pk->pkru;
> +		pkru_val = pk ? pk->pkru : 0;
>  	}

Hohumm, let's see who cries out... :-\

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [patch V3 4/6] x86/pkru: Make PKRU=0 actually work
  2021-06-08 14:36 ` [patch V3 4/6] x86/pkru: Make PKRU=0 actually work Thomas Gleixner
  2021-06-08 15:40   ` Borislav Petkov
@ 2021-06-08 16:06   ` Dave Hansen
  2021-06-08 19:06     ` Thomas Gleixner
  2021-06-08 21:37   ` Babu Moger
  2021-06-09 14:46   ` [tip: x86/urgent] x86/pkru: Write hardware init value to PKRU when xstate is init tip-bot2 for Thomas Gleixner
  3 siblings, 1 reply; 27+ messages in thread
From: Dave Hansen @ 2021-06-08 16:06 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel,
	Borislav Petkov

On 6/8/21 7:36 AM, Thomas Gleixner wrote:
> When user space brings PKRU into init state, then the kernel handling is
> broken:

Nit: while setting PKRU=0 _can_ trigger this issue (on AMD), the
underlying problem is truly with the init state and not simply with any
wrpkru(0).  IOW, I like the changelog better than the subject.

Maybe something like this would be more precise:

	x86/pkru: Write hardware init value to PKRU when xstate is init



* Re: [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
                   ` (5 preceding siblings ...)
  2021-06-08 14:36 ` [patch V3 6/6] selftests/x86: Test signal frame XSTATE header corruption handling Thomas Gleixner
@ 2021-06-08 16:08 ` Dave Hansen
  2021-06-08 18:46 ` Rik van Riel
  2021-06-09 19:18 ` [PATCH] x86/fpu: Reset state for all signal restore failures Thomas Gleixner
  8 siblings, 0 replies; 27+ messages in thread
From: Dave Hansen @ 2021-06-08 16:08 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov,
	Rik van Riel

The series as a whole looks good to me.  So, especially for the PKRU parts:

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>


* Re: [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
                   ` (6 preceding siblings ...)
  2021-06-08 16:08 ` [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Dave Hansen
@ 2021-06-08 18:46 ` Rik van Riel
  2021-06-09 19:18 ` [PATCH] x86/fpu: Reset state for all signal restore failures Thomas Gleixner
  8 siblings, 0 replies; 27+ messages in thread
From: Rik van Riel @ 2021-06-08 18:46 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov


On Tue, 2021-06-08 at 16:36 +0200, Thomas Gleixner wrote:

> The following series addresses the problem and fixes related issues
> which were found while inspecting the related changes and testing
> signal restore and PKRU.

Woah, you sure found some subtle stuff there.

Acked-by: Rik van Riel <riel@surriel.com>


-- 
All Rights Reversed.



* Re: [patch V3 4/6] x86/pkru: Make PKRU=0 actually work
  2021-06-08 16:06   ` Dave Hansen
@ 2021-06-08 19:06     ` Thomas Gleixner
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 19:06 UTC (permalink / raw)
  To: Dave Hansen, LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel,
	Borislav Petkov

On Tue, Jun 08 2021 at 09:06, Dave Hansen wrote:
> On 6/8/21 7:36 AM, Thomas Gleixner wrote:
>> When user space brings PKRU into init state, then the kernel handling is
>> broken:
>
> Nit: while setting PKRU=0 _can_ trigger this issue (on AMD), the
> underlying problem is truly with the init state and not simply with any
> wrpkru(0).  IOW, I like the changelog better than the subject.
>
> Maybe something like this would be more precise:
>
> 	x86/pkru: Write hardware init value to PKRU when xstate is init

Indeed.


* Re: [patch V3 4/6] x86/pkru: Make PKRU=0 actually work
  2021-06-08 15:40   ` Borislav Petkov
@ 2021-06-08 19:15     ` Thomas Gleixner
  2021-06-08 20:06       ` Borislav Petkov
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-08 19:15 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: LKML, x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel

On Tue, Jun 08 2021 at 17:40, Borislav Petkov wrote:
> Just typos:
>
> On Tue, Jun 08, 2021 at 04:36:21PM +0200, Thomas Gleixner wrote:
>> So that wreckages any copy_to/from_user() on the way back to user space
>
> wrecks
>
>> which hits memory which is protected by the default PKRU value.
>> 
>> Assumed that this does not fail (pure luck) then T1 goes back to user
>> space and because TIF_NEED_FPU_LOAD is set it ends up in
>> 
>> switch_fpu_return()
>>     __fpregs_load_activate()
>>       if (!fpregs_state_valid()) {
>> 	 load_XSTATE_from_task();
>>       }
>> 
>> But if nothing touched the FPU between T1 scheduling out and in the
> 							       ^^
>
> 							s/in/if/ it
> 							seems.

No. It should be 

But if nothing touched the FPU between T1 scheduling out and back in,
then the fpregs_state is still valid which means switch_fpu_return()
does nothing and just clears TIF_NEED_FPU_LOAD. Back to user space with
DEFAULT_PKRU loaded. -> FAIL #2!

>> +		 * XRSTOR will set PKRU to 0. If the bit is not set then
>> +		 * get_xsave_addr() will return NULL because the PKRU value
>> +		 * in memory is not valid. This means pkru_val has to be
>> +		 * set to 0 and not to init_pkru_value.
>> +		 */
>>  		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
>> -		if (pk)
>> -			pkru_val = pk->pkru;
>> +		pkru_val = pk ? pk->pkru : 0;
>>  	}
>
> Hohumm, let's see who cries out... :-\

Why? It was clearly wrong and I can reproduce it with a hack which
forces a schedule to a kernel thread and it fails all the way back to
user space.

I chased that because I observed sporadic failures when forcing PKRU to
init state and then observed the default key being written. I had some
extra trace_printks there to analyze something completely different :)

Thanks,

        tglx


* Re: [patch V3 4/6] x86/pkru: Make PKRU=0 actually work
  2021-06-08 19:15     ` Thomas Gleixner
@ 2021-06-08 20:06       ` Borislav Petkov
  0 siblings, 0 replies; 27+ messages in thread
From: Borislav Petkov @ 2021-06-08 20:06 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel

On Tue, Jun 08, 2021 at 09:15:42PM +0200, Thomas Gleixner wrote:
> But if nothing touched the FPU between T1 scheduling out and back in,
> then the fpregs_state is still valid which means switch_fpu_return()
> does nothing and just clears TIF_NEED_FPU_LOAD. Back to user space with
> DEFAULT_PKRU loaded. -> FAIL #2!

Ah ok.

> Why? It was clearly wrong and I can reproduce it with a hack which
> forces a schedule to a kernel thread and it fails all the way back to
> user space.

Oh, I was speculating about some weird luserspace's behavior of clearing
PKRU and then relying on the buggy behavior of getting PKRU restored to
DEFAULT_PKRU.

I know, it is nuts but it is user-visible change. And yeah, probably
nothing does that...

> I chased that because I observed sporadic failures when forcing PKRU to
> init state and then observed the default key being written. I had some
> extra trace_printks there to analyze something completely different :)

As you do. :-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* [patch V3 4/6] x86/pkru: Make PKRU=0 actually work
  2021-06-08 14:36 ` [patch V3 4/6] x86/pkru: Make PKRU=0 actually work Thomas Gleixner
  2021-06-08 15:40   ` Borislav Petkov
  2021-06-08 16:06   ` Dave Hansen
@ 2021-06-08 21:37   ` Babu Moger
  2021-06-09 14:46   ` [tip: x86/urgent] x86/pkru: Write hardware init value to PKRU when xstate is init tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 27+ messages in thread
From: Babu Moger @ 2021-06-08 21:37 UTC (permalink / raw)
  To: tglx
  Cc: bigeasy, bp, dave.hansen, fenghua.yu, linux-kernel, luto, riel,
	tony.luck, x86, yu-cheng.yu, babu.moger, dave.kleikamp

I can confirm that this patch fixes the problem reported in
https://lore.kernel.org/lkml/b2e0324a-9125-bb34-9e76-81817df27c48@amd.com/.

Tested-by: Babu Moger <babu.moger@amd.com>
Thanks


* Re: [patch V3 6/6] selftests/x86: Test signal frame XSTATE header corruption handling
  2021-06-08 14:36 ` [patch V3 6/6] selftests/x86: Test signal frame XSTATE header corruption handling Thomas Gleixner
@ 2021-06-09  8:38   ` David Edmondson
  2021-06-09 12:56   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
  1 sibling, 0 replies; 27+ messages in thread
From: David Edmondson @ 2021-06-09  8:38 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior,
	syzbot+2067e764dbcd10721e2e, Borislav Petkov, Rik van Riel

On Tuesday, 2021-06-08 at 16:36:23 +02, Thomas Gleixner wrote:

> From: Andy Lutomirski <luto@kernel.org>
>
> This is very heavily based on some code from Thomas Gleixner.  On a system
> without XSAVES, it triggers the WARN_ON():
>
>     Bad FPU state detected at copy_kernel_to_fpregs+0x2f/0x40, reinitializing FPU registers.
>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: main() -> main(void) - Borislav
> ---
>  tools/testing/selftests/x86/Makefile                |    3 
>  tools/testing/selftests/x86/corrupt_xstate_header.c |  114 ++++++++++++++++++++
>  2 files changed, 116 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/x86/corrupt_xstate_header.c
>
> --- a/tools/testing/selftests/x86/Makefile
> +++ b/tools/testing/selftests/x86/Makefile
> @@ -17,7 +17,8 @@ TARGETS_C_BOTHBITS := single_step_syscal
>  TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
>  			test_FCMOV test_FCOMI test_FISTTP \
>  			vdso_restorer
> -TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering
> +TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering \
> +			corrupt_xstate_header
>  # Some selftests require 32bit support enabled also on 64bit systems
>  TARGETS_C_32BIT_NEEDED := ldt_gdt ptrace_syscall
>  
> --- /dev/null
> +++ b/tools/testing/selftests/x86/corrupt_xstate_header.c
> @@ -0,0 +1,114 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Corrupt the XSTATE header in a signal frame
> + *
> + * Based on analysis and a test case from Thomas Gleixner.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include <sched.h>
> +#include <signal.h>
> +#include <err.h>
> +#include <unistd.h>
> +#include <stdint.h>
> +#include <sys/wait.h>
> +
> +static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
> +			   unsigned int *ecx, unsigned int *edx)
> +{
> +	asm volatile(
> +		"cpuid;"
> +		: "=a" (*eax),
> +		  "=b" (*ebx),
> +		  "=c" (*ecx),
> +		  "=d" (*edx)
> +		: "0" (*eax), "2" (*ecx));
> +}
> +
> +static inline int xsave_enabled(void)
> +{
> +	unsigned int eax, ebx, ecx, edx;
> +
> +	eax = 0x1;
> +	ecx = 0x0;
> +	__cpuid(&eax, &ebx, &ecx, &edx);
> +
> +	/* Is CR4.OSXSAVE enabled ? */
> +	return ecx & (1U << 27);
> +}
> +
> +static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
> +		       int flags)
> +{
> +	struct sigaction sa;
> +
> +	memset(&sa, 0, sizeof(sa));
> +	sa.sa_sigaction = handler;
> +	sa.sa_flags = SA_SIGINFO | flags;
> +	sigemptyset(&sa.sa_mask);
> +	if (sigaction(sig, &sa, 0))
> +		err(1, "sigaction");
> +}
> +
> +static void sigusr1(int sig, siginfo_t *info, void *uc_void)
> +{
> +	ucontext_t *uc = uc_void;
> +	uint8_t *fpstate = (uint8_t *)uc->uc_mcontext.fpregs;
> +	uint64_t *xfeatures = (uint64_t *)(fpstate + 512);
> +
> +	printf("\tWreckage XSTATE header\n");

"Wreckage" is an odd word to use here, as it usually refers to the
objects that are the result of a wreck. Maybe "wreck" or (to match the
test name) "corrupt"?

> +	/* Wreckage the first reserved byte in the header */
> +	*(xfeatures + 2) = 0xfffffff;

This trashes more than a byte, which doesn't seem significant, but it
would be good to have the comment and code match.

> +}
> +
> +static void sigsegv(int sig, siginfo_t *info, void *uc_void)
> +{
> +	printf("\tGot SIGSEGV\n");
> +}
> +
> +int main(void)
> +{
> +	cpu_set_t set;
> +
> +	sethandler(SIGUSR1, sigusr1, 0);
> +	sethandler(SIGSEGV, sigsegv, 0);
> +
> +	if (!xsave_enabled()) {
> +		printf("[SKIP] CR4.OSXSAVE disabled.\n");
> +		return 0;
> +	}
> +
> +	CPU_ZERO(&set);
> +	CPU_SET(0, &set);
> +
> +	/*
> +	 * Enforce that the child runs on the same CPU
> +	 * which in turn forces a schedule.
> +	 */
> +	sched_setaffinity(getpid(), sizeof(set), &set);
> +
> +	printf("[RUN]\tSend ourselves a signal\n");
> +	raise(SIGUSR1);
> +
> +	printf("[OK]\tBack from the signal.  Now schedule.\n");
> +	pid_t child = fork();
> +	if (child < 0)
> +		err(1, "fork");
> +	if (child == 0)
> +		return 0;
> +	if (child)
> +		waitpid(child, NULL, 0);
> +	printf("[OK]\tBack in the main thread.\n");
> +
> +	/*
> +	 * We could try to confirm that extended state is still preserved
> +	 * when we schedule.  For now, the only indication of failure is
> +	 * a warning in the kernel logs.
> +	 */
> +
> +	return 0;
> +}

dme.
-- 
It's alright, we told you what to dream.


* [tip: x86/fpu] selftests/x86: Test signal frame XSTATE header corruption handling
  2021-06-08 14:36 ` [patch V3 6/6] selftests/x86: Test signal frame XSTATE header corruption handling Thomas Gleixner
  2021-06-09  8:38   ` David Edmondson
@ 2021-06-09 12:56   ` tip-bot2 for Andy Lutomirski
  1 sibling, 0 replies; 27+ messages in thread
From: tip-bot2 for Andy Lutomirski @ 2021-06-09 12:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, Dave Hansen,
	Rik van Riel, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     b7c11876d24bdd7ae3feeaa771b8f903f6cf05eb
Gitweb:        https://git.kernel.org/tip/b7c11876d24bdd7ae3feeaa771b8f903f6cf05eb
Author:        Andy Lutomirski <luto@kernel.org>
AuthorDate:    Tue, 08 Jun 2021 16:36:23 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 09 Jun 2021 14:46:30 +02:00

selftests/x86: Test signal frame XSTATE header corruption handling

This is very heavily based on some code from Thomas Gleixner.  On a system
without XSAVES, it triggers the WARN_ON():

  Bad FPU state detected at copy_kernel_to_fpregs+0x2f/0x40, reinitializing FPU registers.

  [ bp: Massage in nitpicks. ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Link: https://lkml.kernel.org/r/20210608144346.234764986@linutronix.de
---
 tools/testing/selftests/x86/Makefile                |   3 +-
 tools/testing/selftests/x86/corrupt_xstate_header.c | 114 +++++++++++-
 2 files changed, 116 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/corrupt_xstate_header.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 65bba2a..b4142cd 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -17,7 +17,8 @@ TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt test_mremap
 TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
-TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering
+TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering \
+			corrupt_xstate_header
 # Some selftests require 32bit support enabled also on 64bit systems
 TARGETS_C_32BIT_NEEDED := ldt_gdt ptrace_syscall
 
diff --git a/tools/testing/selftests/x86/corrupt_xstate_header.c b/tools/testing/selftests/x86/corrupt_xstate_header.c
new file mode 100644
index 0000000..ab8599c
--- /dev/null
+++ b/tools/testing/selftests/x86/corrupt_xstate_header.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Corrupt the XSTATE header in a signal frame
+ *
+ * Based on analysis and a test case from Thomas Gleixner.
+ */
+
+#define _GNU_SOURCE
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <sched.h>
+#include <signal.h>
+#include <err.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <sys/wait.h>
+
+static inline void __cpuid(unsigned int *eax, unsigned int *ebx,
+			   unsigned int *ecx, unsigned int *edx)
+{
+	asm volatile(
+		"cpuid;"
+		: "=a" (*eax),
+		  "=b" (*ebx),
+		  "=c" (*ecx),
+		  "=d" (*edx)
+		: "0" (*eax), "2" (*ecx));
+}
+
+static inline int xsave_enabled(void)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	eax = 0x1;
+	ecx = 0x0;
+	__cpuid(&eax, &ebx, &ecx, &edx);
+
+	/* Is CR4.OSXSAVE enabled ? */
+	return ecx & (1U << 27);
+}
+
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+		       int flags)
+{
+	struct sigaction sa;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handler;
+	sa.sa_flags = SA_SIGINFO | flags;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+static void sigusr1(int sig, siginfo_t *info, void *uc_void)
+{
+	ucontext_t *uc = uc_void;
+	uint8_t *fpstate = (uint8_t *)uc->uc_mcontext.fpregs;
+	uint64_t *xfeatures = (uint64_t *)(fpstate + 512);
+
+	printf("\tWreck XSTATE header\n");
+	/* Wreck the first reserved bytes in the header */
+	*(xfeatures + 2) = 0xfffffff;
+}
+
+static void sigsegv(int sig, siginfo_t *info, void *uc_void)
+{
+	printf("\tGot SIGSEGV\n");
+}
+
+int main(void)
+{
+	cpu_set_t set;
+
+	sethandler(SIGUSR1, sigusr1, 0);
+	sethandler(SIGSEGV, sigsegv, 0);
+
+	if (!xsave_enabled()) {
+		printf("[SKIP] CR4.OSXSAVE disabled.\n");
+		return 0;
+	}
+
+	CPU_ZERO(&set);
+	CPU_SET(0, &set);
+
+	/*
+	 * Enforce that the child runs on the same CPU
+	 * which in turn forces a schedule.
+	 */
+	sched_setaffinity(getpid(), sizeof(set), &set);
+
+	printf("[RUN]\tSend ourselves a signal\n");
+	raise(SIGUSR1);
+
+	printf("[OK]\tBack from the signal.  Now schedule.\n");
+	pid_t child = fork();
+	if (child < 0)
+		err(1, "fork");
+	if (child == 0)
+		return 0;
+	if (child)
+		waitpid(child, NULL, 0);
+	printf("[OK]\tBack in the main thread.\n");
+
+	/*
+	 * We could try to confirm that extended state is still preserved
+	 * when we schedule.  For now, the only indication of failure is
+	 * a warning in the kernel logs.
+	 */
+
+	return 0;
+}


* [tip: x86/fpu] x86/fpu: Add address range checks to copy_user_to_xstate()
  2021-06-08 14:36 ` [patch V3 5/6] x86/fpu: Add address range checks to copy_user_to_xstate() Thomas Gleixner
@ 2021-06-09 12:56   ` tip-bot2 for Andy Lutomirski
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot2 for Andy Lutomirski @ 2021-06-09 12:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, Dave Hansen,
	Rik van Riel, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     f72a249b0ba85564c6bfa94d609a70567485a061
Gitweb:        https://git.kernel.org/tip/f72a249b0ba85564c6bfa94d609a70567485a061
Author:        Andy Lutomirski <luto@kernel.org>
AuthorDate:    Tue, 08 Jun 2021 16:36:22 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 09 Jun 2021 14:46:20 +02:00

x86/fpu: Add address range checks to copy_user_to_xstate()

copy_user_to_xstate() uses __copy_from_user(), which provides a negligible
speedup.  Fortunately, both call sites are at least almost correct.

__fpu__restore_sig() checks access_ok() with xstate_sigframe_size()
length and ptrace regset access uses fpu_user_xstate_size. These should
be valid upper bounds on the length, so, at worst, this would cause
spurious failures and not accesses to kernel memory.

Nonetheless, this is far more fragile than necessary and none of these
callers are in a hotpath.

Use copy_from_user() instead.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Link: https://lkml.kernel.org/r/20210608144346.140254130@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index a85c640..8ac0f67 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1190,7 +1190,7 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 	offset = offsetof(struct xregs_state, header);
 	size = sizeof(hdr);
 
-	if (__copy_from_user(&hdr, ubuf + offset, size))
+	if (copy_from_user(&hdr, ubuf + offset, size))
 		return -EFAULT;
 
 	if (validate_user_xstate_header(&hdr))
@@ -1205,7 +1205,7 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
 
-			if (__copy_from_user(dst, ubuf + offset, size))
+			if (copy_from_user(dst, ubuf + offset, size))
 				return -EFAULT;
 		}
 	}
@@ -1213,7 +1213,7 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 	if (xfeatures_mxcsr_quirk(hdr.xfeatures)) {
 		offset = offsetof(struct fxregs_state, mxcsr);
 		size = MXCSR_AND_FLAGS_SIZE;
-		if (__copy_from_user(&xsave->i387.mxcsr, ubuf + offset, size))
+		if (copy_from_user(&xsave->i387.mxcsr, ubuf + offset, size))
 			return -EFAULT;
 	}
 


* [tip: x86/urgent] x86/pkru: Write hardware init value to PKRU when xstate is init
  2021-06-08 14:36 ` [patch V3 4/6] x86/pkru: Make PKRU=0 actually work Thomas Gleixner
                     ` (2 preceding siblings ...)
  2021-06-08 21:37   ` Babu Moger
@ 2021-06-09 14:46   ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 27+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-09 14:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov, Dave Hansen, Rik van Riel,
	Babu Moger, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     510b80a6a0f1a0d114c6e33bcea64747d127973c
Gitweb:        https://git.kernel.org/tip/510b80a6a0f1a0d114c6e33bcea64747d127973c
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 08 Jun 2021 16:36:21 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 09 Jun 2021 12:12:45 +02:00

x86/pkru: Write hardware init value to PKRU when xstate is init

When user space brings PKRU into init state, then the kernel handling is
broken:

  T1 user space
     xsave(state)
     state.header.xfeatures &= ~XFEATURE_MASK_PKRU;
     xrstor(state)

  T1 -> kernel
     schedule()
       XSAVE(S) -> T1->xsave.header.xfeatures[PKRU] == 0
       T1->flags |= TIF_NEED_FPU_LOAD;

       wrpkru();

     schedule()
       ...
       pk = get_xsave_addr(&T1->fpu->state.xsave, XFEATURE_PKRU);
       if (pk)
	 wrpkru(pk->pkru);
       else
	 wrpkru(DEFAULT_PKRU);

Because the xfeatures bit is 0 and therefore the value in the xsave
storage is not valid, get_xsave_addr() returns NULL and switch_to()
writes the default PKRU. -> FAIL #1!

So that wrecks any copy_to/from_user() on the way back to user space
which hits memory which is protected by the default PKRU value.

Assumed that this does not fail (pure luck) then T1 goes back to user
space and because TIF_NEED_FPU_LOAD is set it ends up in

  switch_fpu_return()
      __fpregs_load_activate()
        if (!fpregs_state_valid()) {
  	 load_XSTATE_from_task();
        }

But if nothing touched the FPU between T1 scheduling out and back in,
then the fpregs_state is still valid which means switch_fpu_return()
does nothing and just clears TIF_NEED_FPU_LOAD. Back to user space with
DEFAULT_PKRU loaded. -> FAIL #2!

The fix is simple: if get_xsave_addr() returns NULL then set the
PKRU value to 0 instead of the restrictive default PKRU value in
init_pkru_value.

 [ bp: Massage in minor nitpicks from folks. ]

Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144346.045616965@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 18382ac..fdee23e 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -579,9 +579,16 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
 	 * return to userland e.g. for a copy_to_user() operation.
 	 */
 	if (!(current->flags & PF_KTHREAD)) {
+		/*
+		 * If the PKRU bit in xsave.header.xfeatures is not set,
+		 * then the PKRU component was in init state, which means
+		 * XRSTOR will set PKRU to 0. If the bit is not set then
+		 * get_xsave_addr() will return NULL because the PKRU value
+		 * in memory is not valid. This means pkru_val has to be
+		 * set to 0 and not to init_pkru_value.
+		 */
 		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
-		if (pk)
-			pkru_val = pk->pkru;
+		pkru_val = pk ? pk->pkru : 0;
 	}
 	__write_pkru(pkru_val);
 }


* [tip: x86/urgent] x86/process: Check PF_KTHREAD and not current->mm for kernel threads
  2021-06-08 14:36 ` [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads Thomas Gleixner
@ 2021-06-09 14:46   ` tip-bot2 for Thomas Gleixner
  2021-06-10 17:10   ` [patch V3 3/6] " Andy Lutomirski
  1 sibling, 0 replies; 27+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-09 14:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov, Dave Hansen, Rik van Riel,
	stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     12f7764ac61200e32c916f038bdc08f884b0b604
Gitweb:        https://git.kernel.org/tip/12f7764ac61200e32c916f038bdc08f884b0b604
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 08 Jun 2021 16:36:20 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 09 Jun 2021 10:39:04 +02:00

x86/process: Check PF_KTHREAD and not current->mm for kernel threads

switch_fpu_finish() checks current->mm as indicator for kernel threads.
That's wrong because kernel threads can temporarily use a mm of a user
process via kthread_use_mm().

Check the task flags for PF_KTHREAD instead.

Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144345.912645927@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index ceeba9f..18382ac 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -578,7 +578,7 @@ static inline void switch_fpu_finish(struct fpu *new_fpu)
 	 * PKRU state is switched eagerly because it needs to be valid before we
 	 * return to userland e.g. for a copy_to_user() operation.
 	 */
-	if (current->mm) {
+	if (!(current->flags & PF_KTHREAD)) {
 		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
 		if (pk)
 			pkru_val = pk->pkru;


* [tip: x86/urgent] x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer
  2021-06-08 14:36 ` [patch V3 2/6] x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer Thomas Gleixner
@ 2021-06-09 14:46   ` tip-bot2 for Andy Lutomirski
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot2 for Andy Lutomirski @ 2021-06-09 14:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Andy Lutomirski, Thomas Gleixner, Borislav Petkov, Dave Hansen,
	Rik van Riel, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     d8778e393afa421f1f117471144f8ce6deb6953a
Gitweb:        https://git.kernel.org/tip/d8778e393afa421f1f117471144f8ce6deb6953a
Author:        Andy Lutomirski <luto@kernel.org>
AuthorDate:    Tue, 08 Jun 2021 16:36:19 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 09 Jun 2021 09:49:38 +02:00

x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer

Both Intel and AMD consider it to be architecturally valid for XRSTOR to
fail with #PF but nonetheless change the register state.  The actual
conditions under which this might occur are unclear [1], but it seems
plausible that this might be triggered if one sibling thread unmaps a page
and invalidates the shared TLB while another sibling thread is executing
XRSTOR on the page in question.

__fpu__restore_sig() can execute XRSTOR while the hardware registers
are preserved on behalf of a different victim task (using the
fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but
modify the registers.

If this happens, then there is a window in which __fpu__restore_sig()
could schedule out and the victim task could schedule back in without
reloading its own FPU registers. This would result in part of the FPU
state that __fpu__restore_sig() was attempting to load leaking into the
victim task's user-visible state.

Invalidate preserved FPU registers on XRSTOR failure to prevent this
situation from corrupting any state.

[1] Frequent readers of the errata lists might imagine "complex
    microarchitectural conditions".

Fixes: 1d731e731c4c ("x86/fpu: Add a fastpath to __fpu__restore_sig()")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144345.758116583@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index d5bc96a..4ab9aeb 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -369,6 +369,25 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			fpregs_unlock();
 			return 0;
 		}
+
+		/*
+		 * The above did an FPU restore operation, restricted to
+		 * the user portion of the registers, and failed, but the
+		 * microcode might have modified the FPU registers
+		 * nevertheless.
+		 *
+		 * If the FPU registers do not belong to current, then
+		 * invalidate the FPU register state otherwise the task might
+		 * preempt current and return to user space with corrupted
+		 * FPU registers.
+		 *
+		 * In case current owns the FPU registers then no further
+		 * action is required. The fixup below will handle it
+		 * correctly.
+		 */
+		if (test_thread_flag(TIF_NEED_FPU_LOAD))
+			__cpu_invalidate_fpregs_state();
+
 		fpregs_unlock();
 	} else {
 		/*


* [tip: x86/urgent] x86/fpu: Prevent state corruption in __fpu__restore_sig()
  2021-06-08 14:36 ` [patch V3 1/6] x86/fpu: Prevent state corruption in __fpu__restore_sig() Thomas Gleixner
@ 2021-06-09 14:46   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-09 14:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: syzbot+2067e764dbcd10721e2e, Thomas Gleixner, Borislav Petkov,
	Dave Hansen, Rik van Riel, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     484cea4f362e1eeb5c869abbfb5f90eae6421b38
Gitweb:        https://git.kernel.org/tip/484cea4f362e1eeb5c869abbfb5f90eae6421b38
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Tue, 08 Jun 2021 16:36:18 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Wed, 09 Jun 2021 09:28:21 +02:00

x86/fpu: Prevent state corruption in __fpu__restore_sig()

The non-compacted slowpath uses __copy_from_user() and copies the entire
user buffer into the kernel buffer, verbatim.  This means that the kernel
buffer may now contain entirely invalid state on which XRSTOR will #GP.
validate_user_xstate_header() can detect some of that corruption, but that
leaves the onus on callers to clear the buffer.

Prior to XSAVES support, it was possible to just reinitialize the buffer
completely, but with supervisor states that is no longer possible because
the buffer clearing code split got it backwards. Fixing that is possible,
but not corrupting the state in the first place is more robust.

Avoid corruption of the kernel XSAVE buffer by using copy_user_to_xstate()
which validates the XSAVE header contents before copying the actual states
to the kernel. copy_user_to_xstate() was previously only called for
compacted-format kernel buffers, but it works for both compacted and
non-compacted forms.

Using it for the non-compacted form is slower because of multiple
__copy_from_user() operations, but that cost is less important than robust
code in an already slow path.

[ Changelog polished by Dave Hansen ]

Fixes: b860eb8dce59 ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates")
Reported-by: syzbot+2067e764dbcd10721e2e@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210608144345.611833074@linutronix.de
---
 arch/x86/kernel/fpu/signal.c |  9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a4ec653..d5bc96a 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -405,14 +405,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask_user() & ~user_xfeatures;
 
-		if (using_compacted_format()) {
-			ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
-		} else {
-			ret = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);
-
-			if (!ret && state_size > offsetof(struct xregs_state, header))
-				ret = validate_user_xstate_header(&fpu->state.xsave.header);
-		}
+		ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
 			goto err_out;
 

^ permalink raw reply related	[flat|nested] 27+ messages in thread
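
As a rough, user-space approximation of the header sanity checking this
changelog relies on (the kernel's validate_user_xstate_header() called via
copy_user_to_xstate()), the sketch below rejects a header whose feature
bits or reserved area could make a later XRSTOR #GP. The struct layout and
the feature mask value here are simplified assumptions, not the kernel's
definitions.

#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified 64-byte XSAVE header: feature bitmap, compaction bitmap,
 * and a reserved tail that must stay zero. */
struct xstate_header {
	uint64_t xfeatures;
	uint64_t xcomp_bv;
	uint64_t reserved[6];
};

/* Features user space is allowed to restore (assumed: x87 | SSE | AVX). */
static const uint64_t xfeatures_mask_user = 0x7;

static int validate_user_xstate_header(const struct xstate_header *hdr)
{
	size_t i;

	/* No unknown or supervisor feature bits may be set. */
	if (hdr->xfeatures & ~xfeatures_mask_user)
		return -EINVAL;

	/* The signal frame uses the non-compacted format only. */
	if (hdr->xcomp_bv)
		return -EINVAL;

	/* A non-zero reserved area makes XRSTOR raise #GP. */
	for (i = 0; i < sizeof(hdr->reserved) / sizeof(hdr->reserved[0]); i++)
		if (hdr->reserved[i])
			return -EINVAL;

	return 0;
}

int main(void)
{
	struct xstate_header bad = { .xfeatures = 0x7, .xcomp_bv = 1 };

	return validate_user_xstate_header(&bad) ? 0 : 1; /* expect -EINVAL */
}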

* [PATCH] x86/fpu: Reset state for all signal restore failures
  2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
                   ` (7 preceding siblings ...)
  2021-06-08 18:46 ` Rik van Riel
@ 2021-06-09 19:18 ` Thomas Gleixner
  2021-06-10  6:39   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  8 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-09 19:18 UTC (permalink / raw)
  To: LKML
  Cc: x86, Andy Lutomirski, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Borislav Petkov

If access_ok() or fpregs_soft_set() fails in __fpu__restore_sig() then the
function just returns but does not clear the FPU state as it does for all
other fatal failures.

Clear the FPU state for these failures as well.

Fixes: 72a671ced66d ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/fpu/signal.c |   24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -307,13 +307,17 @@ static int __fpu__restore_sig(void __use
 		return 0;
 	}
 
-	if (!access_ok(buf, size))
-		return -EACCES;
+	if (!access_ok(buf, size)) {
+		ret = -EACCES;
+		goto out;
+	}
 
-	if (!static_cpu_has(X86_FEATURE_FPU))
-		return fpregs_soft_set(current, NULL,
-				       0, sizeof(struct user_i387_ia32_struct),
-				       NULL, buf) != 0;
+	if (!static_cpu_has(X86_FEATURE_FPU)) {
+		ret = fpregs_soft_set(current, NULL, 0,
+				      sizeof(struct user_i387_ia32_struct),
+				      NULL, buf);
+		goto out;
+	}
 
 	if (use_xsave()) {
 		struct _fpx_sw_bytes fx_sw_user;
@@ -377,7 +381,7 @@ static int __fpu__restore_sig(void __use
 		 */
 		ret = __copy_from_user(&env, buf, sizeof(env));
 		if (ret)
-			goto err_out;
+			goto out;
 		envp = &env;
 	}
 
@@ -434,7 +438,7 @@ static int __fpu__restore_sig(void __use
 		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
 		if (ret) {
 			ret = -EFAULT;
-			goto err_out;
+			goto out;
 		}
 
 		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
@@ -452,7 +456,7 @@ static int __fpu__restore_sig(void __use
 	} else {
 		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)
-			goto err_out;
+			goto out;
 
 		fpregs_lock();
 		ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
@@ -463,7 +467,7 @@ static int __fpu__restore_sig(void __use
 		fpregs_deactivate(fpu);
 	fpregs_unlock();
 
-err_out:
+out:
 	if (ret)
 		fpu__clear_user_states(fpu);
 	return ret;

^ permalink raw reply	[flat|nested] 27+ messages in thread
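
The shape of the fix, reduced to a minimal sketch: every failure path,
including the early access_ok()/fpregs_soft_set() ones, must funnel through
the same exit label so the state-clearing cleanup can never be skipped. The
names below are placeholders, not kernel functions.

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

static void clear_user_fpstate(void)	/* think fpu__clear_user_states() */
{
	puts("FPU user state cleared");
}

static int restore_sig(bool buf_ok, bool copy_ok)
{
	int ret = 0;

	if (!buf_ok) {
		ret = -EACCES;		/* previously "return -EACCES",  */
		goto out;		/* which skipped the cleanup     */
	}

	if (!copy_ok) {
		ret = -EFAULT;
		goto out;
	}

out:
	if (ret)
		clear_user_fpstate();	/* now runs for *all* failures */
	return ret;
}

int main(void)
{
	return restore_sig(false, true) ? 0 : 1; /* early failure, cleanup runs */
}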

* [tip: x86/urgent] x86/fpu: Reset state for all signal restore failures
  2021-06-09 19:18 ` [PATCH] x86/fpu: Reset state for all signal restore failures Thomas Gleixner
@ 2021-06-10  6:39   ` tip-bot2 for Thomas Gleixner
  0 siblings, 0 replies; 27+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2021-06-10  6:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Thomas Gleixner, Borislav Petkov, stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     efa165504943f2128d50f63de0c02faf6dcceb0d
Gitweb:        https://git.kernel.org/tip/efa165504943f2128d50f63de0c02faf6dcceb0d
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 09 Jun 2021 21:18:00 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 10 Jun 2021 08:04:24 +02:00

x86/fpu: Reset state for all signal restore failures

If access_ok() or fpregs_soft_set() fails in __fpu__restore_sig() then the
function just returns but does not clear the FPU state as it does for all
other fatal failures.

Clear the FPU state for these failures as well.

Fixes: 72a671ced66d ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/87mtryyhhz.ffs@nanos.tec.linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 4ab9aeb..ec3ae30 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -307,13 +307,17 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		return 0;
 	}
 
-	if (!access_ok(buf, size))
-		return -EACCES;
+	if (!access_ok(buf, size)) {
+		ret = -EACCES;
+		goto out;
+	}
 
-	if (!static_cpu_has(X86_FEATURE_FPU))
-		return fpregs_soft_set(current, NULL,
-				       0, sizeof(struct user_i387_ia32_struct),
-				       NULL, buf) != 0;
+	if (!static_cpu_has(X86_FEATURE_FPU)) {
+		ret = fpregs_soft_set(current, NULL, 0,
+				      sizeof(struct user_i387_ia32_struct),
+				      NULL, buf);
+		goto out;
+	}
 
 	if (use_xsave()) {
 		struct _fpx_sw_bytes fx_sw_user;
@@ -396,7 +400,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		 */
 		ret = __copy_from_user(&env, buf, sizeof(env));
 		if (ret)
-			goto err_out;
+			goto out;
 		envp = &env;
 	}
 
@@ -426,7 +430,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 
 		ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
 		if (ret)
-			goto err_out;
+			goto out;
 
 		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
 					      fx_only);
@@ -446,7 +450,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
 		if (ret) {
 			ret = -EFAULT;
-			goto err_out;
+			goto out;
 		}
 
 		sanitize_restored_user_xstate(&fpu->state, envp, user_xfeatures,
@@ -464,7 +468,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	} else {
 		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)
-			goto err_out;
+			goto out;
 
 		fpregs_lock();
 		ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
@@ -475,7 +479,7 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		fpregs_deactivate(fpu);
 	fpregs_unlock();
 
-err_out:
+out:
 	if (ret)
 		fpu__clear_user_states(fpu);
 	return ret;

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads
  2021-06-08 14:36 ` [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads Thomas Gleixner
  2021-06-09 14:46   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
@ 2021-06-10 17:10   ` Andy Lutomirski
  2021-06-10 20:54     ` Thomas Gleixner
  1 sibling, 1 reply; 27+ messages in thread
From: Andy Lutomirski @ 2021-06-10 17:10 UTC (permalink / raw)
  To: Thomas Gleixner, Linux Kernel Mailing List
  Cc: the arch/x86 maintainers, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel,
	Borislav Petkov



On Tue, Jun 8, 2021, at 7:36 AM, Thomas Gleixner wrote:
> switch_fpu_finish() checks current->mm as indicator for kernel threads.
> That's wrong because kernel threads can temporarily use a mm of a user
> process via kthread_use_mm().
> 
> Check the task flags for PF_KTHREAD instead.
> 
> Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/include/asm/fpu/internal.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/arch/x86/include/asm/fpu/internal.h
> +++ b/arch/x86/include/asm/fpu/internal.h
> @@ -578,7 +578,7 @@ static inline void switch_fpu_finish(str
>  	 * PKRU state is switched eagerly because it needs to be valid before we
>  	 * return to userland e.g. for a copy_to_user() operation.
>  	 */
> -	if (current->mm) {
> +	if (!(current->flags & PF_KTHREAD)) {
>  		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
>  		if (pk)
>  			pkru_val = pk->pkru;
> 
> 

Why are we checking this at all?  I actually tend to agree with the ->mm check more than PF_anything. If we have a user address space, then PKRU matters. If we don’t, then it doesn’t.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads
  2021-06-10 17:10   ` [patch V3 3/6] " Andy Lutomirski
@ 2021-06-10 20:54     ` Thomas Gleixner
  2021-06-11  1:04       ` Andy Lutomirski
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2021-06-10 20:54 UTC (permalink / raw)
  To: Andy Lutomirski, Linux Kernel Mailing List
  Cc: the arch/x86 maintainers, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel,
	Borislav Petkov

On Thu, Jun 10 2021 at 10:10, Andy Lutomirski wrote:

> On Tue, Jun 8, 2021, at 7:36 AM, Thomas Gleixner wrote:
>> switch_fpu_finish() checks current->mm as indicator for kernel threads.
>> That's wrong because kernel threads can temporarily use a mm of a user
>> process via kthread_use_mm().
>> 
>> Check the task flags for PF_KTHREAD instead.
>> 
>> Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Rik van Riel <riel@surriel.com>
>> Cc: stable@vger.kernel.org
>> ---
>>  arch/x86/include/asm/fpu/internal.h |    2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> --- a/arch/x86/include/asm/fpu/internal.h
>> +++ b/arch/x86/include/asm/fpu/internal.h
>> @@ -578,7 +578,7 @@ static inline void switch_fpu_finish(str
>>  	 * PKRU state is switched eagerly because it needs to be valid before we
>>  	 * return to userland e.g. for a copy_to_user() operation.
>>  	 */
>> -	if (current->mm) {
>> +	if (!(current->flags & PF_KTHREAD)) {
>>  		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
>>  		if (pk)
>>  			pkru_val = pk->pkru;
>> 
>> 
> Why are we checking this at all?  I actually tend to agree with the
> ->mm check more than PF_anything. If we have a user address space,
> then PKRU matters. If we don’t, then it doesn’t.

Which PKRU matters? A kernel thread always has the default PKRU, no
matter whether it uses a mm or not. It _cannot_ borrow the PKRU from the
mm-owning process. There is no way to do that, so let's not pretend there
would be.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads
  2021-06-10 20:54     ` Thomas Gleixner
@ 2021-06-11  1:04       ` Andy Lutomirski
  0 siblings, 0 replies; 27+ messages in thread
From: Andy Lutomirski @ 2021-06-11  1:04 UTC (permalink / raw)
  To: Thomas Gleixner, Linux Kernel Mailing List
  Cc: the arch/x86 maintainers, Dave Hansen, Fenghua Yu, Tony Luck,
	Yu-cheng Yu, Sebastian Andrzej Siewior, Rik van Riel,
	Borislav Petkov

On 6/10/21 1:54 PM, Thomas Gleixner wrote:
> On Thu, Jun 10 2021 at 10:10, Andy Lutomirski wrote:
> 
>> On Tue, Jun 8, 2021, at 7:36 AM, Thomas Gleixner wrote:
>>> switch_fpu_finish() checks current->mm as indicator for kernel threads.
>>> That's wrong because kernel threads can temporarily use a mm of a user
>>> process via kthread_use_mm().
>>>
>>> Check the task flags for PF_KTHREAD instead.
>>>
>>> Fixes: 0cecca9d03c9 ("x86/fpu: Eager switch PKRU state")
>>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Rik van Riel <riel@surriel.com>
>>> Cc: stable@vger.kernel.org
>>> ---
>>>  arch/x86/include/asm/fpu/internal.h |    2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> --- a/arch/x86/include/asm/fpu/internal.h
>>> +++ b/arch/x86/include/asm/fpu/internal.h
>>> @@ -578,7 +578,7 @@ static inline void switch_fpu_finish(str
>>>  	 * PKRU state is switched eagerly because it needs to be valid before we
>>>  	 * return to userland e.g. for a copy_to_user() operation.
>>>  	 */
>>> -	if (current->mm) {
>>> +	if (!(current->flags & PF_KTHREAD)) {
>>>  		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
>>>  		if (pk)
>>>  			pkru_val = pk->pkru;
>>>
>>>
>> Why are we checking this at all?  I actually tend to agree with the
>> ->mm check more than PF_anything. If we have a user address space,
>> then PKRU matters. If we don’t, then it doesn’t.
> 
> Which PKRU matters? A kernel thread has always the default PKRU no
> matter whether it uses a mm or not. It _cannot_ borrow the PKRU from the
> mm owning process. There is no way, so let's not pretend there would be.
> 

Hmm.  I guess PF_KTHREAD is consistent with switch_fpu_prepare() --
kernel threads have no FPU state.

It might be worth a loud comment here that kernel threads' PKRU is not
context switched and that, if anyone wants kthread_use_mm() users to use
anything other than the default PKRU, they will need to change this.

So I guess your patch is okay.

--Andy

^ permalink raw reply	[flat|nested] 27+ messages in thread
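
To make the disagreement concrete, a user-space sketch of the case the
patch is about: a kernel thread that has temporarily adopted a user mm via
kthread_use_mm(). The struct fields mimic the kernel's task_struct and the
PF_KTHREAD value matches the kernel flag; everything else is a stand-in.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define PF_KTHREAD 0x00200000		/* "I am a kernel thread" */

struct mm_struct { int dummy; };

struct task_struct {
	unsigned int flags;
	struct mm_struct *mm;
};

/* Old check in switch_fpu_finish(): "has an mm, so it is a user task". */
static bool user_task_by_mm(const struct task_struct *t)
{
	return t->mm != NULL;
}

/* New check: anything that is not a kernel thread carries its own PKRU. */
static bool user_task_by_flag(const struct task_struct *t)
{
	return !(t->flags & PF_KTHREAD);
}

int main(void)
{
	struct mm_struct user_mm;
	/* A kthread inside a kthread_use_mm()/kthread_unuse_mm() section: */
	struct task_struct kworker = { .flags = PF_KTHREAD, .mm = &user_mm };

	printf("->mm check treats kworker as user task:       %d\n",
	       user_task_by_mm(&kworker));
	printf("PF_KTHREAD check treats kworker as user task: %d\n",
	       user_task_by_flag(&kworker));
	return 0;
}

The point of the sketch is only that ->mm is non-NULL inside that window,
so it cannot distinguish a kernel thread from a user task there.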

end of thread, other threads:[~2021-06-11  1:04 UTC | newest]

Thread overview: 27+ messages
2021-06-08 14:36 [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Thomas Gleixner
2021-06-08 14:36 ` [patch V3 1/6] x86/fpu: Prevent state corruption in __fpu__restore_sig() Thomas Gleixner
2021-06-09 14:46   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2021-06-08 14:36 ` [patch V3 2/6] x86/fpu: Invalidate FPU state after a failed XRSTOR from a user buffer Thomas Gleixner
2021-06-09 14:46   ` [tip: x86/urgent] " tip-bot2 for Andy Lutomirski
2021-06-08 14:36 ` [patch V3 3/6] x86/process: Check PF_KTHREAD and not current->mm for kernel threads Thomas Gleixner
2021-06-09 14:46   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2021-06-10 17:10   ` [patch V3 3/6] " Andy Lutomirski
2021-06-10 20:54     ` Thomas Gleixner
2021-06-11  1:04       ` Andy Lutomirski
2021-06-08 14:36 ` [patch V3 4/6] x86/pkru: Make PKRU=0 actually work Thomas Gleixner
2021-06-08 15:40   ` Borislav Petkov
2021-06-08 19:15     ` Thomas Gleixner
2021-06-08 20:06       ` Borislav Petkov
2021-06-08 16:06   ` Dave Hansen
2021-06-08 19:06     ` Thomas Gleixner
2021-06-08 21:37   ` Babu Moger
2021-06-09 14:46   ` [tip: x86/urgent] x86/pkru: Write hardware init value to PKRU when xstate is init tip-bot2 for Thomas Gleixner
2021-06-08 14:36 ` [patch V3 5/6] x86/fpu: Add address range checks to copy_user_to_xstate() Thomas Gleixner
2021-06-09 12:56   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
2021-06-08 14:36 ` [patch V3 6/6] selftests/x86: Test signal frame XSTATE header corruption handling Thomas Gleixner
2021-06-09  8:38   ` David Edmondson
2021-06-09 12:56   ` [tip: x86/fpu] " tip-bot2 for Andy Lutomirski
2021-06-08 16:08 ` [patch V3 0/6] x86/fpu: Mop up XSAVES and related damage Dave Hansen
2021-06-08 18:46 ` Rik van Riel
2021-06-09 19:18 ` [PATCH] x86/fpu: Reset state for all signal restore failures Thomas Gleixner
2021-06-10  6:39   ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
