KVM Archive on lore.kernel.org
* [PATCH v9 00/27] x86: load FPU registers on return to userland
@ 2019-04-03 16:41 Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 01/27] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig() Sebastian Andrzej Siewior
                   ` (29 more replies)
  0 siblings, 30 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

This is a refurbished series originally started by Rik van Riel. The
goal is to load the FPU registers on return to userland rather than on
every context switch. With this optimisation we can:
- avoid loading the registers if the task stays in the kernel and does
  not return to userland
- make kernel_fpu_begin() cheaper: it only saves the registers on the
  first invocation; a second invocation does not need to save them
  again (see the usage sketch below).
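
The beneficiary is the usual kernel-mode FPU section; a minimal example
of the existing API:

	kernel_fpu_begin();
	/* ... use SIMD/FPU instructions in kernel code ... */
	kernel_fpu_end();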

To access the FPU registers in kernel mode we need to:
- disable preemption so that the scheduler does not switch tasks; a
  task switch would set TIF_NEED_FPU_LOAD and the FPU registers would
  no longer be valid.
- disable BH because a softirq might use kernel_fpu_begin() and then
  set TIF_NEED_FPU_LOAD instead of loading the FPU registers on
  completion. A sketch of this access pattern is shown after the list.
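
A minimal sketch, assuming the raw preempt/BH primitives (the series
later introduces fpregs_lock()/fpregs_unlock() helpers around this):

	/*
	 * Access the current task's FPU register state from kernel mode.
	 * BH off: a softirq cannot run kernel_fpu_begin() underneath us.
	 * Preemption off: a task switch cannot set TIF_NEED_FPU_LOAD.
	 */
	local_bh_disable();
	preempt_disable();
	/* ... use/modify the task's FPU registers here ... */
	preempt_enable();
	local_bh_enable();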

v8…v9:
Rebased on top of v5.1-rc3 plus:
- using the proper register pointer in __fpu__restore_sig() for the
  fsave case.

- dropped a WARN_ON() from switch_fpu_finish() if the PKRU state is
  missing. On a preemptible kernel this warning triggers if the task
  gets preempted in fpstate_init(), between the memset() and
  fpstate_init_xstate().

- Added a fastpath to __fpu__restore_sig() and
  copy_fpstate_to_sigframe() which saves/restores the FPU state
  directly to/from userland. Thomas Gleixner asked for this. The series
  implements the slowpath first and the last few patches add/enable the
  fastpath. This ensures that the slowpath works correctly (during a
  bisect); once the fastpath is enabled, the slowpath should be rarely
  used.

Some numbers generated by 
     taskset 2 lat_sig -P 1 -W 64 -N 5000 catch

v5.1-rc3
 catch 64
 Signal handler overhead: 1.7531 microseconds
 Signal handler overhead: 1.7547 microseconds
 Signal handler overhead: 1.7362 microseconds
 catch 32
 Signal handler overhead: 2.4044 microseconds
 Signal handler overhead: 2.3732 microseconds
 Signal handler overhead: 2.3979 microseconds

up to x86/fpu: Defer FPU state load until return to userspace
 catch 64
 Signal handler overhead: 2.7556 microseconds
 Signal handler overhead: 2.7878 microseconds
 Signal handler overhead: 2.7664 microseconds
 catch 32
 Signal handler overhead: 2.5961 microseconds
 Signal handler overhead: 2.5916 microseconds
 Signal handler overhead: 2.6091 microseconds

up to x86/fpu: Add a fastpath to copy_fpstate_to_sigframe()
 catch 64
 Signal handler overhead: 1.8573 microseconds
 Signal handler overhead: 1.8469 microseconds
 Signal handler overhead: 1.8554 microseconds
 catch 32
 Signal handler overhead: 2.4875 microseconds
 Signal handler overhead: 2.4822 microseconds
 Signal handler overhead: 2.4806 microseconds
 ----
complete series
 catch 64
 Signal handler overhead: 1.8463 microseconds
 Signal handler overhead: 1.8431 microseconds
 Signal handler overhead: 1.8609 microseconds
 catch 32
 Signal handler overhead: 2.4907 microseconds
 Signal handler overhead: 2.4771 microseconds
 Signal handler overhead: 2.4939 microseconds

v7…v8:
Rebased on top of v5.1-rc1. And then:
- Removed a WARN_ON() in switch_fpu_finish(). It turns out it can
  trigger if the task is preempted while the xstate is being
  initialized.

- Provide __read_pkru_ins() and __write_pkru_ins(), which are
  symmetrical. __write_pkru() keeps the "write only if the value is
  different from the current one" check.

- Kernel threads now load `init_pkru_value' instead of `0' as the PKRU
  value.

- The PKRU value is also written into its xsave area of init_fpstate.

v6…v7:
Rebased on top of v5.0-rc7 and addressed Borislav Petkov's review.

v5…v6:
Rebased on top of v5.0-rc1. Integrated a few fixes which I noticed while
looking over the patches, dropped the first few patches which were
already applied.

v4…v5:
Rebased on top of a fix, noticed a problem with XSAVES and then redid
the restore on sig return (patch #26 to #28).

I don't much like the sig save+restore thing that we are doing. It has
always been like that. I *think* this is just because we have nowhere
to stash the FPU state while we are handling the signal. We could add
another fpu->state for the signal handler and avoid the whole thing.
A Debian code search revealed that `criu' is using it (and I didn't
figure out why); nothing else (that is packaged in Debian). Maybe we
could get rid of this if `criu' used a dedicated interface for its
needs rather than the signal interface that happens to do what it
wants :)

v3…v4:
It has been suggested to remove the `initialized' member of the struct
fpu because it should not be needed with lazy-FPU-restore, and removing
it would make the review easier. This is the first part of the series;
the second is basically a rebase of the v3 queue. As a result, the
diffstat became negative (which wasn't the case in the previous
version) :)
I tried to incorporate all the review comments that came up; some of
them were "outdated" after the removal of the `initialized' member. I'm
sorry if I missed any.

v1…v3:
v2 was never posted. I followed the idea to completely decouple PKRU
from xstate. This didn't quite work and complicated a few things.
One obviously required fixup is copy_fpstate_to_sigframe(), where the
PKRU state needs to be fiddled into xstate. This required another
xfeatures_mask so that the sanity checks were performed and
xstate_offsets would be computed. Additionally, ptrace also reads/sets
xstate in order to get/set the registers, and PKRU is one of them, so
this would need some fiddling, too.
In v3 I dropped that decoupling idea. I also learned that the wrpkru
instruction is not privileged, so caching the value in the kernel does
not work. Instead I keep PKRU in the xstate area and load it at context
switch time while the remaining registers are deferred (until return to
userland). The offset of PKRU within xstate is enumerated at boot time,
so why not use it. A sketch of the PKRU instructions is shown below.
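
For reference, a hedged sketch of the unprivileged PKRU accessors
(instruction encodings per the SDM; this series adds the real helpers
as __read_pkru_ins()/__write_pkru_ins()). Since userland can execute
WRPKRU at any time, a kernel-side cache of the value could go stale:

	static inline u32 rdpkru(void)
	{
		u32 ecx = 0, edx, pkru;

		/* RDPKRU: places PKRU in EAX, clobbers EDX, needs ECX=0 */
		asm volatile(".byte 0x0f,0x01,0xee"
			     : "=a" (pkru), "=d" (edx) : "c" (ecx));
		return pkru;
	}

	static inline void wrpkru(u32 pkru)
	{
		u32 ecx = 0, edx = 0;

		/* WRPKRU: writes EAX to PKRU, needs ECX=EDX=0 */
		asm volatile(".byte 0x0f,0x01,0xef"
			     : : "a" (pkru), "c" (ecx), "d" (edx));
	}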

The following changes since commit 79a3aaa7b82e3106be97842dedfd8429248896e6:

  Linux 5.1-rc3 (2019-03-31 14:39:29 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bigeasy/staging.git x86_fpu_rtu_v9

for you to fetch changes up to c5265b05d167b3be19c787ebb913dc138485fa25:

  x86/pkeys: add PKRU value to init_fpstate (2019-04-03 18:19:10 +0200)

----------------------------------------------------------------
Rik van Riel (5):
      x86/fpu: Add (__)make_fpregs_active helpers
      x86/fpu: Eager switch PKRU state
      x86/fpu: Always store the registers in copy_fpstate_to_sigframe()
      x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD
      x86/fpu: Defer FPU state load until return to userspace

Sebastian Andrzej Siewior (22):
      x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig()
      x86/fpu: Remove fpu__restore()
      x86/fpu: Remove preempt_disable() in fpu__clear()
      x86/fpu: Always init the `state' in fpu__clear()
      x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe()
      x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()
      x86/fpu: Remove fpu->initialized
      x86/fpu: Remove user_fpu_begin()
      x86/fpu: Make __raw_xsave_addr() use feature number instead of mask
      x86/fpu: Make get_xsave_field_ptr() and get_xsave_addr() use feature number instead of mask
      x86/pkru: Provide .*_pkru_ins() functions
      x86/fpu: Only write PKRU if it is different from current
      x86/pkeys: Don't check if PKRU is zero before writting it
      x86/entry: Add TIF_NEED_FPU_LOAD
      x86/fpu: Update xstate's PKRU value on write_pkru()
      x86/fpu: Inline copy_user_to_fpregs_zeroing()
      x86/fpu: Let __fpu__restore_sig() restore the !32bit+fxsr frame from kernel memory
      x86/fpu: Merge the two code paths in __fpu__restore_sig()
      x86/fpu: Add a fastpath to __fpu__restore_sig()
      x86/fpu: Add a fastpath to copy_fpstate_to_sigframe()
      x86/fpu: Restore FPU register in copy_fpstate_to_sigframe() in order to use the fastpath
      x86/pkeys: add PKRU value to init_fpstate

 Documentation/preempt-locking.txt    |   1 -
 arch/x86/entry/common.c              |   8 ++
 arch/x86/ia32/ia32_signal.c          |  17 ++-
 arch/x86/include/asm/fpu/api.h       |  31 ++++++
 arch/x86/include/asm/fpu/internal.h  | 130 ++++++++++++++++------
 arch/x86/include/asm/fpu/signal.h    |   2 +-
 arch/x86/include/asm/fpu/types.h     |   9 --
 arch/x86/include/asm/fpu/xstate.h    |   5 +-
 arch/x86/include/asm/pgtable.h       |  28 ++++-
 arch/x86/include/asm/special_insns.h |  18 +++-
 arch/x86/include/asm/thread_info.h   |   2 +
 arch/x86/include/asm/trace/fpu.h     |   8 +-
 arch/x86/kernel/cpu/common.c         |   5 +
 arch/x86/kernel/fpu/core.c           | 194 ++++++++++++++++-----------------
 arch/x86/kernel/fpu/init.c           |   2 -
 arch/x86/kernel/fpu/regset.c         |  24 ++---
 arch/x86/kernel/fpu/signal.c         | 202 +++++++++++++++++++++--------------
 arch/x86/kernel/fpu/xstate.c         |  42 ++++----
 arch/x86/kernel/process.c            |   2 +-
 arch/x86/kernel/process_32.c         |  11 +-
 arch/x86/kernel/process_64.c         |  11 +-
 arch/x86/kernel/signal.c             |  17 ++-
 arch/x86/kernel/traps.c              |   2 +-
 arch/x86/kvm/vmx/vmx.c               |   2 +-
 arch/x86/kvm/x86.c                   |  48 +++++----
 arch/x86/math-emu/fpu_entry.c        |   3 -
 arch/x86/mm/mpx.c                    |   6 +-
 arch/x86/mm/pkeys.c                  |  21 ++--
 28 files changed, 496 insertions(+), 355 deletions(-)

Sebastian


* [PATCH 01/27] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:46   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 02/27] x86/fpu: Remove fpu__restore() Sebastian Andrzej Siewior
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior, Borislav Petkov

This is a preparation for the removal of the ->initialized member in the
fpu struct.
__fpu__restore_sig() deactivates the FPU via fpu__drop(), manually sets
->initialized and then invokes fpu__restore(). As a result it is
possible to manipulate fpu->state while the register state is not
saved/restored on a context switch, which would otherwise overwrite
fpu->state.

Don't access fpu->state while the content is read from user space and
examined/sanitized. Use a temporary kmalloc() buffer to prepare the FPU
register state and load it only once the state is considered okay.
Should something go wrong, return with an error and without altering
the original FPU registers.

The removal of "fpu__initialize()" is a nop because fpu->initialized is
already set for the user task.
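
For illustration, a condensed sketch of the new flow (names as used in
the patch below; XRSTOR requires a 64-byte-aligned save area, hence the
over-allocation and PTR_ALIGN()):

	void *tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64,
			    GFP_KERNEL);
	union fpregs_state *state = PTR_ALIGN(tmp, 64);

	/* copy from userland into the temporary buffer ... */
	err = copy_user_to_xstate(&state->xsave, buf_fx);
	if (!err) {
		/* ... and load it only after it has been sanitized */
		sanitize_restored_xstate(state, &env, xfeatures, fx_only);
		copy_kernel_to_fpregs(state);
	}
	kfree(tmp);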

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/signal.h |  2 +-
 arch/x86/kernel/fpu/regset.c      |  5 ++--
 arch/x86/kernel/fpu/signal.c      | 40 ++++++++++++-------------------
 3 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
index 44bbc39a57b30..7fb516b6893a8 100644
--- a/arch/x86/include/asm/fpu/signal.h
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -22,7 +22,7 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
 
 extern void convert_from_fxsr(struct user_i387_ia32_struct *env,
 			      struct task_struct *tsk);
-extern void convert_to_fxsr(struct task_struct *tsk,
+extern void convert_to_fxsr(struct fxregs_state *fxsave,
 			    const struct user_i387_ia32_struct *env);
 
 unsigned long
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index bc02f5144b958..5dbc099178a88 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -269,11 +269,10 @@ convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
 		memcpy(&to[i], &from[i], sizeof(to[0]));
 }
 
-void convert_to_fxsr(struct task_struct *tsk,
+void convert_to_fxsr(struct fxregs_state *fxsave,
 		     const struct user_i387_ia32_struct *env)
 
 {
-	struct fxregs_state *fxsave = &tsk->thread.fpu.state.fxsave;
 	struct _fpreg *from = (struct _fpreg *) &env->st_space[0];
 	struct _fpxreg *to = (struct _fpxreg *) &fxsave->st_space[0];
 	int i;
@@ -350,7 +349,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 
 	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
 	if (!ret)
-		convert_to_fxsr(target, &env);
+		convert_to_fxsr(&target->thread.fpu.state.fxsave, &env);
 
 	/*
 	 * update the header bit in the xsave header, indicating the
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index f6a1d299627c5..a874931edf6a9 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -207,11 +207,11 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 }
 
 static inline void
-sanitize_restored_xstate(struct task_struct *tsk,
+sanitize_restored_xstate(union fpregs_state *state,
 			 struct user_i387_ia32_struct *ia32_env,
 			 u64 xfeatures, int fx_only)
 {
-	struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
+	struct xregs_state *xsave = &state->xsave;
 	struct xstate_header *header = &xsave->header;
 
 	if (use_xsave()) {
@@ -238,7 +238,7 @@ sanitize_restored_xstate(struct task_struct *tsk,
 		 */
 		xsave->i387.mxcsr &= mxcsr_feature_mask;
 
-		convert_to_fxsr(tsk, ia32_env);
+		convert_to_fxsr(&state->fxsave, ia32_env);
 	}
 }
 
@@ -284,8 +284,6 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	if (!access_ok(buf, size))
 		return -EACCES;
 
-	fpu__initialize(fpu);
-
 	if (!static_cpu_has(X86_FEATURE_FPU))
 		return fpregs_soft_set(current, NULL,
 				       0, sizeof(struct user_i387_ia32_struct),
@@ -315,40 +313,32 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		 * header. Validate and sanitize the copied state.
 		 */
 		struct user_i387_ia32_struct env;
+		union fpregs_state *state;
 		int err = 0;
+		void *tmp;
 
-		/*
-		 * Drop the current fpu which clears fpu->initialized. This ensures
-		 * that any context-switch during the copy of the new state,
-		 * avoids the intermediate state from getting restored/saved.
-		 * Thus avoiding the new restored state from getting corrupted.
-		 * We will be ready to restore/save the state only after
-		 * fpu->initialized is again set.
-		 */
-		fpu__drop(fpu);
+		tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
+		if (!tmp)
+			return -ENOMEM;
+		state = PTR_ALIGN(tmp, 64);
 
 		if (using_compacted_format()) {
-			err = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
+			err = copy_user_to_xstate(&state->xsave, buf_fx);
 		} else {
-			err = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);
+			err = __copy_from_user(&state->xsave, buf_fx, state_size);
 
 			if (!err && state_size > offsetof(struct xregs_state, header))
-				err = validate_xstate_header(&fpu->state.xsave.header);
+				err = validate_xstate_header(&state->xsave.header);
 		}
 
 		if (err || __copy_from_user(&env, buf, sizeof(env))) {
-			fpstate_init(&fpu->state);
-			trace_x86_fpu_init_state(fpu);
 			err = -1;
 		} else {
-			sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
+			sanitize_restored_xstate(state, &env, xfeatures, fx_only);
+			copy_kernel_to_fpregs(state);
 		}
 
-		local_bh_disable();
-		fpu->initialized = 1;
-		fpu__restore(fpu);
-		local_bh_enable();
-
+		kfree(tmp);
 		return err;
 	} else {
 		/*
-- 
2.20.1


* [PATCH 02/27] x86/fpu: Remove fpu__restore()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 01/27] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:47   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 03/27] x86/fpu: Remove preempt_disable() in fpu__clear() Sebastian Andrzej Siewior
                   ` (27 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

There are no users of fpu__restore() so it is time to remove it.
The comment regarding fpu__restore() and the TS bit has been stale since
commit
  b3b0870ef3ffe ("i387: do not preload FPU state at task switch time")
and has had no meaning since then.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 Documentation/preempt-locking.txt   |  1 -
 arch/x86/include/asm/fpu/internal.h |  1 -
 arch/x86/kernel/fpu/core.c          | 24 ------------------------
 arch/x86/kernel/process_32.c        |  4 +---
 arch/x86/kernel/process_64.c        |  4 +---
 5 files changed, 2 insertions(+), 32 deletions(-)

diff --git a/Documentation/preempt-locking.txt b/Documentation/preempt-locking.txt
index 509f5a422d571..dce336134e54a 100644
--- a/Documentation/preempt-locking.txt
+++ b/Documentation/preempt-locking.txt
@@ -52,7 +52,6 @@ preemption must be disabled around such regions.
 
 Note, some FPU functions are already explicitly preempt safe.  For example,
 kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
-However, fpu__restore() must be called with preemption disabled.
 
 
 RULE #3: Lock acquire and release must be performed by same task
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index fb04a3ded7ddb..75a1d5f712ee7 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -28,7 +28,6 @@ extern void fpu__initialize(struct fpu *fpu);
 extern void fpu__prepare_read(struct fpu *fpu);
 extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
-extern void fpu__restore(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
 extern int  fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2e5003fef51a9..1d3ae7988f7f2 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -303,30 +303,6 @@ void fpu__prepare_write(struct fpu *fpu)
 	}
 }
 
-/*
- * 'fpu__restore()' is called to copy FPU registers from
- * the FPU fpstate to the live hw registers and to activate
- * access to the hardware registers, so that FPU instructions
- * can be used afterwards.
- *
- * Must be called with kernel preemption disabled (for example
- * with local interrupts disabled, as it is in the case of
- * do_device_not_available()).
- */
-void fpu__restore(struct fpu *fpu)
-{
-	fpu__initialize(fpu);
-
-	/* Avoid __kernel_fpu_begin() right after fpregs_activate() */
-	kernel_fpu_disable();
-	trace_x86_fpu_before_restore(fpu);
-	fpregs_activate(fpu);
-	copy_kernel_to_fpregs(&fpu->state);
-	trace_x86_fpu_after_restore(fpu);
-	kernel_fpu_enable();
-}
-EXPORT_SYMBOL_GPL(fpu__restore);
-
 /*
  * Drops current FPU state: deactivates the fpregs and
  * the fpstate. NOTE: it still leaves previous contents
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index e471d8e6f0b24..7888a41a03cdb 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -267,9 +267,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/*
 	 * Leave lazy mode, flushing any hypercalls made here.
 	 * This must be done before restoring TLS segments so
-	 * the GDT and LDT are properly updated, and must be
-	 * done before fpu__restore(), so the TS bit is up
-	 * to date.
+	 * the GDT and LDT are properly updated.
 	 */
 	arch_end_context_switch(next_p);
 
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 6a62f4af9fcf7..e1983b3a16c43 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -538,9 +538,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/*
 	 * Leave lazy mode, flushing any hypercalls made here.  This
 	 * must be done after loading TLS entries in the GDT but before
-	 * loading segments that might reference them, and and it must
-	 * be done before fpu__restore(), so the TS bit is up to
-	 * date.
+	 * loading segments that might reference them.
 	 */
 	arch_end_context_switch(next_p);
 
-- 
2.20.1


* [PATCH 03/27] x86/fpu: Remove preempt_disable() in fpu__clear()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 01/27] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig() Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 02/27] x86/fpu: Remove fpu__restore() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:48   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 04/27] x86/fpu: Always init the `state' " Sebastian Andrzej Siewior
                   ` (26 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior, Borislav Petkov

The preempt_disable() section was introduced in commit

  a10b6a16cdad8 ("x86/fpu: Make the fpu state change in fpu__clear() scheduler-atomic")

and it was said to be temporary.

fpu__initialize() initializes the FPU struct to its "init" value and
then sets ->initialized to 1. The last part is the important one.
The content of the `state' does not matter because the registers get
set via copy_init_fpstate_to_fpregs() anyway.
A preemption here has little effect because the registers will always
be set to the same content afterwards by copy_init_fpstate_to_fpregs().
A softirq with a kernel_fpu_begin() section could also force the FPU
registers to be saved after fpu__initialize(), without changing the
outcome here.

Remove the preempt_disable() section in fpu__clear(); preemption here
does not hurt.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1d3ae7988f7f2..1940319268aef 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -366,11 +366,9 @@ void fpu__clear(struct fpu *fpu)
 	 * Make sure fpstate is cleared and initialized.
 	 */
 	if (static_cpu_has(X86_FEATURE_FPU)) {
-		preempt_disable();
 		fpu__initialize(fpu);
 		user_fpu_begin();
 		copy_init_fpstate_to_fpregs();
-		preempt_enable();
 	}
 }
 
-- 
2.20.1


* [PATCH 04/27] x86/fpu: Always init the `state' in fpu__clear()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (2 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 03/27] x86/fpu: Remove preempt_disable() in fpu__clear() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:48   ` [tip:x86/fpu] x86/fpu: Always init the state " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 05/27] x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
                   ` (25 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior, Borislav Petkov

fpu__clear() only initializes the `state' if the FPU is present. This
initialisation is also required on FPU-less systems and currently takes
place in math_emulate(). Since fpu__initialize() only performs the
initialization if ->initialized is zero, it does not matter that it is
invoked each time an opcode is emulated. Removing ->initialized becomes
easier if the struct is also initialized in the FPU-less case at the
same place.

Move fpu__initialize() before the FPU check so it is also performed in
the FPU-less case.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h | 1 -
 arch/x86/kernel/fpu/core.c          | 5 ++---
 arch/x86/math-emu/fpu_entry.c       | 3 ---
 3 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 75a1d5f712ee7..70ecb7c032cb4 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -24,7 +24,6 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__initialize(struct fpu *fpu);
 extern void fpu__prepare_read(struct fpu *fpu);
 extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1940319268aef..e43296854e379 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -223,7 +223,7 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
  * Activate the current task's in-memory FPU context,
  * if it has not been used before:
  */
-void fpu__initialize(struct fpu *fpu)
+static void fpu__initialize(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
@@ -236,7 +236,6 @@ void fpu__initialize(struct fpu *fpu)
 		fpu->initialized = 1;
 	}
 }
-EXPORT_SYMBOL_GPL(fpu__initialize);
 
 /*
  * This function must be called before we read a task's fpstate.
@@ -365,8 +364,8 @@ void fpu__clear(struct fpu *fpu)
 	/*
 	 * Make sure fpstate is cleared and initialized.
 	 */
+	fpu__initialize(fpu);
 	if (static_cpu_has(X86_FEATURE_FPU)) {
-		fpu__initialize(fpu);
 		user_fpu_begin();
 		copy_init_fpstate_to_fpregs();
 	}
diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index 9e2ba7e667f61..a873da6b46d6b 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -113,9 +113,6 @@ void math_emulate(struct math_emu_info *info)
 	unsigned long code_base = 0;
 	unsigned long code_limit = 0;	/* Initialized to stop compiler warnings */
 	struct desc_struct code_descriptor;
-	struct fpu *fpu = &current->thread.fpu;
-
-	fpu__initialize(fpu);
 
 #ifdef RE_ENTRANT_CHECKING
 	if (emulating) {
-- 
2.20.1


* [PATCH 05/27] x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (3 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 04/27] x86/fpu: Always init the `state' " Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:49   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 06/27] x86/fpu: Don't save fxregs for ia32 frames " Sebastian Andrzej Siewior
                   ` (24 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

With lazy-FPU support, the variable now named ->initialized was set to
true if the CPU's FPU registers held a valid state for the active
process. If it was false, the FPU state was saved in fpu->state and the
FPU was deactivated.
With lazy-FPU gone, ->initialized is always true for user threads, and
kernel threads never invoke this function, so ->initialized is always
true in copy_fpstate_to_sigframe().
The using_compacted_format() check is also a leftover from the lazy-FPU
time: in the `->initialized == false' case copy_to_user() would copy
the compacted buffer while userland expects the non-compacted format
instead. So, in order to save the FPU state in the non-compacted form,
the xsave opcode is issued to save the *current* FPU state.
Since the FPU is deactivated, that attempt raises an FPU trap; the trap
handler restores the FPU content and re-enables the FPU, and the xsave
opcode is invoked again and succeeds. *This* no longer works since
commit

  bef8b6da9522 ("x86/fpu: Handle #NM without FPU emulation as an error")

Remove the check for ->initialized because it is always true, along
with the false branch. Update the comment to reflect that the state is
always live.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 35 ++++++++---------------------------
 1 file changed, 8 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a874931edf6a9..de83d0ed9e14e 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -144,9 +144,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  *	buf == buf_fx for 64-bit frames and 32-bit fsave frame.
  *	buf != buf_fx for 32-bit frames with fxstate.
  *
- * If the fpu, extended register state is live, save the state directly
- * to the user frame pointed by the aligned pointer 'buf_fx'. Otherwise,
- * copy the thread's fpu state to the user frame starting at 'buf_fx'.
+ * Save the state directly to the user frame pointed by the aligned pointer
+ * 'buf_fx'.
  *
  * If this is a 32-bit frame with fxstate, put a fsave header before
  * the aligned state at 'buf_fx'.
@@ -157,7 +156,6 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
 int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 {
 	struct fpu *fpu = &current->thread.fpu;
-	struct xregs_state *xsave = &fpu->state.xsave;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
 
@@ -172,29 +170,12 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	if (fpu->initialized || using_compacted_format()) {
-		/* Save the live register state to the user directly. */
-		if (copy_fpregs_to_sigframe(buf_fx))
-			return -1;
-		/* Update the thread's fxstate to save the fsave header. */
-		if (ia32_fxstate)
-			copy_fxregs_to_kernel(fpu);
-	} else {
-		/*
-		 * It is a *bug* if kernel uses compacted-format for xsave
-		 * area and we copy it out directly to a signal frame. It
-		 * should have been handled above by saving the registers
-		 * directly.
-		 */
-		if (boot_cpu_has(X86_FEATURE_XSAVES)) {
-			WARN_ONCE(1, "x86/fpu: saving compacted-format xsave area to a signal frame!\n");
-			return -1;
-		}
-
-		fpstate_sanitize_xstate(fpu);
-		if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
-			return -1;
-	}
+	/* Save the live register state to the user directly. */
+	if (copy_fpregs_to_sigframe(buf_fx))
+		return -1;
+	/* Update the thread's fxstate to save the fsave header. */
+	if (ia32_fxstate)
+		copy_fxregs_to_kernel(fpu);
 
 	/* Save the fsave header for the 32-bit frames. */
 	if ((ia32_fxstate || !use_fxsr()) && save_fsave_header(tsk, buf))
-- 
2.20.1


* [PATCH 06/27] x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (4 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 05/27] x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:50   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 07/27] x86/fpu: Remove fpu->initialized Sebastian Andrzej Siewior
                   ` (23 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

In commit

  72a671ced66db ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels")

the 32bit and 64bit path of the signal delivery code were merged. The 32bit version:
|int save_i387_xstate_ia32(void __user *buf)
|…
|       if (cpu_has_xsave)
|               return save_i387_xsave(fp);
|       if (cpu_has_fxsr)
|               return save_i387_fxsave(fp);

The 64bit version:
|int save_i387_xstate(void __user *buf)
|…
|       if (user_has_fpu()) {
|               if (use_xsave())
|                       err = xsave_user(buf);
|               else
|                       err = fxsave_user(buf);
|
|               if (unlikely(err)) {
|                       __clear_user(buf, xstate_size);
|                       return err;

The merge:
|int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
|…
|       if (user_has_fpu()) {
|               /* Save the live register state to the user directly. */
|               if (save_user_xstate(buf_fx))
|                       return -1;
|               /* Update the thread's fxstate to save the fsave header. */
|               if (ia32_fxstate)
|                       fpu_fxsave(&tsk->thread.fpu);

I don't think that we needed to save the FPU registers to ->thread.fpu
because the registers were already stored in `buf_fx'. Today the state
is restored from `buf_fx' after the signal has been handled (I assume
this was also the case with lazy-FPU). Since commit

  66463db4fc560 ("x86, fpu: shift drop_init_fpu() from save_xstate_sig() to handle_signal()")

it is ensured that the signal handler starts with a clear/fresh set of
FPU registers, which means that the previous store is futile.

Remove the copy_fxregs_to_kernel() invocation because the task's FPU
state is cleared later in handle_signal() via fpu__clear().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index de83d0ed9e14e..2f044021fde2b 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -155,7 +155,6 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  */
 int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 {
-	struct fpu *fpu = &current->thread.fpu;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
 
@@ -173,9 +172,6 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 	/* Save the live register state to the user directly. */
 	if (copy_fpregs_to_sigframe(buf_fx))
 		return -1;
-	/* Update the thread's fxstate to save the fsave header. */
-	if (ia32_fxstate)
-		copy_fxregs_to_kernel(fpu);
 
 	/* Save the fsave header for the 32-bit frames. */
 	if ((ia32_fxstate || !use_fxsr()) && save_fsave_header(tsk, buf))
-- 
2.20.1


* [PATCH 07/27] x86/fpu: Remove fpu->initialized
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (5 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 06/27] x86/fpu: Don't save fxregs for ia32 frames " Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:50   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 08/27] x86/fpu: Remove user_fpu_begin() Sebastian Andrzej Siewior
                   ` (22 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

The `initialized' member of the fpu struct is always set to one for user
tasks and zero for kernel tasks. This avoids saving/restoring the FPU
registers for kernel threads.

The ->initialized = 0 case for user tasks has been removed in previous
changes, for instance by always doing an explicit init at fork() time
on FPU-less systems, which was previously delayed until the first
emulated opcode.

The context switch code (switch_fpu_prepare() + switch_fpu_finish())
can't unconditionally save/restore registers for kernel threads. Not
only would it slow down the switch, it would also load a zeroed
xcomp_bv for XSAVES.

For kernel_fpu_begin() (+end) the situation is similar: EFI with
runtime services uses this before alternatives_patched is true, which
means the function is now used earlier than it used to be.

For those two cases current->mm is used to distinguish between user and
kernel threads. For kernel_fpu_begin() the save/restore of the FPU
registers is skipped for kernel threads.
During a context switch into a kernel thread we don't do anything:
there is no reason to save the FPU state of a kernel thread.
The reordering in __switch_to() is important because the current()
pointer needs to be valid before switch_fpu_finish() is invoked, so
that ->mm of the new task is seen instead of the old one's.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/ia32/ia32_signal.c         | 17 +++----
 arch/x86/include/asm/fpu/internal.h | 18 ++++----
 arch/x86/include/asm/fpu/types.h    |  9 ----
 arch/x86/include/asm/trace/fpu.h    |  5 +--
 arch/x86/kernel/fpu/core.c          | 70 +++++++++--------------------
 arch/x86/kernel/fpu/init.c          |  2 -
 arch/x86/kernel/fpu/regset.c        | 19 ++------
 arch/x86/kernel/fpu/xstate.c        |  2 -
 arch/x86/kernel/process_32.c        |  4 +-
 arch/x86/kernel/process_64.c        |  4 +-
 arch/x86/kernel/signal.c            | 17 +++----
 arch/x86/mm/pkeys.c                 |  7 +--
 12 files changed, 53 insertions(+), 121 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 321fe5f5d0e96..6eeb3249f22ff 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -216,8 +216,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 				 size_t frame_size,
 				 void __user **fpstate)
 {
-	struct fpu *fpu = &current->thread.fpu;
-	unsigned long sp;
+	unsigned long sp, fx_aligned, math_size;
 
 	/* Default to using normal stack */
 	sp = regs->sp;
@@ -231,15 +230,11 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 		 ksig->ka.sa.sa_restorer)
 		sp = (unsigned long) ksig->ka.sa.sa_restorer;
 
-	if (fpu->initialized) {
-		unsigned long fx_aligned, math_size;
-
-		sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size);
-		*fpstate = (struct _fpstate_32 __user *) sp;
-		if (copy_fpstate_to_sigframe(*fpstate, (void __user *)fx_aligned,
-				    math_size) < 0)
-			return (void __user *) -1L;
-	}
+	sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size);
+	*fpstate = (struct _fpstate_32 __user *) sp;
+	if (copy_fpstate_to_sigframe(*fpstate, (void __user *)fx_aligned,
+				     math_size) < 0)
+		return (void __user *) -1L;
 
 	sp -= frame_size;
 	/* Align the stack pointer according to the i386 ABI,
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 70ecb7c032cb4..fd95f1411eb5c 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -494,11 +494,14 @@ static inline void fpregs_activate(struct fpu *fpu)
  *
  *  - switch_fpu_finish() restores the new state as
  *    necessary.
+ *
+ * The FPU context is only stored/restore for user task and ->mm is used to
+ * distinguish between kernel and user threads.
  */
 static inline void
 switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
-	if (static_cpu_has(X86_FEATURE_FPU) && old_fpu->initialized) {
+	if (static_cpu_has(X86_FEATURE_FPU) && current->mm) {
 		if (!copy_fpregs_to_fpstate(old_fpu))
 			old_fpu->last_cpu = -1;
 		else
@@ -506,8 +509,7 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 
 		/* But leave fpu_fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
-	} else
-		old_fpu->last_cpu = -1;
+	}
 }
 
 /*
@@ -520,12 +522,12 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 {
-	bool preload = static_cpu_has(X86_FEATURE_FPU) &&
-		       new_fpu->initialized;
+	if (static_cpu_has(X86_FEATURE_FPU)) {
+		if (!fpregs_state_valid(new_fpu, cpu)) {
+			if (current->mm)
+				copy_kernel_to_fpregs(&new_fpu->state);
+		}
 
-	if (preload) {
-		if (!fpregs_state_valid(new_fpu, cpu))
-			copy_kernel_to_fpregs(&new_fpu->state);
 		fpregs_activate(new_fpu);
 	}
 }
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 2e32e178e0645..f098f6cab94bf 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -293,15 +293,6 @@ struct fpu {
 	 */
 	unsigned int			last_cpu;
 
-	/*
-	 * @initialized:
-	 *
-	 * This flag indicates whether this context is initialized: if the task
-	 * is not running then we can restore from this context, if the task
-	 * is running then we should save into this context.
-	 */
-	unsigned char			initialized;
-
 	/*
 	 * @avx512_timestamp:
 	 *
diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index 069c04be15076..bd65f6ba950f8 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -13,22 +13,19 @@ DECLARE_EVENT_CLASS(x86_fpu,
 
 	TP_STRUCT__entry(
 		__field(struct fpu *, fpu)
-		__field(bool, initialized)
 		__field(u64, xfeatures)
 		__field(u64, xcomp_bv)
 		),
 
 	TP_fast_assign(
 		__entry->fpu		= fpu;
-		__entry->initialized	= fpu->initialized;
 		if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
 			__entry->xfeatures = fpu->state.xsave.header.xfeatures;
 			__entry->xcomp_bv  = fpu->state.xsave.header.xcomp_bv;
 		}
 	),
-	TP_printk("x86/fpu: %p initialized: %d xfeatures: %llx xcomp_bv: %llx",
+	TP_printk("x86/fpu: %p xfeatures: %llx xcomp_bv: %llx",
 			__entry->fpu,
-			__entry->initialized,
 			__entry->xfeatures,
 			__entry->xcomp_bv
 	)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index e43296854e379..97e27de2b7c05 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -101,7 +101,7 @@ static void __kernel_fpu_begin(void)
 
 	kernel_fpu_disable();
 
-	if (fpu->initialized) {
+	if (current->mm) {
 		/*
 		 * Ignore return value -- we don't care if reg state
 		 * is clobbered.
@@ -116,7 +116,7 @@ static void __kernel_fpu_end(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
-	if (fpu->initialized)
+	if (current->mm)
 		copy_kernel_to_fpregs(&fpu->state);
 
 	kernel_fpu_enable();
@@ -147,11 +147,10 @@ void fpu__save(struct fpu *fpu)
 
 	preempt_disable();
 	trace_x86_fpu_before_save(fpu);
-	if (fpu->initialized) {
-		if (!copy_fpregs_to_fpstate(fpu)) {
-			copy_kernel_to_fpregs(&fpu->state);
-		}
-	}
+
+	if (!copy_fpregs_to_fpstate(fpu))
+		copy_kernel_to_fpregs(&fpu->state);
+
 	trace_x86_fpu_after_save(fpu);
 	preempt_enable();
 }
@@ -190,7 +189,7 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
 	dst_fpu->last_cpu = -1;
 
-	if (!src_fpu->initialized || !static_cpu_has(X86_FEATURE_FPU))
+	if (!static_cpu_has(X86_FEATURE_FPU))
 		return 0;
 
 	WARN_ON_FPU(src_fpu != &current->thread.fpu);
@@ -227,14 +226,10 @@ static void fpu__initialize(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
-	if (!fpu->initialized) {
-		fpstate_init(&fpu->state);
-		trace_x86_fpu_init_state(fpu);
+	fpstate_init(&fpu->state);
+	trace_x86_fpu_init_state(fpu);
 
-		trace_x86_fpu_activate_state(fpu);
-		/* Safe to do for the current task: */
-		fpu->initialized = 1;
-	}
+	trace_x86_fpu_activate_state(fpu);
 }
 
 /*
@@ -247,32 +242,20 @@ static void fpu__initialize(struct fpu *fpu)
  *
  * - or it's called for stopped tasks (ptrace), in which case the
  *   registers were already saved by the context-switch code when
- *   the task scheduled out - we only have to initialize the registers
- *   if they've never been initialized.
+ *   the task scheduled out.
  *
  * If the task has used the FPU before then save it.
  */
 void fpu__prepare_read(struct fpu *fpu)
 {
-	if (fpu == &current->thread.fpu) {
+	if (fpu == &current->thread.fpu)
 		fpu__save(fpu);
-	} else {
-		if (!fpu->initialized) {
-			fpstate_init(&fpu->state);
-			trace_x86_fpu_init_state(fpu);
-
-			trace_x86_fpu_activate_state(fpu);
-			/* Safe to do for current and for stopped child tasks: */
-			fpu->initialized = 1;
-		}
-	}
 }
 
 /*
  * This function must be called before we write a task's fpstate.
  *
- * If the task has used the FPU before then invalidate any cached FPU registers.
- * If the task has not used the FPU before then initialize its fpstate.
+ * Invalidate any cached FPU registers.
  *
  * After this function call, after registers in the fpstate are
  * modified and the child task has woken up, the child task will
@@ -289,17 +272,8 @@ void fpu__prepare_write(struct fpu *fpu)
 	 */
 	WARN_ON_FPU(fpu == &current->thread.fpu);
 
-	if (fpu->initialized) {
-		/* Invalidate any cached state: */
-		__fpu_invalidate_fpregs_state(fpu);
-	} else {
-		fpstate_init(&fpu->state);
-		trace_x86_fpu_init_state(fpu);
-
-		trace_x86_fpu_activate_state(fpu);
-		/* Safe to do for stopped child tasks: */
-		fpu->initialized = 1;
-	}
+	/* Invalidate any cached state: */
+	__fpu_invalidate_fpregs_state(fpu);
 }
 
 /*
@@ -316,17 +290,13 @@ void fpu__drop(struct fpu *fpu)
 	preempt_disable();
 
 	if (fpu == &current->thread.fpu) {
-		if (fpu->initialized) {
-			/* Ignore delayed exceptions from user space */
-			asm volatile("1: fwait\n"
-				     "2:\n"
-				     _ASM_EXTABLE(1b, 2b));
-			fpregs_deactivate(fpu);
-		}
+		/* Ignore delayed exceptions from user space */
+		asm volatile("1: fwait\n"
+			     "2:\n"
+			     _ASM_EXTABLE(1b, 2b));
+		fpregs_deactivate(fpu);
 	}
 
-	fpu->initialized = 0;
-
 	trace_x86_fpu_dropped(fpu);
 
 	preempt_enable();
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 6abd83572b016..20d8fa7124c77 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -239,8 +239,6 @@ static void __init fpu__init_system_ctx_switch(void)
 
 	WARN_ON_FPU(!on_boot_cpu);
 	on_boot_cpu = 0;
-
-	WARN_ON_FPU(current->thread.fpu.initialized);
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 5dbc099178a88..d652b939ccfb5 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -15,16 +15,12 @@
  */
 int regset_fpregs_active(struct task_struct *target, const struct user_regset *regset)
 {
-	struct fpu *target_fpu = &target->thread.fpu;
-
-	return target_fpu->initialized ? regset->n : 0;
+	return regset->n;
 }
 
 int regset_xregset_fpregs_active(struct task_struct *target, const struct user_regset *regset)
 {
-	struct fpu *target_fpu = &target->thread.fpu;
-
-	if (boot_cpu_has(X86_FEATURE_FXSR) && target_fpu->initialized)
+	if (boot_cpu_has(X86_FEATURE_FXSR))
 		return regset->n;
 	else
 		return 0;
@@ -370,16 +366,9 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 int dump_fpu(struct pt_regs *regs, struct user_i387_struct *ufpu)
 {
 	struct task_struct *tsk = current;
-	struct fpu *fpu = &tsk->thread.fpu;
-	int fpvalid;
 
-	fpvalid = fpu->initialized;
-	if (fpvalid)
-		fpvalid = !fpregs_get(tsk, NULL,
-				      0, sizeof(struct user_i387_ia32_struct),
-				      ufpu, NULL);
-
-	return fpvalid;
+	return !fpregs_get(tsk, NULL, 0, sizeof(struct user_i387_ia32_struct),
+			   ufpu, NULL);
 }
 EXPORT_SYMBOL(dump_fpu);
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index d7432c2b10514..8bfcc5b9e04b7 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -892,8 +892,6 @@ const void *get_xsave_field_ptr(int xsave_state)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
-	if (!fpu->initialized)
-		return NULL;
 	/*
 	 * fpu__save() takes the CPU's xstate registers
 	 * and saves them off to the 'fpu memory buffer.
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 7888a41a03cdb..77d9eb43ccac8 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -288,10 +288,10 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	if (prev->gs | next->gs)
 		lazy_load_gs(next->gs);
 
-	switch_fpu_finish(next_fpu, cpu);
-
 	this_cpu_write(current_task, next_p);
 
+	switch_fpu_finish(next_fpu, cpu);
+
 	/* Load the Intel cache allocation PQR MSR. */
 	resctrl_sched_in();
 
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e1983b3a16c43..ffea7c557963a 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -566,14 +566,14 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	x86_fsgsbase_load(prev, next);
 
-	switch_fpu_finish(next_fpu, cpu);
-
 	/*
 	 * Switch the PDA and FPU contexts.
 	 */
 	this_cpu_write(current_task, next_p);
 	this_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
 
+	switch_fpu_finish(next_fpu, cpu);
+
 	/* Reload sp0. */
 	update_task_stack(next_p);
 
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 08dfd4c1a4f95..6f45f795690f6 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -246,7 +246,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	unsigned long sp = regs->sp;
 	unsigned long buf_fx = 0;
 	int onsigstack = on_sig_stack(sp);
-	struct fpu *fpu = &current->thread.fpu;
+	int ret;
 
 	/* redzone */
 	if (IS_ENABLED(CONFIG_X86_64))
@@ -265,11 +265,9 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 		sp = (unsigned long) ka->sa.sa_restorer;
 	}
 
-	if (fpu->initialized) {
-		sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
-					  &buf_fx, &math_size);
-		*fpstate = (void __user *)sp;
-	}
+	sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
+				  &buf_fx, &math_size);
+	*fpstate = (void __user *)sp;
 
 	sp = align_sigframe(sp - frame_size);
 
@@ -281,8 +279,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 		return (void __user *)-1L;
 
 	/* save i387 and extended state */
-	if (fpu->initialized &&
-	    copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size) < 0)
+	ret = copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size);
+	if (ret < 0)
 		return (void __user *)-1L;
 
 	return (void __user *)sp;
@@ -763,8 +761,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 		/*
 		 * Ensure the signal handler starts with the new fpu state.
 		 */
-		if (fpu->initialized)
-			fpu__clear(fpu);
+		fpu__clear(fpu);
 	}
 	signal_setup_done(failed, ksig, stepping);
 }
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 047a77f6a10cb..05bb9a44eb1c3 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -39,17 +39,12 @@ int __execute_only_pkey(struct mm_struct *mm)
 	 * dance to set PKRU if we do not need to.  Check it
 	 * first and assume that if the execute-only pkey is
 	 * write-disabled that we do not have to set it
-	 * ourselves.  We need preempt off so that nobody
-	 * can make fpregs inactive.
+	 * ourselves.
 	 */
-	preempt_disable();
 	if (!need_to_set_mm_pkey &&
-	    current->thread.fpu.initialized &&
 	    !__pkru_allows_read(read_pkru(), execute_only_pkey)) {
-		preempt_enable();
 		return execute_only_pkey;
 	}
-	preempt_enable();
 
 	/*
 	 * Set up PKRU so that it denies access for everything
-- 
2.20.1


* [PATCH 08/27] x86/fpu: Remove user_fpu_begin()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (6 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 07/27] x86/fpu: Remove fpu->initialized Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:51   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 09/27] x86/fpu: Add (__)make_fpregs_active helpers Sebastian Andrzej Siewior
                   ` (21 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior, Borislav Petkov

user_fpu_begin() sets fpu_fpregs_owner_ctx to the task's fpu struct.
This is already always the case since there is no lazy FPU anymore.

fpu_fpregs_owner_ctx is used during a context switch to decide whether
the saved registers need to be loaded or whether the currently loaded
registers are still valid. The load can be skipped during
	taskA -> kernel thread -> taskA

because the switch to the kernel thread does not alter the CPU's FPU
state (see the validity-check sketch below).

Since this field is always updated during a context switch and never
invalidated, setting it manually (in user context) makes no difference.
A kernel thread within a kernel_fpu_begin() block could set
fpu_fpregs_owner_ctx to NULL, but a kernel thread does not use
user_fpu_begin(). This is a leftover from the lazy-FPU time.

Remove user_fpu_begin(); it does not change fpu_fpregs_owner_ctx's
content.
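
For reference, a hedged sketch of that validity check (roughly what the
existing fpregs_state_valid() helper does):

	static inline int fpregs_state_valid(struct fpu *fpu,
					     unsigned int cpu)
	{
		/*
		 * The registers are still valid if this fpu context was
		 * the last one loaded on this CPU and nothing else has
		 * been loaded since then.
		 */
		return fpu == this_cpu_read(fpu_fpregs_owner_ctx) &&
		       cpu == fpu->last_cpu;
	}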

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/fpu/internal.h | 17 -----------------
 arch/x86/kernel/fpu/core.c          |  4 +---
 arch/x86/kernel/fpu/signal.c        |  1 -
 3 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index fd95f1411eb5c..df98bc7f1c8d8 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -532,23 +532,6 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 	}
 }
 
-/*
- * Needs to be preemption-safe.
- *
- * NOTE! user_fpu_begin() must be used only immediately before restoring
- * the save state. It does not do any saving/restoring on its own. In
- * lazy FPU mode, it is just an optimization to avoid a #NM exception,
- * the task can lose the FPU right after preempt_enable().
- */
-static inline void user_fpu_begin(void)
-{
-	struct fpu *fpu = &current->thread.fpu;
-
-	preempt_disable();
-	fpregs_activate(fpu);
-	preempt_enable();
-}
-
 /*
  * MXCSR and XCR definitions:
  */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 97e27de2b7c05..739ca3ae2bdcd 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -335,10 +335,8 @@ void fpu__clear(struct fpu *fpu)
 	 * Make sure fpstate is cleared and initialized.
 	 */
 	fpu__initialize(fpu);
-	if (static_cpu_has(X86_FEATURE_FPU)) {
-		user_fpu_begin();
+	if (static_cpu_has(X86_FEATURE_FPU))
 		copy_init_fpstate_to_fpregs();
-	}
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 2f044021fde2b..6475320939ce3 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -322,7 +322,6 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		 * For 64-bit frames and 32-bit fsave frames, restore the user
 		 * state to the registers directly (with exceptions handled).
 		 */
-		user_fpu_begin();
 		if (copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only)) {
 			fpu__clear(fpu);
 			return -1;
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 09/27] x86/fpu: Add (__)make_fpregs_active helpers
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (7 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 08/27] x86/fpu: Remove user_fpu_begin() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:52   ` [tip:x86/fpu] x86/fpu: Add an __fpregs_load_activate() internal helper tip-bot for Rik van Riel
  2019-04-03 16:41 ` [PATCH 10/27] x86/fpu: Make __raw_xsave_addr() use feature number instead of mask Sebastian Andrzej Siewior
                   ` (20 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

From: Rik van Riel <riel@surriel.com>

Add a helper function that ensures the floating-point registers for
the current task are active. It must be used with preemption disabled,
as in the sketch below.
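
A minimal usage sketch (hypothetical call site; fpregs_lock() and
fpregs_unlock() are added below as wrappers around preempt_disable() and
preempt_enable()):

	fpregs_lock();
	__fpregs_load_activate(&current->thread.fpu, smp_processor_id());
	/* the FPU registers now hold current's state until fpregs_unlock() */
	fpregs_unlock();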

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/fpu/api.h      | 11 +++++++++++
 arch/x86/include/asm/fpu/internal.h | 19 +++++++++++--------
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index b56d504af6545..73e684160f354 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -10,6 +10,7 @@
 
 #ifndef _ASM_X86_FPU_API_H
 #define _ASM_X86_FPU_API_H
+#include <linux/preempt.h>
 
 /*
  * Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
@@ -22,6 +23,16 @@ extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
 extern bool irq_fpu_usable(void);
 
+static inline void fpregs_lock(void)
+{
+	preempt_disable();
+}
+
+static inline void fpregs_unlock(void)
+{
+	preempt_enable();
+}
+
 /*
  * Query the presence of one or more xfeatures. Works on any legacy CPU as well.
  *
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index df98bc7f1c8d8..237ff8dc8559b 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -484,6 +484,15 @@ static inline void fpregs_activate(struct fpu *fpu)
 	trace_x86_fpu_regs_activated(fpu);
 }
 
+static inline void __fpregs_load_activate(struct fpu *fpu, int cpu)
+{
+	if (!fpregs_state_valid(fpu, cpu)) {
+		if (current->mm)
+			copy_kernel_to_fpregs(&fpu->state);
+		fpregs_activate(fpu);
+	}
+}
+
 /*
  * FPU state switching for scheduling.
  *
@@ -522,14 +531,8 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 {
-	if (static_cpu_has(X86_FEATURE_FPU)) {
-		if (!fpregs_state_valid(new_fpu, cpu)) {
-			if (current->mm)
-				copy_kernel_to_fpregs(&new_fpu->state);
-		}
-
-		fpregs_activate(new_fpu);
-	}
+	if (static_cpu_has(X86_FEATURE_FPU))
+		__fpregs_load_activate(new_fpu, cpu);
 }
 
 /*
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 10/27] x86/fpu: Make __raw_xsave_addr() use feature number instead of mask
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (8 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 09/27] x86/fpu: Add (__)make_fpregs_active helpers Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:52   ` [tip:x86/fpu] x86/fpu: Make __raw_xsave_addr() use a " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 11/27] x86/fpu: Make get_xsave_field_ptr() and get_xsave_addr() use " Sebastian Andrzej Siewior
                   ` (19 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior, Borislav Petkov

Most users of __raw_xsave_addr() start with a feature number, shift it
into a mask, and __raw_xsave_addr() then converts the mask back into the
feature number.

Make __raw_xsave_addr() take the feature number as its argument.
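
The change in calling convention, distilled (illustrative):

	/* before: the caller builds a mask, __raw_xsave_addr() converts it back */
	void *src = __raw_xsave_addr(xsave, 1 << i);

	/* after: the feature number is passed directly */
	void *src = __raw_xsave_addr(xsave, i);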

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/xstate.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 8bfcc5b9e04b7..4f7f3c5d0d0cf 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -805,20 +805,18 @@ void fpu__resume_cpu(void)
 }
 
 /*
- * Given an xstate feature mask, calculate where in the xsave
+ * Given an xstate feature nr, calculate where in the xsave
  * buffer the state is.  Callers should ensure that the buffer
  * is valid.
  */
-static void *__raw_xsave_addr(struct xregs_state *xsave, int xstate_feature_mask)
+static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
 {
-	int feature_nr = fls64(xstate_feature_mask) - 1;
-
-	if (!xfeature_enabled(feature_nr)) {
+	if (!xfeature_enabled(xfeature_nr)) {
 		WARN_ON_FPU(1);
 		return NULL;
 	}
 
-	return (void *)xsave + xstate_comp_offsets[feature_nr];
+	return (void *)xsave + xstate_comp_offsets[xfeature_nr];
 }
 /*
  * Given the xsave area and a state inside, this function returns the
@@ -840,6 +838,7 @@ static void *__raw_xsave_addr(struct xregs_state *xsave, int xstate_feature_mask
  */
 void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 {
+	int xfeature_nr;
 	/*
 	 * Do we even *have* xsave state?
 	 */
@@ -867,7 +866,8 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 	if (!(xsave->header.xfeatures & xstate_feature))
 		return NULL;
 
-	return __raw_xsave_addr(xsave, xstate_feature);
+	xfeature_nr = fls64(xstate_feature) - 1;
+	return __raw_xsave_addr(xsave, xfeature_nr);
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
 
@@ -1014,7 +1014,7 @@ int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int of
 		 * Copy only in-use xstates:
 		 */
 		if ((header.xfeatures >> i) & 1) {
-			void *src = __raw_xsave_addr(xsave, 1 << i);
+			void *src = __raw_xsave_addr(xsave, i);
 
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
@@ -1100,7 +1100,7 @@ int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned i
 		 * Copy only in-use xstates:
 		 */
 		if ((header.xfeatures >> i) & 1) {
-			void *src = __raw_xsave_addr(xsave, 1 << i);
+			void *src = __raw_xsave_addr(xsave, i);
 
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
@@ -1157,7 +1157,7 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 		u64 mask = ((u64)1 << i);
 
 		if (hdr.xfeatures & mask) {
-			void *dst = __raw_xsave_addr(xsave, 1 << i);
+			void *dst = __raw_xsave_addr(xsave, i);
 
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
@@ -1211,7 +1211,7 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 		u64 mask = ((u64)1 << i);
 
 		if (hdr.xfeatures & mask) {
-			void *dst = __raw_xsave_addr(xsave, 1 << i);
+			void *dst = __raw_xsave_addr(xsave, i);
 
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 11/27] x86/fpu: Make get_xsave_field_ptr() and get_xsave_addr() use feature number instead of mask
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (9 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 10/27] x86/fpu: Make __raw_xsave_addr() use feature number instead of mask Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:53   ` [tip:x86/fpu] x86/fpu: Use a feature number instead of mask in two more helpers tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions Sebastian Andrzej Siewior
                   ` (18 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

After changing the argument of __raw_xsave_addr() from a mask to a
feature number, Dave suggested checking whether it makes sense to do the
same for get_xsave_addr(). As it turns out, it does: get_xsave_addr()
needs the mask only to check whether the requested feature is part of
what is supported/saved, and then uses the number again. The shift
operation is cheaper than "find last bit set" (both directions are
sketched below). Also, the feature number uses less opcode space than
the mask :)

Make get_xsave_addr()'s argument an xfeature number instead of a mask
and fix up its callers.
As part of this, use xfeature_nr and xfeature_mask consistently.
This results in the following renames in the KVM code:
	feature -> xfeature_mask
	index -> xfeature_nr
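
Both directions of the conversion, for reference (a sketch using fls64()
and BIT_ULL() as in the diff):

	int xfeature_nr   = fls64(xfeature_mask) - 1; /* mask -> number */
	u64 xfeature_mask = BIT_ULL(xfeature_nr);     /* number -> mask, a single shift */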

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/fpu/xstate.h |  4 ++--
 arch/x86/kernel/fpu/xstate.c      | 22 ++++++++++------------
 arch/x86/kernel/traps.c           |  2 +-
 arch/x86/kvm/x86.c                | 28 ++++++++++++++--------------
 arch/x86/mm/mpx.c                 |  6 +++---
 5 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 48581988d78c7..fbe41f808e5d8 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -46,8 +46,8 @@ extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
 void fpu__xstate_clear_all_cpu_caps(void);
-void *get_xsave_addr(struct xregs_state *xsave, int xstate);
-const void *get_xsave_field_ptr(int xstate_field);
+void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
+const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
 int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 4f7f3c5d0d0cf..9c459fd1d38e6 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -830,15 +830,14 @@ static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
  *
  * Inputs:
  *	xstate: the thread's storage area for all FPU data
- *	xstate_feature: state which is defined in xsave.h (e.g.
- *	XFEATURE_MASK_FP, XFEATURE_MASK_SSE, etc...)
+ *	xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
+ *	XFEATURE_SSE, etc...)
  * Output:
  *	address of the state in the xsave area, or NULL if the
  *	field is not present in the xsave buffer.
  */
-void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
+void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
 {
-	int xfeature_nr;
 	/*
 	 * Do we even *have* xsave state?
 	 */
@@ -850,11 +849,11 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 	 * have not enabled.  Remember that pcntxt_mask is
 	 * what we write to the XCR0 register.
 	 */
-	WARN_ONCE(!(xfeatures_mask & xstate_feature),
+	WARN_ONCE(!(xfeatures_mask & BIT_ULL(xfeature_nr)),
 		  "get of unsupported state");
 	/*
 	 * This assumes the last 'xsave*' instruction to
-	 * have requested that 'xstate_feature' be saved.
+	 * have requested that 'xfeature_nr' be saved.
 	 * If it did not, we might be seeing and old value
 	 * of the field in the buffer.
 	 *
@@ -863,10 +862,9 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 	 * or because the "init optimization" caused it
 	 * to not be saved.
 	 */
-	if (!(xsave->header.xfeatures & xstate_feature))
+	if (!(xsave->header.xfeatures & BIT_ULL(xfeature_nr)))
 		return NULL;
 
-	xfeature_nr = fls64(xstate_feature) - 1;
 	return __raw_xsave_addr(xsave, xfeature_nr);
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
@@ -882,13 +880,13 @@ EXPORT_SYMBOL_GPL(get_xsave_addr);
  * Note that this only works on the current task.
  *
  * Inputs:
- *	@xsave_state: state which is defined in xsave.h (e.g. XFEATURE_MASK_FP,
- *	XFEATURE_MASK_SSE, etc...)
+ *	@xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
+ *	XFEATURE_SSE, etc...)
  * Output:
  *	address of the state in the xsave area or NULL if the state
  *	is not present or is in its 'init state'.
  */
-const void *get_xsave_field_ptr(int xsave_state)
+const void *get_xsave_field_ptr(int xfeature_nr)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
@@ -898,7 +896,7 @@ const void *get_xsave_field_ptr(int xsave_state)
 	 */
 	fpu__save(fpu);
 
-	return get_xsave_addr(&fpu->state.xsave, xsave_state);
+	return get_xsave_addr(&fpu->state.xsave, xfeature_nr);
 }
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index d26f9e9c3d830..8b6d03e55d2f7 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -456,7 +456,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 	 * which is all zeros which indicates MPX was not
 	 * responsible for the exception.
 	 */
-	bndcsr = get_xsave_field_ptr(XFEATURE_MASK_BNDCSR);
+	bndcsr = get_xsave_field_ptr(XFEATURE_BNDCSR);
 	if (!bndcsr)
 		goto exit_trap;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 099b851dabafd..8022e7769b3a1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3674,15 +3674,15 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
-		u64 feature = valid & -valid;
-		int index = fls64(feature) - 1;
-		void *src = get_xsave_addr(xsave, feature);
+		u64 xfeature_mask = valid & -valid;
+		int xfeature_nr = fls64(xfeature_mask) - 1;
+		void *src = get_xsave_addr(xsave, xfeature_nr);
 
 		if (src) {
 			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, index,
+			cpuid_count(XSTATE_CPUID, xfeature_nr,
 				    &size, &offset, &ecx, &edx);
-			if (feature == XFEATURE_MASK_PKRU)
+			if (xfeature_nr == XFEATURE_PKRU)
 				memcpy(dest + offset, &vcpu->arch.pkru,
 				       sizeof(vcpu->arch.pkru));
 			else
@@ -3690,7 +3690,7 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
 
 		}
 
-		valid -= feature;
+		valid -= xfeature_mask;
 	}
 }
 
@@ -3717,22 +3717,22 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
-		u64 feature = valid & -valid;
-		int index = fls64(feature) - 1;
-		void *dest = get_xsave_addr(xsave, feature);
+		u64 xfeature_mask = valid & -valid;
+		int xfeature_nr = fls64(xfeature_mask) - 1;
+		void *dest = get_xsave_addr(xsave, xfeature_nr);
 
 		if (dest) {
 			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, index,
+			cpuid_count(XSTATE_CPUID, xfeature_nr,
 				    &size, &offset, &ecx, &edx);
-			if (feature == XFEATURE_MASK_PKRU)
+			if (xfeature_nr == XFEATURE_PKRU)
 				memcpy(&vcpu->arch.pkru, src + offset,
 				       sizeof(vcpu->arch.pkru));
 			else
 				memcpy(dest, src + offset, size);
 		}
 
-		valid -= feature;
+		valid -= xfeature_mask;
 	}
 }
 
@@ -8850,11 +8850,11 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		if (init_event)
 			kvm_put_guest_fpu(vcpu);
 		mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu->state.xsave,
-					XFEATURE_MASK_BNDREGS);
+					XFEATURE_BNDREGS);
 		if (mpx_state_buffer)
 			memset(mpx_state_buffer, 0, sizeof(struct mpx_bndreg_state));
 		mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu->state.xsave,
-					XFEATURE_MASK_BNDCSR);
+					XFEATURE_BNDCSR);
 		if (mpx_state_buffer)
 			memset(mpx_state_buffer, 0, sizeof(struct mpx_bndcsr));
 		if (init_event)
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index c805db6236b47..59726aaf46713 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -142,7 +142,7 @@ int mpx_fault_info(struct mpx_fault_info *info, struct pt_regs *regs)
 		goto err_out;
 	}
 	/* get bndregs field from current task's xsave area */
-	bndregs = get_xsave_field_ptr(XFEATURE_MASK_BNDREGS);
+	bndregs = get_xsave_field_ptr(XFEATURE_BNDREGS);
 	if (!bndregs) {
 		err = -EINVAL;
 		goto err_out;
@@ -190,7 +190,7 @@ static __user void *mpx_get_bounds_dir(void)
 	 * The bounds directory pointer is stored in a register
 	 * only accessible if we first do an xsave.
 	 */
-	bndcsr = get_xsave_field_ptr(XFEATURE_MASK_BNDCSR);
+	bndcsr = get_xsave_field_ptr(XFEATURE_BNDCSR);
 	if (!bndcsr)
 		return MPX_INVALID_BOUNDS_DIR;
 
@@ -376,7 +376,7 @@ static int do_mpx_bt_fault(void)
 	const struct mpx_bndcsr *bndcsr;
 	struct mm_struct *mm = current->mm;
 
-	bndcsr = get_xsave_field_ptr(XFEATURE_MASK_BNDCSR);
+	bndcsr = get_xsave_field_ptr(XFEATURE_BNDCSR);
 	if (!bndcsr)
 		return -EINVAL;
 	/*
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (10 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 11/27] x86/fpu: Make get_xsave_field_ptr() and get_xsave_addr() use " Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-10 16:36   ` Borislav Petkov
  2019-04-13 20:54   ` [tip:x86/fpu] x86/pkeys: Provide *pkru() helpers tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 13/27] x86/fpu: Only write PKRU if it is different from current Sebastian Andrzej Siewior
                   ` (17 subsequent siblings)
  29 siblings, 2 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior, Dave Hansen

Dave Hansen has asked for __read_pkru() and __write_pkru() to be
symmetrical.
As part of the series, __write_pkru() will read back the value and only
write it if it is different.
In order to make both functions symmetrical, move the functions
containing only the opcode into functions with an _ins() suffix.
__write_pkru() will just invoke __write_pkru_ins(), but a follow-up
patch will also make it read back the value.
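
Where this is headed once the follow-up lands (a sketch of the combined
result; see the next patch):

	static inline void __write_pkru(u32 pkru)
	{
		/* skip the more expensive WRPKRU if the value is unchanged */
		if (pkru == __read_pkru_ins())
			return;
		__write_pkru_ins(pkru);
	}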

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/pgtable.h       |  2 +-
 arch/x86/include/asm/special_insns.h | 12 +++++++++---
 arch/x86/kvm/vmx/vmx.c               |  2 +-
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2779ace16d23f..64333e5222cd9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -127,7 +127,7 @@ static inline int pte_dirty(pte_t pte)
 static inline u32 read_pkru(void)
 {
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		return __read_pkru();
+		return __read_pkru_ins();
 	return 0;
 }
 
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 43c029cdc3fe8..27328606ff687 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -92,7 +92,7 @@ static inline void native_write_cr8(unsigned long val)
 #endif
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-static inline u32 __read_pkru(void)
+static inline u32 __read_pkru_ins(void)
 {
 	u32 ecx = 0;
 	u32 edx, pkru;
@@ -107,7 +107,7 @@ static inline u32 __read_pkru(void)
 	return pkru;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void __write_pkru_ins(u32 pkru)
 {
 	u32 ecx = 0, edx = 0;
 
@@ -118,8 +118,14 @@ static inline void __write_pkru(u32 pkru)
 	asm volatile(".byte 0x0f,0x01,0xef\n\t"
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
+
+static inline void __write_pkru(u32 pkru)
+{
+	__write_pkru_ins(pkru);
+}
+
 #else
-static inline u32 __read_pkru(void)
+static inline u32 __read_pkru_ins(void)
 {
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ab432a930ae86..96dec81430970 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6501,7 +6501,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	if (static_cpu_has(X86_FEATURE_PKU) &&
 	    kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) {
-		vcpu->arch.pkru = __read_pkru();
+		vcpu->arch.pkru = __read_pkru_ins();
 		if (vcpu->arch.pkru != vmx->host_pkru)
 			__write_pkru(vmx->host_pkru);
 	}
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 13/27] x86/fpu: Only write PKRU if it is different from current
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (11 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:55   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 14/27] x86/pkeys: Don't check if PKRU is zero before writing it Sebastian Andrzej Siewior
                   ` (16 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

Dave Hansen says that `wrpkru' is more expensive than `rdpkru'. It has a
higher cycle cost and is also practically a (light) speculation barrier.

As an optimisation, read the current PKRU value and only write the new
one if it is different.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/special_insns.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 27328606ff687..28ffdf0c1add4 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -121,6 +121,12 @@ static inline void __write_pkru_ins(u32 pkru)
 
 static inline void __write_pkru(u32 pkru)
 {
+	/*
+	 * WRPKRU is relatively expensive compared to RDPKRU.
+	 * Avoid WRPKRU when it would not change the value.
+	 */
+	if (pkru == __read_pkru_ins())
+		return;
 	__write_pkru_ins(pkru);
 }
 
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 14/27] x86/pkeys: Don't check if PKRU is zero before writing it
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (12 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 13/27] x86/fpu: Only write PKRU if it is different from current Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:55   ` [tip:x86/fpu] x86/pkeys: Don't check if PKRU is zero before writing it tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 15/27] x86/fpu: Eager switch PKRU state Sebastian Andrzej Siewior
                   ` (15 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

write_pkru() checks whether the new value is the same as the current one
and skips the write in that case. This subsumes the more specific check
of whether both the current and the new value are zero (and skipping the
write in that case), so we can rely on it instead.

Remove the zero check of PKRU; write_pkru() provides a more general
check.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/mm/pkeys.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 05bb9a44eb1c3..50f65fc1b9a3f 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -142,13 +142,6 @@ u32 init_pkru_value = PKRU_AD_KEY( 1) | PKRU_AD_KEY( 2) | PKRU_AD_KEY( 3) |
 void copy_init_pkru_to_fpregs(void)
 {
 	u32 init_pkru_value_snapshot = READ_ONCE(init_pkru_value);
-	/*
-	 * Any write to PKRU takes it out of the XSAVE 'init
-	 * state' which increases context switch cost.  Avoid
-	 * writing 0 when PKRU was already 0.
-	 */
-	if (!init_pkru_value_snapshot && !read_pkru())
-		return;
 	/*
 	 * Override the PKRU state that came from 'init_fpstate'
 	 * with the baseline from the process.
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 15/27] x86/fpu: Eager switch PKRU state
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (13 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 14/27] x86/pkeys: Don't check if PKRU is zero before writing it Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:56   ` [tip:x86/fpu] " tip-bot for Rik van Riel
  2019-04-03 16:41 ` [PATCH 16/27] x86/entry: Add TIF_NEED_FPU_LOAD Sebastian Andrzej Siewior
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

From: Rik van Riel <riel@surriel.com>

While most of a task's FPU state is only needed in user space, the
protection keys need to be in place immediately after a context switch.

The reason is that any access to userspace memory while running in
kernel mode must also abide by the memory permissions specified in the
protection keys.

The "eager switch" is a preparation for loading the FPU state on return
to userland. Instead of decoupling the PKRU state from xstate, I update
PKRU within xstate on write operations by the kernel (see the sketch
below).

The read/write_pkru() helpers are moved to another header file so they
can be easily accessed from pgtable.h and fpu/internal.h.

For user tasks we always take the PKRU from the xsave area, and this
should not change anything because the PKRU value was loaded as part of
the FPU restore.
For kernel threads the default "init_pkru_value" is now written. Before
this commit a kernel thread would end up with a random value which it
inherited from the previous user task.
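
Keeping the xstate copy in sync on kernel-side writes is handled later
in the series in write_pkru(); roughly (a sketch of that change):

	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
	fpregs_lock();
	if (pk)
		pk->pkru = pkru;
	__write_pkru(pkru);
	fpregs_unlock();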

Signed-off-by: Rik van Riel <riel@surriel.com>
[bigeasy: save pkru to xstate, no cache, don't use __raw_xsave_addr()]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h | 24 ++++++++++++++++++++++--
 arch/x86/include/asm/fpu/xstate.h   |  1 +
 arch/x86/include/asm/pgtable.h      |  6 ++++++
 arch/x86/mm/pkeys.c                 |  1 -
 4 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 237ff8dc8559b..82ff84a4c4ab7 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -14,6 +14,7 @@
 #include <linux/compat.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 
 #include <asm/user.h>
 #include <asm/fpu/api.h>
@@ -531,8 +532,27 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 {
-	if (static_cpu_has(X86_FEATURE_FPU))
-		__fpregs_load_activate(new_fpu, cpu);
+	struct pkru_state *pk;
+	u32 pkru_val = init_pkru_value;
+
+	if (!static_cpu_has(X86_FEATURE_FPU))
+		return;
+
+	__fpregs_load_activate(new_fpu, cpu);
+
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	/*
+	 * PKRU state is switched eagerly because it needs to be valid before we
+	 * return to userland e.g. for a copy_to_user() operation.
+	 */
+	if (current->mm) {
+		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
+		if (pk)
+			pkru_val = pk->pkru;
+	}
+	__write_pkru(pkru_val);
 }
 
 /*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index fbe41f808e5d8..4e18a837223ff 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -5,6 +5,7 @@
 #include <linux/types.h>
 #include <asm/processor.h>
 #include <linux/uaccess.h>
+#include <asm/user.h>
 
 /* Bit 63 of XCR0 is reserved for future expansion */
 #define XFEATURE_MASK_EXTEND	(~(XFEATURE_MASK_FPSSE | (1ULL << 63)))
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 64333e5222cd9..1df90feeea5c6 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1355,6 +1355,12 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #define PKRU_WD_BIT 0x2
 #define PKRU_BITS_PER_PKEY 2
 
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+extern u32 init_pkru_value;
+#else
+#define init_pkru_value	0
+#endif
+
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
 {
 	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 50f65fc1b9a3f..2ecbf4155f98f 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -126,7 +126,6 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot, int pkey
  * in the process's lifetime will not accidentally get access
  * to data which is pkey-protected later on.
  */
-static
 u32 init_pkru_value = PKRU_AD_KEY( 1) | PKRU_AD_KEY( 2) | PKRU_AD_KEY( 3) |
 		      PKRU_AD_KEY( 4) | PKRU_AD_KEY( 5) | PKRU_AD_KEY( 6) |
 		      PKRU_AD_KEY( 7) | PKRU_AD_KEY( 8) | PKRU_AD_KEY( 9) |
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 16/27] x86/entry: Add TIF_NEED_FPU_LOAD
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (14 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 15/27] x86/fpu: Eager switch PKRU state Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:57   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 17/27] x86/fpu: Always store the registers in copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
                   ` (13 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

Add TIF_NEED_FPU_LOAD. This is reserved for loading the FPU registers
before returning to userland. This flag must not be set on systems
without an FPU.
If this flag is cleared, the CPU's FPU registers hold the current
content of current()'s FPU registers, and the in-memory copy (union
fpregs_state) is not valid.
If this flag is set, the CPU's FPU registers may hold random values
(except for PKRU), and the content of the FPU registers must be loaded
on return to userland.

The flag is introduced now so that the code handling it can be added
before the main feature is enabled (see the sketch below).
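
The resulting invariant, sketched as an illustrative check (not part of
the patch):

	if (test_thread_flag(TIF_NEED_FPU_LOAD)) {
		/* fpu->state is authoritative; the CPU registers are
		 * stale and must be loaded before returning to userland */
	} else {
		/* the CPU registers hold current()'s state; the in-memory
		 * copy (union fpregs_state) is not valid */
	}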

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h | 6 ++++++
 arch/x86/include/asm/thread_info.h  | 2 ++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 82ff84a4c4ab7..b12874b7cf0cf 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -507,6 +507,12 @@ static inline void __fpregs_load_activate(struct fpu *fpu, int cpu)
  *
  * The FPU context is only stored/restore for user task and ->mm is used to
  * distinguish between kernel and user threads.
+ *
+ * If TIF_NEED_FPU_LOAD is cleared then the CPU's FPU registers are saved in
+ * the current thread's FPU registers state.
+ * If TIF_NEED_FPU_LOAD is set then CPU's FPU registers may not hold current()'s
+ * FPU registers. It is required to load the registers before returning to
+ * userland or using the content otherwise.
  */
 static inline void
 switch_fpu_prepare(struct fpu *old_fpu, int cpu)
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index e0eccbcb8447d..f9453536f9bbc 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -88,6 +88,7 @@ struct thread_info {
 #define TIF_USER_RETURN_NOTIFY	11	/* notify kernel of userspace return */
 #define TIF_UPROBE		12	/* breakpointed or singlestepping */
 #define TIF_PATCH_PENDING	13	/* pending live patching update */
+#define TIF_NEED_FPU_LOAD	14	/* load FPU on return to userspace */
 #define TIF_NOCPUID		15	/* CPUID is not accessible in userland */
 #define TIF_NOTSC		16	/* TSC is not accessible in userland */
 #define TIF_IA32		17	/* IA32 compatibility process */
@@ -117,6 +118,7 @@ struct thread_info {
 #define _TIF_USER_RETURN_NOTIFY	(1 << TIF_USER_RETURN_NOTIFY)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
+#define _TIF_NEED_FPU_LOAD	(1 << TIF_NEED_FPU_LOAD)
 #define _TIF_NOCPUID		(1 << TIF_NOCPUID)
 #define _TIF_NOTSC		(1 << TIF_NOTSC)
 #define _TIF_IA32		(1 << TIF_IA32)
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 17/27] x86/fpu: Always store the registers in copy_fpstate_to_sigframe()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (15 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 16/27] x86/entry: Add TIF_NEED_FPU_LOAD Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:57   ` [tip:x86/fpu] " tip-bot for Rik van Riel
  2019-04-03 16:41 ` [PATCH 18/27] x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD Sebastian Andrzej Siewior
                   ` (12 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

From: Rik van Riel <riel@surriel.com>

copy_fpstate_to_sigframe() stores the registers directly to user space.
This is okay because the FPU registers are valid, and storing them
directly avoids saving them into kernel memory and making an extra copy.
However… we can't keep doing this once the FPU registers are restored on
return to userland. The FPU registers could be invalidated in the middle
of the save operation, so the save would have to be done with preemption
/ BH disabled.

Save the FPU registers to the task's FPU struct and copy them to the
user memory later on.

This code is extracted from an earlier version of the patchset, from
when x86 still had lazy-FPU.

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 6475320939ce3..e565881678a87 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -144,8 +144,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  *	buf == buf_fx for 64-bit frames and 32-bit fsave frame.
  *	buf != buf_fx for 32-bit frames with fxstate.
  *
- * Save the state directly to the user frame pointed by the aligned pointer
- * 'buf_fx'.
+ * Save the state to task's fpu->state and then copy it to the user frame
+ * pointed by the aligned pointer 'buf_fx'.
  *
  * If this is a 32-bit frame with fxstate, put a fsave header before
  * the aligned state at 'buf_fx'.
@@ -155,6 +155,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  */
 int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 {
+	struct fpu *fpu = &current->thread.fpu;
+	struct xregs_state *xsave = &fpu->state.xsave;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
 
@@ -169,9 +171,16 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	/* Save the live register state to the user directly. */
-	if (copy_fpregs_to_sigframe(buf_fx))
-		return -1;
+	copy_fpregs_to_fpstate(fpu);
+
+	if (using_compacted_format()) {
+		if (copy_xstate_to_user(buf_fx, xsave, 0, size))
+			return -1;
+	} else {
+		fpstate_sanitize_xstate(fpu);
+		if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
+			return -1;
+	}
 
 	/* Save the fsave header for the 32-bit frames. */
 	if ((ia32_fxstate || !use_fxsr()) && save_fsave_header(tsk, buf))
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 18/27] x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (16 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 17/27] x86/fpu: Always store the registers in copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 20:58   ` [tip:x86/fpu] " tip-bot for Rik van Riel
  2019-04-03 16:41 ` [PATCH 19/27] x86/fpu: Update xstate's PKRU value on write_pkru() Sebastian Andrzej Siewior
                   ` (11 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

From: Rik van Riel <riel@surriel.com>

The FPU registers need to be saved only if TIF_NEED_FPU_LOAD is not set.
Otherwise the save has already been done and can be skipped.

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index e565881678a87..c98c7bca13bc0 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -171,7 +171,17 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	copy_fpregs_to_fpstate(fpu);
+	fpregs_lock();
+	/*
+	 * If we do not need to load the FPU registers at return to userspace
+	 * then the CPU has the current state and we need to save it. Otherwise
+	 * it is already done and we can skip it.
+	 */
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+		copy_fpregs_to_fpstate(fpu);
+		set_thread_flag(TIF_NEED_FPU_LOAD);
+	}
+	fpregs_unlock();
 
 	if (using_compacted_format()) {
 		if (copy_xstate_to_user(buf_fx, xsave, 0, size))
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 19/27] x86/fpu: Update xstate's PKRU value on write_pkru()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (17 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 18/27] x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-08 18:14   ` Dave Hansen
  2019-04-13 20:59   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 20/27] x86/fpu: Inline copy_user_to_fpregs_zeroing() Sebastian Andrzej Siewior
                   ` (10 subsequent siblings)
  29 siblings, 2 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

During a context switch the xstate is loaded, which also includes the
PKRU value.
If the xstate is restored on return to userland, the PKRU value in the
xstate must be the same as the one in the CPU.

Save the PKRU value to the xstate when it is modified.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/pgtable.h | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1df90feeea5c6..58a3a68e1f114 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,6 +23,8 @@
 
 #ifndef __ASSEMBLY__
 #include <asm/x86_init.h>
+#include <asm/fpu/xstate.h>
+#include <asm/fpu/api.h>
 
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -133,8 +135,22 @@ static inline u32 read_pkru(void)
 
 static inline void write_pkru(u32 pkru)
 {
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		__write_pkru(pkru);
+	struct pkru_state *pk;
+
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return;
+
+	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
+	/*
+	 * The PKRU value in xstate needs to be in sync with the value that is
+	 * written to the CPU. The FPU restore on return to userland would
+	 * otherwise load the previous value again.
+	 */
+	fpregs_lock();
+	if (pk)
+		pk->pkru = pkru;
+	__write_pkru(pkru);
+	fpregs_unlock();
 }
 
 static inline int pte_young(pte_t pte)
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 20/27] x86/fpu: Inline copy_user_to_fpregs_zeroing()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (18 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 19/27] x86/fpu: Update xstate's PKRU value on write_pkru() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 21:00   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 21/27] x86/fpu: Let __fpu__restore_sig() restore the !32bit+fxsr frame from kernel memory Sebastian Andrzej Siewior
                   ` (9 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

Start refactoring __fpu__restore_sig() by inlining
copy_user_to_fpregs_zeroing(). The original function remains and will be
used to restore from userland memory if possible.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index c98c7bca13bc0..f733425c66d8d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -337,11 +337,29 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		kfree(tmp);
 		return err;
 	} else {
+		int ret;
+
 		/*
 		 * For 64-bit frames and 32-bit fsave frames, restore the user
 		 * state to the registers directly (with exceptions handled).
 		 */
-		if (copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only)) {
+		if (use_xsave()) {
+			if ((unsigned long)buf_fx % 64 || fx_only) {
+				u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
+				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				ret = copy_user_to_fxregs(buf_fx);
+			} else {
+				u64 init_bv = xfeatures_mask & ~xfeatures;
+				if (unlikely(init_bv))
+					copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				ret = copy_user_to_xregs(buf_fx, xfeatures);
+			}
+		} else if (use_fxsr()) {
+			ret = copy_user_to_fxregs(buf_fx);
+		} else
+			ret = copy_user_to_fregs(buf_fx);
+
+		if (ret) {
 			fpu__clear(fpu);
 			return -1;
 		}
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 21/27] x86/fpu: Let __fpu__restore_sig() restore the !32bit+fxsr frame from kernel memory
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (19 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 20/27] x86/fpu: Inline copy_user_to_fpregs_zeroing() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 21:00   ` [tip:x86/fpu] x86/fpu: Restore from kernel memory on the 64-bit path too tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 22/27] x86/fpu: Merge the two code paths in __fpu__restore_sig() Sebastian Andrzej Siewior
                   ` (8 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

The !32bit+fxsr case loads the new state from user memory. Once we
restore the FPU state on return to userland we can't do this anymore: it
would be required to disable preemption in order to avoid a context
switch, which would set TIF_NEED_FPU_LOAD. If that happens before the
"restore" operation then the loaded registers would become volatile.

Disabling preemption while accessing user memory requires disabling the
pagefault handler. An error during XRSTOR would then mean that either a
page fault occurred (and we have to retry with the page fault handler
enabled) or a #GP occurred because the xstate is bogus (after all, the
signal handler can modify it).

In order to avoid that mess, copy the FPU state from userland, validate
it and then load it. The copy_users_…() helpers are basically the old
helpers, except that they operate on kernel memory and the fault handler
just sets the error value which the caller handles.

copy_user_to_fpregs_zeroing() and its helpers remain and will be used
later for a fastpath optimisation.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h | 43 ++++++++++++++++++++
 arch/x86/kernel/fpu/signal.c        | 62 +++++++++++++++++++++++------
 2 files changed, 92 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index b12874b7cf0cf..037472d0be140 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -121,6 +121,21 @@ extern void fpstate_sanitize_xstate(struct fpu *fpu);
 	err;								\
 })
 
+#define kernel_insn_err(insn, output, input...)				\
+({									\
+	int err;							\
+	asm volatile("1:" #insn "\n\t"					\
+		     "2:\n"						\
+		     ".section .fixup,\"ax\"\n"				\
+		     "3:  movl $-1,%[err]\n"				\
+		     "    jmp  2b\n"					\
+		     ".previous\n"					\
+		     _ASM_EXTABLE(1b, 3b)				\
+		     : [err] "=r" (err), output				\
+		     : "0"(0), input);					\
+	err;								\
+})
+
 #define kernel_insn(insn, output, input...)				\
 	asm volatile("1:" #insn "\n\t"					\
 		     "2:\n"						\
@@ -149,6 +164,14 @@ static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
 		kernel_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
+static inline int copy_kernel_to_fxregs_err(struct fxregs_state *fx)
+{
+	if (IS_ENABLED(CONFIG_X86_32))
+		return kernel_insn_err(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
+	else
+		return kernel_insn_err(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
+}
+
 static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
@@ -162,6 +185,11 @@ static inline void copy_kernel_to_fregs(struct fregs_state *fx)
 	kernel_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
+static inline int copy_kernel_to_fregs_err(struct fregs_state *fx)
+{
+	return kernel_insn_err(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
+}
+
 static inline int copy_user_to_fregs(struct fregs_state __user *fx)
 {
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -361,6 +389,21 @@ static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
 	return err;
 }
 
+/*
+ * Restore xstate from kernel space xsave area, return an error code instead an
+ * exception.
+ */
+static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
+{
+	u32 lmask = mask;
+	u32 hmask = mask >> 32;
+	int err;
+
+	XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
+
+	return err;
+}
+
 /*
  * These must be called with preempt disabled. Returns
  * 'true' if the FPU state is still intact and we can
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index f733425c66d8d..8ee54a1958308 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -234,7 +234,8 @@ sanitize_restored_xstate(union fpregs_state *state,
 		 */
 		xsave->i387.mxcsr &= mxcsr_feature_mask;
 
-		convert_to_fxsr(&state->fxsave, ia32_env);
+		if (ia32_env)
+			convert_to_fxsr(&state->fxsave, ia32_env);
 	}
 }
 
@@ -337,28 +338,63 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		kfree(tmp);
 		return err;
 	} else {
+		union fpregs_state *state;
+		void *tmp;
 		int ret;
 
+		tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
+		if (!tmp)
+			return -ENOMEM;
+		state = PTR_ALIGN(tmp, 64);
+
 		/*
 		 * For 64-bit frames and 32-bit fsave frames, restore the user
 		 * state to the registers directly (with exceptions handled).
 		 */
-		if (use_xsave()) {
-			if ((unsigned long)buf_fx % 64 || fx_only) {
+		if ((unsigned long)buf_fx % 64)
+			fx_only = 1;
+
+		if (use_xsave() && !fx_only) {
+			u64 init_bv = xfeatures_mask & ~xfeatures;
+
+			if (using_compacted_format()) {
+				ret = copy_user_to_xstate(&state->xsave, buf_fx);
+			} else {
+				ret = __copy_from_user(&state->xsave, buf_fx, state_size);
+
+				if (!ret && state_size > offsetof(struct xregs_state, header))
+					ret = validate_xstate_header(&state->xsave.header);
+			}
+			if (ret)
+				goto err_out;
+			sanitize_restored_xstate(state, NULL, xfeatures,
+						 fx_only);
+
+			if (unlikely(init_bv))
+				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			ret = copy_kernel_to_xregs_err(&state->xsave, xfeatures);
+
+		} else if (use_fxsr()) {
+			ret = __copy_from_user(&state->fxsave, buf_fx, state_size);
+			if (ret)
+				goto err_out;
+
+			if (use_xsave()) {
 				u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
 				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-				ret = copy_user_to_fxregs(buf_fx);
-			} else {
-				u64 init_bv = xfeatures_mask & ~xfeatures;
-				if (unlikely(init_bv))
-					copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-				ret = copy_user_to_xregs(buf_fx, xfeatures);
 			}
-		} else if (use_fxsr()) {
-			ret = copy_user_to_fxregs(buf_fx);
-		} else
-			ret = copy_user_to_fregs(buf_fx);
+			state->fxsave.mxcsr &= mxcsr_feature_mask;
 
+			ret = copy_kernel_to_fxregs_err(&state->fxsave);
+		} else {
+			ret = __copy_from_user(&state->fsave, buf_fx, state_size);
+			if (ret)
+				goto err_out;
+			ret = copy_kernel_to_fregs_err(&state->fsave);
+		}
+
+err_out:
+		kfree(tmp);
 		if (ret) {
 			fpu__clear(fpu);
 			return -1;
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 22/27] x86/fpu: Merge the two code paths in __fpu__restore_sig()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (20 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 21/27] x86/fpu: Let __fpu__restore_sig() restore the !32bit+fxsr frame from kernel memory Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 21:01   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace Sebastian Andrzej Siewior
                   ` (7 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

The ia32_fxstate case (32bit with fxsr) and the other case (64bit, or
32bit without fxsr) both restore from kernel memory and sanitize the
content.
The !ia32_fxstate version restores missing xstates from the "init
state", while the ia32_fxstate version skips this.

Merge the two code paths and keep the !ia32_fxstate version. Copy the
user_i387_ia32_struct data structure only in the ia32_fxstate case.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 146 ++++++++++++++---------------------
 1 file changed, 57 insertions(+), 89 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 8ee54a1958308..2421fb17f643d 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -263,12 +263,17 @@ static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_
 
 static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 {
+	struct user_i387_ia32_struct *envp = NULL;
 	int ia32_fxstate = (buf != buf_fx);
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
 	int state_size = fpu_kernel_xstate_size;
+	struct user_i387_ia32_struct env;
+	union fpregs_state *state;
 	u64 xfeatures = 0;
 	int fx_only = 0;
+	int ret = 0;
+	void *tmp;
 
 	ia32_fxstate &= (IS_ENABLED(CONFIG_X86_32) ||
 			 IS_ENABLED(CONFIG_IA32_EMULATION));
@@ -303,105 +308,68 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		}
 	}
 
-	if (ia32_fxstate) {
-		/*
-		 * For 32-bit frames with fxstate, copy the user state to the
-		 * thread's fpu state, reconstruct fxstate from the fsave
-		 * header. Validate and sanitize the copied state.
-		 */
-		struct user_i387_ia32_struct env;
-		union fpregs_state *state;
-		int err = 0;
-		void *tmp;
+	tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
+	if (!tmp)
+		return -ENOMEM;
+	state = PTR_ALIGN(tmp, 64);
 
-		tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
-		if (!tmp)
-			return -ENOMEM;
-		state = PTR_ALIGN(tmp, 64);
+	if ((unsigned long)buf_fx % 64)
+		fx_only = 1;
+
+	/*
+	 * For 32-bit frames with fxstate, copy the fxstate so it can be
+	 * reconstructed later.
+	 */
+	if (ia32_fxstate) {
+		ret = __copy_from_user(&env, buf, sizeof(env));
+		if (ret)
+			goto err_out;
+		envp = &env;
+	}
+	if (use_xsave() && !fx_only) {
+		u64 init_bv = xfeatures_mask & ~xfeatures;
 
 		if (using_compacted_format()) {
-			err = copy_user_to_xstate(&state->xsave, buf_fx);
+			ret = copy_user_to_xstate(&state->xsave, buf_fx);
 		} else {
-			err = __copy_from_user(&state->xsave, buf_fx, state_size);
+			ret = __copy_from_user(&state->xsave, buf_fx, state_size);
 
-			if (!err && state_size > offsetof(struct xregs_state, header))
-				err = validate_xstate_header(&state->xsave.header);
+			if (!ret && state_size > offsetof(struct xregs_state, header))
+				ret = validate_xstate_header(&state->xsave.header);
+		}
+		if (ret)
+			goto err_out;
+
+		sanitize_restored_xstate(state, envp, xfeatures, fx_only);
+
+		if (unlikely(init_bv))
+			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+		ret = copy_kernel_to_xregs_err(&state->xsave, xfeatures);
+
+	} else if (use_fxsr()) {
+		ret = __copy_from_user(&state->fxsave, buf_fx, state_size);
+		if (ret)
+			goto err_out;
+
+		sanitize_restored_xstate(state, envp, xfeatures, fx_only);
+		if (use_xsave()) {
+			u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
+			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
 		}
 
-		if (err || __copy_from_user(&env, buf, sizeof(env))) {
-			err = -1;
-		} else {
-			sanitize_restored_xstate(state, &env, xfeatures, fx_only);
-			copy_kernel_to_fpregs(state);
-		}
-
-		kfree(tmp);
-		return err;
+		ret = copy_kernel_to_fxregs_err(&state->fxsave);
 	} else {
-		union fpregs_state *state;
-		void *tmp;
-		int ret;
-
-		tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
-		if (!tmp)
-			return -ENOMEM;
-		state = PTR_ALIGN(tmp, 64);
-
-		/*
-		 * For 64-bit frames and 32-bit fsave frames, restore the user
-		 * state to the registers directly (with exceptions handled).
-		 */
-		if ((unsigned long)buf_fx % 64)
-			fx_only = 1;
-
-		if (use_xsave() && !fx_only) {
-			u64 init_bv = xfeatures_mask & ~xfeatures;
-
-			if (using_compacted_format()) {
-				ret = copy_user_to_xstate(&state->xsave, buf_fx);
-			} else {
-				ret = __copy_from_user(&state->xsave, buf_fx, state_size);
-
-				if (!ret && state_size > offsetof(struct xregs_state, header))
-					ret = validate_xstate_header(&state->xsave.header);
-			}
-			if (ret)
-				goto err_out;
-			sanitize_restored_xstate(state, NULL, xfeatures,
-						 fx_only);
-
-			if (unlikely(init_bv))
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-			ret = copy_kernel_to_xregs_err(&state->xsave, xfeatures);
-
-		} else if (use_fxsr()) {
-			ret = __copy_from_user(&state->fxsave, buf_fx, state_size);
-			if (ret)
-				goto err_out;
-
-			if (use_xsave()) {
-				u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-			}
-			state->fxsave.mxcsr &= mxcsr_feature_mask;
-
-			ret = copy_kernel_to_fxregs_err(&state->fxsave);
-		} else {
-			ret = __copy_from_user(&state->fsave, buf_fx, state_size);
-			if (ret)
-				goto err_out;
-			ret = copy_kernel_to_fregs_err(&state->fsave);
-		}
-
-err_out:
-		kfree(tmp);
-		if (ret) {
-			fpu__clear(fpu);
-			return -1;
-		}
+		ret = __copy_from_user(&state->fsave, buf_fx, state_size);
+		if (ret)
+			goto err_out;
+		ret = copy_kernel_to_fregs_err(&state->fsave);
 	}
 
-	return 0;
+err_out:
+	kfree(tmp);
+	if (ret)
+		fpu__clear(fpu);
+	return ret;
 }
 
 static inline int xstate_sigframe_size(void)
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (21 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 22/27] x86/fpu: Merge the two code paths in __fpu__restore_sig() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-12 14:36   ` Borislav Petkov
  2019-04-13 21:02   ` [tip:x86/fpu] " tip-bot for Rik van Riel
  2019-04-03 16:41 ` [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig() Sebastian Andrzej Siewior
                   ` (6 subsequent siblings)
  29 siblings, 2 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

From: Rik van Riel <riel@surriel.com>

Defer loading of FPU state until return to userspace. This gives
the kernel the potential to skip loading FPU state for tasks that
stay in kernel mode, or for tasks that end up with repeated
invocations of kernel_fpu_begin() & kernel_fpu_end().

The fpregs_lock()/fpregs_unlock() section ensures that the registers
remain unchanged. Otherwise a context switch or a BH could save the
registers to its FPU context and the processor's FPU registers would
become random if modified at the same time.
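
A minimal sketch of the usage pattern this implies (illustrative only,
built from the helpers this patch introduces):

	fpregs_lock();
	if (test_thread_flag(TIF_NEED_FPU_LOAD))
		__fpregs_load_activate();	/* make the registers valid */
	/* ... access the CPU's FPU registers or fpu->state ... */
	fpregs_unlock();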

KVM swaps the host/guest registers on the entry/exit path. I kept the
flow as is: first it ensures that the registers are loaded and then
saves the current (host) state before it loads the guest's registers.
The swap is done at the very end with interrupts disabled so it should
not change anymore before the guest is entered. The read/save version
seems to be cheaper than memcpy() in a micro benchmark.

Each thread gets TIF_NEED_FPU_LOAD set as part of fork() / fpu__copy().
For kernel threads, this flag is never cleared, which avoids saving /
restoring the FPU state for kernel threads and during in-kernel usage of
the FPU registers.

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/entry/common.c             |   8 +++
 arch/x86/include/asm/fpu/api.h      |  22 +++++-
 arch/x86/include/asm/fpu/internal.h |  27 ++++---
 arch/x86/include/asm/trace/fpu.h    |   5 +-
 arch/x86/kernel/fpu/core.c          | 105 +++++++++++++++++++++-------
 arch/x86/kernel/fpu/signal.c        |  48 ++++++++-----
 arch/x86/kernel/process.c           |   2 +-
 arch/x86/kernel/process_32.c        |   5 +-
 arch/x86/kernel/process_64.c        |   5 +-
 arch/x86/kvm/x86.c                  |  20 ++++--
 10 files changed, 181 insertions(+), 66 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 7bc105f47d21a..13e8e29af6ab7 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -31,6 +31,7 @@
 #include <asm/vdso.h>
 #include <linux/uaccess.h>
 #include <asm/cpufeature.h>
+#include <asm/fpu/api.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
@@ -196,6 +197,13 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 	if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
 		exit_to_usermode_loop(regs, cached_flags);
 
+	/* Reload ti->flags; we may have rescheduled above. */
+	cached_flags = READ_ONCE(ti->flags);
+
+	fpregs_assert_state_consistent();
+	if (unlikely(cached_flags & _TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+
 #ifdef CONFIG_COMPAT
 	/*
 	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index 73e684160f354..fc3dd044e16dd 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -10,7 +10,7 @@
 
 #ifndef _ASM_X86_FPU_API_H
 #define _ASM_X86_FPU_API_H
-#include <linux/preempt.h>
+#include <linux/bottom_half.h>
 
 /*
  * Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
@@ -22,17 +22,37 @@
 extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
 extern bool irq_fpu_usable(void);
+extern void fpregs_mark_activate(void);
 
+/*
+ * Use fpregs_lock() while editing CPU's FPU registers or fpu->state.
+ * A context switch will (and a softirq might) save CPU's FPU registers to
+ * fpu->state and set TIF_NEED_FPU_LOAD, leaving CPU's FPU registers in a
+ * random state.
+ */
 static inline void fpregs_lock(void)
 {
 	preempt_disable();
+	local_bh_disable();
 }
 
 static inline void fpregs_unlock(void)
 {
+	local_bh_enable();
 	preempt_enable();
 }
 
+#ifdef CONFIG_X86_DEBUG_FPU
+extern void fpregs_assert_state_consistent(void);
+#else
+static inline void fpregs_assert_state_consistent(void) { }
+#endif
+
+/*
+ * Load the task FPU state before returning to userspace.
+ */
+extern void switch_fpu_return(void);
+
 /*
  * Query the presence of one or more xfeatures. Works on any legacy CPU as well.
  *
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 037472d0be140..7a143b86d3a65 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -30,7 +30,7 @@ extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
-extern int  fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
+extern int  fpu__copy(struct task_struct *dst, struct task_struct *src);
 extern void fpu__clear(struct fpu *fpu);
 extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);
 extern int  dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate);
@@ -528,13 +528,20 @@ static inline void fpregs_activate(struct fpu *fpu)
 	trace_x86_fpu_regs_activated(fpu);
 }
 
-static inline void __fpregs_load_activate(struct fpu *fpu, int cpu)
+static inline void __fpregs_load_activate(void)
 {
+	struct fpu *fpu = &current->thread.fpu;
+	int cpu = smp_processor_id();
+
+	if (WARN_ON_ONCE(current->mm == NULL))
+		return;
+
 	if (!fpregs_state_valid(fpu, cpu)) {
-		if (current->mm)
-			copy_kernel_to_fpregs(&fpu->state);
+		copy_kernel_to_fpregs(&fpu->state);
 		fpregs_activate(fpu);
+		fpu->last_cpu = cpu;
 	}
+	clear_thread_flag(TIF_NEED_FPU_LOAD);
 }
 
 /*
@@ -545,8 +552,8 @@ static inline void __fpregs_load_activate(struct fpu *fpu, int cpu)
  *  - switch_fpu_prepare() saves the old state.
  *    This is done within the context of the old process.
  *
- *  - switch_fpu_finish() restores the new state as
- *    necessary.
+ *  - switch_fpu_finish() sets TIF_NEED_FPU_LOAD; the floating point state
+ *    will get loaded on return to userspace, or when the kernel needs it.
  *
  * The FPU context is only stored/restore for user task and ->mm is used to
  * distinguish between kernel and user threads.
@@ -576,10 +583,10 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 
 /*
- * Set up the userspace FPU context for the new task, if the task
- * has used the FPU.
+ * Load PKRU from the FPU context if available. Delay the loading of the
+ * complete FPU state until the return to userland.
  */
-static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
+static inline void switch_fpu_finish(struct fpu *new_fpu)
 {
 	struct pkru_state *pk;
 	u32 pkru_val = init_pkru_value;
@@ -587,7 +594,7 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 	if (!static_cpu_has(X86_FEATURE_FPU))
 		return;
 
-	__fpregs_load_activate(new_fpu, cpu);
+	set_thread_flag(TIF_NEED_FPU_LOAD);
 
 	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index bd65f6ba950f8..91a1422091ceb 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -13,19 +13,22 @@ DECLARE_EVENT_CLASS(x86_fpu,
 
 	TP_STRUCT__entry(
 		__field(struct fpu *, fpu)
+		__field(bool, load_fpu)
 		__field(u64, xfeatures)
 		__field(u64, xcomp_bv)
 		),
 
 	TP_fast_assign(
 		__entry->fpu		= fpu;
+		__entry->load_fpu	= test_thread_flag(TIF_NEED_FPU_LOAD);
 		if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
 			__entry->xfeatures = fpu->state.xsave.header.xfeatures;
 			__entry->xcomp_bv  = fpu->state.xsave.header.xcomp_bv;
 		}
 	),
-	TP_printk("x86/fpu: %p xfeatures: %llx xcomp_bv: %llx",
+	TP_printk("x86/fpu: %p load: %d xfeatures: %llx xcomp_bv: %llx",
 			__entry->fpu,
+			__entry->load_fpu,
 			__entry->xfeatures,
 			__entry->xcomp_bv
 	)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 739ca3ae2bdcd..7d68404a88d52 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -102,23 +102,20 @@ static void __kernel_fpu_begin(void)
 	kernel_fpu_disable();
 
 	if (current->mm) {
-		/*
-		 * Ignore return value -- we don't care if reg state
-		 * is clobbered.
-		 */
-		copy_fpregs_to_fpstate(fpu);
-	} else {
-		__cpu_invalidate_fpregs_state();
+		if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+			set_thread_flag(TIF_NEED_FPU_LOAD);
+			/*
+			 * Ignore return value -- we don't care if reg state
+			 * is clobbered.
+			 */
+			copy_fpregs_to_fpstate(fpu);
+		}
 	}
+	__cpu_invalidate_fpregs_state();
 }
 
 static void __kernel_fpu_end(void)
 {
-	struct fpu *fpu = &current->thread.fpu;
-
-	if (current->mm)
-		copy_kernel_to_fpregs(&fpu->state);
-
 	kernel_fpu_enable();
 }
 
@@ -145,14 +142,17 @@ void fpu__save(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
-	preempt_disable();
+	fpregs_lock();
 	trace_x86_fpu_before_save(fpu);
 
-	if (!copy_fpregs_to_fpstate(fpu))
-		copy_kernel_to_fpregs(&fpu->state);
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+		if (!copy_fpregs_to_fpstate(fpu)) {
+			copy_kernel_to_fpregs(&fpu->state);
+		}
+	}
 
 	trace_x86_fpu_after_save(fpu);
-	preempt_enable();
+	fpregs_unlock();
 }
 EXPORT_SYMBOL_GPL(fpu__save);
 
@@ -185,8 +185,11 @@ void fpstate_init(union fpregs_state *state)
 }
 EXPORT_SYMBOL_GPL(fpstate_init);
 
-int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
+int fpu__copy(struct task_struct *dst, struct task_struct *src)
 {
+	struct fpu *dst_fpu = &dst->thread.fpu;
+	struct fpu *src_fpu = &src->thread.fpu;
+
 	dst_fpu->last_cpu = -1;
 
 	if (!static_cpu_has(X86_FEATURE_FPU))
@@ -201,16 +204,23 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_size);
 
 	/*
-	 * Save current FPU registers directly into the child
-	 * FPU context, without any memory-to-memory copying.
+	 * If the FPU registers are not valid for the current task, just
+	 * memcpy() the state. Otherwise save the current FPU registers directly
+	 * into the child's FPU context, without any memory-to-memory copying.
 	 *
 	 * ( The function 'fails' in the FNSAVE case, which destroys
-	 *   register contents so we have to copy them back. )
+	 *   register contents so we have to load them back. )
 	 */
-	if (!copy_fpregs_to_fpstate(dst_fpu)) {
-		memcpy(&src_fpu->state, &dst_fpu->state, fpu_kernel_xstate_size);
-		copy_kernel_to_fpregs(&src_fpu->state);
-	}
+	fpregs_lock();
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
+
+	else if (!copy_fpregs_to_fpstate(dst_fpu))
+		copy_kernel_to_fpregs(&dst_fpu->state);
+
+	fpregs_unlock();
+
+	set_tsk_thread_flag(dst, TIF_NEED_FPU_LOAD);
 
 	trace_x86_fpu_copy_src(src_fpu);
 	trace_x86_fpu_copy_dst(dst_fpu);
@@ -226,10 +236,9 @@ static void fpu__initialize(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
+	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpstate_init(&fpu->state);
 	trace_x86_fpu_init_state(fpu);
-
-	trace_x86_fpu_activate_state(fpu);
 }
 
 /*
@@ -308,6 +317,8 @@ void fpu__drop(struct fpu *fpu)
  */
 static inline void copy_init_fpstate_to_fpregs(void)
 {
+	fpregs_lock();
+
 	if (use_xsave())
 		copy_kernel_to_xregs(&init_fpstate.xsave, -1);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
@@ -317,6 +328,9 @@ static inline void copy_init_fpstate_to_fpregs(void)
 
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
+
+	fpregs_mark_activate();
+	fpregs_unlock();
 }
 
 /*
@@ -339,6 +353,45 @@ void fpu__clear(struct fpu *fpu)
 		copy_init_fpstate_to_fpregs();
 }
 
+/*
+ * Load FPU context before returning to userspace.
+ */
+void switch_fpu_return(void)
+{
+	if (!static_cpu_has(X86_FEATURE_FPU))
+		return;
+
+	__fpregs_load_activate();
+}
+EXPORT_SYMBOL_GPL(switch_fpu_return);
+
+#ifdef CONFIG_X86_DEBUG_FPU
+/*
+ * If the current FPU state, according to its tracking (loaded FPU context on
+ * this CPU), is not valid then TIF_NEED_FPU_LOAD must be set, so the context
+ * is loaded on return to userland.
+ */
+void fpregs_assert_state_consistent(void)
+{
+       struct fpu *fpu = &current->thread.fpu;
+
+       if (test_thread_flag(TIF_NEED_FPU_LOAD))
+               return;
+       WARN_ON_FPU(!fpregs_state_valid(fpu, smp_processor_id()));
+}
+EXPORT_SYMBOL_GPL(fpregs_assert_state_consistent);
+#endif
+
+void fpregs_mark_activate(void)
+{
+	struct fpu *fpu = &current->thread.fpu;
+
+	fpregs_activate(fpu);
+	fpu->last_cpu = smp_processor_id();
+	clear_thread_flag(TIF_NEED_FPU_LOAD);
+}
+EXPORT_SYMBOL_GPL(fpregs_mark_activate);
+
 /*
  * x87 math exception handling:
  */
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 2421fb17f643d..a5b086ec426a5 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -269,11 +269,9 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	struct fpu *fpu = &tsk->thread.fpu;
 	int state_size = fpu_kernel_xstate_size;
 	struct user_i387_ia32_struct env;
-	union fpregs_state *state;
 	u64 xfeatures = 0;
 	int fx_only = 0;
 	int ret = 0;
-	void *tmp;
 
 	ia32_fxstate &= (IS_ENABLED(CONFIG_X86_32) ||
 			 IS_ENABLED(CONFIG_IA32_EMULATION));
@@ -308,14 +306,18 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		}
 	}
 
-	tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
-	if (!tmp)
-		return -ENOMEM;
-	state = PTR_ALIGN(tmp, 64);
+	/*
+	 * The current state of the FPU registers does not matter. By setting
+	 * TIF_NEED_FPU_LOAD unconditionally it is ensured that our xstate is
+	 * not modified on context switch and that the xstate is considered
+	 * to be loaded again on return to userland (overriding last_cpu avoids
+	 * the optimisation).
+	 */
+	set_thread_flag(TIF_NEED_FPU_LOAD);
+	__fpu_invalidate_fpregs_state(fpu);
 
 	if ((unsigned long)buf_fx % 64)
 		fx_only = 1;
-
 	/*
 	 * For 32-bit frames with fxstate, copy the fxstate so it can be
 	 * reconstructed later.
@@ -330,43 +332,51 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		u64 init_bv = xfeatures_mask & ~xfeatures;
 
 		if (using_compacted_format()) {
-			ret = copy_user_to_xstate(&state->xsave, buf_fx);
+			ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
 		} else {
-			ret = __copy_from_user(&state->xsave, buf_fx, state_size);
+			ret = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);
 
 			if (!ret && state_size > offsetof(struct xregs_state, header))
-				ret = validate_xstate_header(&state->xsave.header);
+				ret = validate_xstate_header(&fpu->state.xsave.header);
 		}
 		if (ret)
 			goto err_out;
 
-		sanitize_restored_xstate(state, envp, xfeatures, fx_only);
+		sanitize_restored_xstate(&fpu->state, envp, xfeatures, fx_only);
 
+		fpregs_lock();
 		if (unlikely(init_bv))
 			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-		ret = copy_kernel_to_xregs_err(&state->xsave, xfeatures);
+		ret = copy_kernel_to_xregs_err(&fpu->state.xsave, xfeatures);
 
 	} else if (use_fxsr()) {
-		ret = __copy_from_user(&state->fxsave, buf_fx, state_size);
-		if (ret)
+		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
+		if (ret) {
+			ret = -EFAULT;
 			goto err_out;
+		}
 
-		sanitize_restored_xstate(state, envp, xfeatures, fx_only);
+		sanitize_restored_xstate(&fpu->state, envp, xfeatures, fx_only);
+
+		fpregs_lock();
 		if (use_xsave()) {
 			u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
 			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
 		}
 
-		ret = copy_kernel_to_fxregs_err(&state->fxsave);
+		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
 	} else {
-		ret = __copy_from_user(&state->fsave, buf_fx, state_size);
+		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)
 			goto err_out;
-		ret = copy_kernel_to_fregs_err(&state->fsave);
+		fpregs_lock();
+		ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
 	}
+	if (!ret)
+		fpregs_mark_activate();
+	fpregs_unlock();
 
 err_out:
-	kfree(tmp);
 	if (ret)
 		fpu__clear(fpu);
 	return ret;
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 58ac7be52c7a6..b9b8e6eb74f27 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -101,7 +101,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	dst->thread.vm86 = NULL;
 #endif
 
-	return fpu__copy(&dst->thread.fpu, &src->thread.fpu);
+	return fpu__copy(dst, src);
 }
 
 /*
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 77d9eb43ccac8..1bc47f3a48854 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -234,7 +234,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
-	switch_fpu_prepare(prev_fpu, cpu);
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_prepare(prev_fpu, cpu);
 
 	/*
 	 * Save away %gs. No need to save %fs, as it was saved on the
@@ -290,7 +291,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	this_cpu_write(current_task, next_p);
 
-	switch_fpu_finish(next_fpu, cpu);
+	switch_fpu_finish(next_fpu);
 
 	/* Load the Intel cache allocation PQR MSR. */
 	resctrl_sched_in();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ffea7c557963a..37b2ecef041e6 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -520,7 +520,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ENTRY) &&
 		     this_cpu_read(irq_count) != -1);
 
-	switch_fpu_prepare(prev_fpu, cpu);
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_prepare(prev_fpu, cpu);
 
 	/* We must save %fs and %gs before load_TLS() because
 	 * %fs and %gs may be cleared by load_TLS().
@@ -572,7 +573,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	this_cpu_write(current_task, next_p);
 	this_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
 
-	switch_fpu_finish(next_fpu, cpu);
+	switch_fpu_finish(next_fpu);
 
 	/* Reload sp0. */
 	update_task_stack(next_p);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8022e7769b3a1..e340c3c0cba32 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7877,6 +7877,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		wait_lapic_expire(vcpu);
 	guest_enter_irqoff();
 
+	fpregs_assert_state_consistent();
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+
 	if (unlikely(vcpu->arch.switch_db_regs)) {
 		set_debugreg(0, 7);
 		set_debugreg(vcpu->arch.eff_db[0], 0);
@@ -8137,22 +8141,30 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
 /* Swap (qemu) user FPU context for the guest FPU context. */
 static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 {
-	preempt_disable();
+	fpregs_lock();
+
 	copy_fpregs_to_fpstate(&current->thread.fpu);
 	/* PKRU is separately restored in kvm_x86_ops->run.  */
 	__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu->state,
 				~XFEATURE_MASK_PKRU);
-	preempt_enable();
+
+	fpregs_mark_activate();
+	fpregs_unlock();
+
 	trace_kvm_fpu(1);
 }
 
 /* When vcpu_run ends, restore user space FPU context. */
 static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 {
-	preempt_disable();
+	fpregs_lock();
+
 	copy_fpregs_to_fpstate(vcpu->arch.guest_fpu);
 	copy_kernel_to_fpregs(&current->thread.fpu.state);
-	preempt_enable();
+
+	fpregs_mark_activate();
+	fpregs_unlock();
+
 	++vcpu->stat.fpu_reload;
 	trace_kvm_fpu(0);
 }
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (22 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-08 17:05   ` Thomas Gleixner
                     ` (2 more replies)
  2019-04-03 16:41 ` [PATCH 25/27] x86/fpu: Add a fastpath to copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
                   ` (5 subsequent siblings)
  29 siblings, 3 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

The previous commits refactor the restoration of the FPU registers so
that they can be loaded from in-kernel memory. This overhead can be
avoided if the load can be performed without a pagefault.

Attempt to restore the FPU registers directly from userland by invoking
copy_user_to_fpregs_zeroing(). If that fails, try the slowpath which can
handle pagefaults.
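
Condensed, the fastpath added below looks like this (comments added here
for illustration):

	fpregs_lock();			/* no preemption, no BH */
	pagefault_disable();		/* a fault fails instead of sleeping */
	ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
	pagefault_enable();
	if (!ret) {
		fpregs_mark_activate();	/* registers now valid for current */
		fpregs_unlock();
		return 0;
	}
	fpregs_unlock();
	/* fall through to the slowpath, which copies via fpu->state */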

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a5b086ec426a5..f20e1d1fffa29 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -242,10 +242,10 @@ sanitize_restored_xstate(union fpregs_state *state,
 /*
  * Restore the extended state if present. Otherwise, restore the FP/SSE state.
  */
-static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
+static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 {
 	if (use_xsave()) {
-		if ((unsigned long)buf % 64 || fx_only) {
+		if (fx_only) {
 			u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
 			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
 			return copy_user_to_fxregs(buf);
@@ -327,7 +327,19 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		if (ret)
 			goto err_out;
 		envp = &env;
+	} else {
+		fpregs_lock();
+		pagefault_disable();
+		ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
+		pagefault_enable();
+		if (!ret) {
+			fpregs_mark_activate();
+			fpregs_unlock();
+			return 0;
+		}
+		fpregs_unlock();
 	}
+
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask & ~xfeatures;
 
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 25/27] x86/fpu: Add a fastpath to copy_fpstate_to_sigframe()
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (23 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 21:03   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 26/27] x86/fpu: Restore FPU register in copy_fpstate_to_sigframe() in order to use the fastpath Sebastian Andrzej Siewior
                   ` (4 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

If the CPU holds the FPU registers for the current task then we can try to
save them directly to the userland stack frame. This has to be done with
pagefaults disabled because we can't fault (while the FPU registers are
locked) and therefore the operation might fail.
If it fails, try the slowpath which can handle faults.
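
Condensed, the attempted fast save looks like this (comments added here
for illustration):

	fpregs_lock();
	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
		/* registers are live: write them straight to the sigframe;
		 * with pagefaults disabled this cannot sleep */
		pagefault_disable();
		ret = copy_fpregs_to_sigframe(buf_fx);
		pagefault_enable();
		if (ret)	/* failed: keep the state for the slowpath */
			copy_fpregs_to_fpstate(fpu);
		set_thread_flag(TIF_NEED_FPU_LOAD);
	}
	fpregs_unlock();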

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index f20e1d1fffa29..baf1588d7060c 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -144,8 +144,10 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  *	buf == buf_fx for 64-bit frames and 32-bit fsave frame.
  *	buf != buf_fx for 32-bit frames with fxstate.
  *
- * Save the state to task's fpu->state and then copy it to the user frame
- * pointed by the aligned pointer 'buf_fx'.
+ * Try to save it directly to the user frame with the page fault handler
+ * disabled. If this fails then do the slow path where the FPU state is first
+ * saved to the task's fpu->state and then copied to the user frame pointed to
+ * by the aligned pointer 'buf_fx'.
  *
  * If this is a 32-bit frame with fxstate, put a fsave header before
  * the aligned state at 'buf_fx'.
@@ -159,6 +161,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 	struct xregs_state *xsave = &fpu->state.xsave;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
+	int ret = -EFAULT;
 
 	ia32_fxstate &= (IS_ENABLED(CONFIG_X86_32) ||
 			 IS_ENABLED(CONFIG_IA32_EMULATION));
@@ -174,22 +177,29 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 	fpregs_lock();
 	/*
 	 * If we do not need to load the FPU registers at return to userspace
-	 * then the CPU has the current state and we need to save it. Otherwise
-	 * it is already done and we can skip it.
+	 * then the CPU has the current state. Try to save it directly to
+	 * userland's stack frame if it does not cause a pagefault. If it does,
+	 * try the slowpath.
 	 */
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		copy_fpregs_to_fpstate(fpu);
+		pagefault_disable();
+		ret = copy_fpregs_to_sigframe(buf_fx);
+		pagefault_enable();
+		if (ret)
+			copy_fpregs_to_fpstate(fpu);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	fpregs_unlock();
 
-	if (using_compacted_format()) {
-		if (copy_xstate_to_user(buf_fx, xsave, 0, size))
-			return -1;
-	} else {
-		fpstate_sanitize_xstate(fpu);
-		if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
-			return -1;
+	if (ret) {
+		if (using_compacted_format()) {
+			if (copy_xstate_to_user(buf_fx, xsave, 0, size))
+				return -1;
+		} else {
+			fpstate_sanitize_xstate(fpu);
+			if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
+				return -1;
+		}
 	}
 
 	/* Save the fsave header for the 32-bit frames. */
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 26/27] x86/fpu: Restore FPU register in copy_fpstate_to_sigframe() in order to use the fastpath
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (24 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 25/27] x86/fpu: Add a fastpath to copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 21:04   ` [tip:x86/fpu] x86/fpu: Restore regs " tip-bot for Sebastian Andrzej Siewior
  2019-04-03 16:41 ` [PATCH 27/27] x86/pkeys: add PKRU value to init_fpstate Sebastian Andrzej Siewior
                   ` (3 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

If a task is scheduled out and receives a signal then it won't be able to
take the fastpath because the registers aren't available. The slowpath is
more expensive than the XRSTOR + XSAVE combination, which usually succeeds.

Some clock_gettime() numbers from a bigger box with AVX512 during bootup:
- __fpregs_load_activate() takes 140ns - 350ns. If it was the most recent FPU
  context on the CPU then the optimisation in __fpregs_load_activate() will
  skip the load (which was disabled during the test).

- copy_fpregs_to_sigframe() takes 200ns - 450ns if it succeeds. On a
  pagefault it is 1.8us - 3us, usually in the 2.6us area.

- The slowpath takes 1.5us - 6us, usually in the 2.6us area.

My testcases (including lat_sig) take the fastpath without needing
__fpregs_load_activate(). I expect this to be the common case.

Since the slowpath is in the >1us area it makes sense to load the
registers and attempt to save them directly. The direct save may fail,
but that should only happen on the first invocation or after fork()
while the page is still read-only (copy-on-write).
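
Back-of-the-envelope, using the numbers above (rough orders of magnitude
only): the worst-case fastpath is ~350ns (__fpregs_load_activate()) plus
~450ns (direct save), i.e. ~0.8us, which is still below the 1.5us best
case of the slowpath. Only when the direct save pagefaults (1.8us - 3us)
does this lose, and that is expected to be rare.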

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index baf1588d7060c..16f700d5b3a47 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -176,19 +176,20 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 
 	fpregs_lock();
 	/*
-	 * If we do not need to load the FPU registers at return to userspace
-	 * then the CPU has the current state. Try to save it directly to
-	 * userland's stack frame if it does not cause a pagefault. If it does,
-	 * try the slowpath.
+	 * Load the FPU registers if they are not valid for the current task.
+	 * With a valid FPU state we can attempt to save the state directly to
+	 * userland's stack frame which will likely succeed. If it does not, do
+	 * the slowpath.
 	 */
-	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		pagefault_disable();
-		ret = copy_fpregs_to_sigframe(buf_fx);
-		pagefault_enable();
-		if (ret)
-			copy_fpregs_to_fpstate(fpu);
-		set_thread_flag(TIF_NEED_FPU_LOAD);
-	}
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		__fpregs_load_activate();
+
+	pagefault_disable();
+	ret = copy_fpregs_to_sigframe(buf_fx);
+	pagefault_enable();
+	if (ret && !test_thread_flag(TIF_NEED_FPU_LOAD))
+		copy_fpregs_to_fpstate(fpu);
+	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpregs_unlock();
 
 	if (ret) {
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [PATCH 27/27] x86/pkeys: add PKRU value to init_fpstate
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (25 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 26/27] x86/fpu: Restore FPU register in copy_fpstate_to_sigframe() in order to use the fastpath Sebastian Andrzej Siewior
@ 2019-04-03 16:41 ` Sebastian Andrzej Siewior
  2019-04-13 21:05   ` [tip:x86/fpu] x86/pkeys: Add " tip-bot for Sebastian Andrzej Siewior
  2019-04-04 14:01 ` [PATCH v9 00/27] x86: load FPU registers on return to userland David Laight
                   ` (2 subsequent siblings)
  29 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-03 16:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen,
	Sebastian Andrzej Siewior

The task's initial PKRU value is set as part of fpu__clear()/
copy_init_pkru_to_fpregs(). It is not part of init_fpstate.xsave;
instead it is set explicitly.
If the user removes the PKRU state from XSAVE in the signal handler then
__fpu__restore_sig() will restore the missing bits from `init_fpstate'
and initialize the PKRU value to 0.

Add the `init_pkru_value' to `init_fpstate' so it is set to the init
value in such a case.

In theory we could drop copy_init_pkru_to_fpregs() because restoring the
PKRU at return-to-userland should be enough.
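
For illustration, user space can drop the PKRU component from the
sigframe by clearing its bit in the XSTATE header before sigreturn. A
hypothetical sketch (the headers, the cast and the field names follow the
uapi sigcontext layout as I read it; treat them as assumptions, error
handling omitted):

	#include <signal.h>
	#include <ucontext.h>

	#define XFEATURE_MASK_PKRU	(1ULL << 9)

	static void handler(int sig, siginfo_t *si, void *ctx)
	{
		ucontext_t *uc = ctx;
		struct _xstate *xs = (struct _xstate *)uc->uc_mcontext.fpregs;

		/* On sigreturn the kernel sees the PKRU bit cleared and
		 * (with this patch) restores PKRU from init_fpstate. */
		xs->xstate_hdr.xfeatures &= ~XFEATURE_MASK_PKRU;
	}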

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/cpu/common.c | 5 +++++
 arch/x86/mm/pkeys.c          | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cb28e98a0659a..352fa19e63110 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -372,6 +372,8 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
+	struct pkru_state *pk;
+
 	/* check the boot processor, plus compile options for PKU: */
 	if (!cpu_feature_enabled(X86_FEATURE_PKU))
 		return;
@@ -382,6 +384,9 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 		return;
 
 	cr4_set_bits(X86_CR4_PKE);
+	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
+	if (pk)
+		pk->pkru = init_pkru_value;
 	/*
 	 * Seting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
 	 * cpuid bit to be set.  We need to ensure that we
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 2ecbf4155f98f..1dcfc91c8f0c3 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -18,6 +18,7 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
+#include <asm/fpu/internal.h>		/* init_fpstate			*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -161,6 +162,7 @@ static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf,
 static ssize_t init_pkru_write_file(struct file *file,
 		 const char __user *user_buf, size_t count, loff_t *ppos)
 {
+	struct pkru_state *pk;
 	char buf[32];
 	ssize_t len;
 	u32 new_init_pkru;
@@ -183,6 +185,10 @@ static ssize_t init_pkru_write_file(struct file *file,
 		return -EINVAL;
 
 	WRITE_ONCE(init_pkru_value, new_init_pkru);
+	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
+	if (!pk)
+		return -EINVAL;
+	pk->pkru = new_init_pkru;
 	return count;
 }
 
-- 
2.20.1

^ permalink raw reply	[flat|nested] 83+ messages in thread

* RE: [PATCH v9 00/27] x86: load FPU registers on return to userland
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (26 preceding siblings ...)
  2019-04-03 16:41 ` [PATCH 27/27] x86/pkeys: add PKRU value to init_fpstate Sebastian Andrzej Siewior
@ 2019-04-04 14:01 ` David Laight
  2019-04-04 14:14   ` 'Sebastian Andrzej Siewior'
  2019-04-08 17:08 ` Thomas Gleixner
  2019-04-12 18:30 ` Borislav Petkov
  29 siblings, 1 reply; 83+ messages in thread
From: David Laight @ 2019-04-04 14:01 UTC (permalink / raw)
  To: 'Sebastian Andrzej Siewior', linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

From: Sebastian Andrzej Siewior
> Sent: 03 April 2019 17:41
...
> To access the FPU registers in kernel we need:
> - disable preemption to avoid that the scheduler switches tasks. By
>   doing so it would set TIF_NEED_FPU_LOAD and the FPU registers would be
>   not valid.
> - disable BH because the softirq might use kernel_fpu_begin() and then
>   set TIF_NEED_FPU_LOAD instead loading the FPU registers on completion.

Is there a possible optimisation here for kernel threads?
Since there is no 'user FP state' the 'kernel FP state' can
be saved by a task switch or softirq.

You'd still want the kernel thread to bracket fpu usage (or at least say
that it is a kernel thread that always needs the fpu) to avoid having
to save and restore the fpu registers on every context switch to/from
a kernel thread.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 00/27] x86: load FPU registers on return to userland
  2019-04-04 14:01 ` [PATCH v9 00/27] x86: load FPU registers on return to userland David Laight
@ 2019-04-04 14:14   ` 'Sebastian Andrzej Siewior'
  2019-04-04 14:26     ` Andy Lutomirski
  0 siblings, 1 reply; 83+ messages in thread
From: 'Sebastian Andrzej Siewior' @ 2019-04-04 14:14 UTC (permalink / raw)
  To: David Laight
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-04 14:01:43 [+0000], David Laight wrote:
> From: Sebastian Andrzej Siewior
> > Sent: 03 April 2019 17:41
> ...
> > To access the FPU registers in kernel we need:
> > - disable preemption to avoid that the scheduler switches tasks. By
> >   doing so it would set TIF_NEED_FPU_LOAD and the FPU registers would be
> >   not valid.
> > - disable BH because the softirq might use kernel_fpu_begin() and then
> >   set TIF_NEED_FPU_LOAD instead loading the FPU registers on completion.
> 
> Is there a possible optimisation here for kernel threads?
> Since there is no 'user FP state' the 'kernel FP state' can
> be saved by a task switch or softirq.

There is no such thing as "kernel FP state" that is saved.

> You'd still want the kernel thread to bracket fpu usage (or at least say
> that it is a kernel thread that always needs the fpu) to avoid having
> to save and restore the fpu registers on every context switch to/from
> a kernel thread.

I think you misunderstood the code. A context switch between two kernel
threads never saves/restores the FPU registers. This only happens if a
user task is involved, and then only the FPU registers of the user
task are saved and restored.
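
A condensed sketch of why (taken from the __switch_to() changes in this
series): a kernel thread always has TIF_NEED_FPU_LOAD set, so the save is
skipped, and the finish step only sets a flag:

	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
		switch_fpu_prepare(prev_fpu, cpu);	/* save user regs */
	...
	switch_fpu_finish(next_fpu);	/* sets TIF_NEED_FPU_LOAD, no load */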

> 	David

Sebastian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 00/27] x86: load FPU registers on return to userland
  2019-04-04 14:14   ` 'Sebastian Andrzej Siewior'
@ 2019-04-04 14:26     ` Andy Lutomirski
  2019-04-04 14:31       ` Sebastian Andrzej Siewior
  2019-04-04 15:10       ` David Laight
  0 siblings, 2 replies; 83+ messages in thread
From: Andy Lutomirski @ 2019-04-04 14:26 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: David Laight, linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Thu, Apr 4, 2019 at 7:14 AM Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2019-04-04 14:01:43 [+0000], David Laight wrote:
> > From: Sebastian Andrzej Siewior
> > > Sent: 03 April 2019 17:41
> > ...
> > > To access the FPU registers in kernel we need:
> > > - disable preemption to avoid that the scheduler switches tasks. By
> > >   doing so it would set TIF_NEED_FPU_LOAD and the FPU registers would be
> > >   not valid.
> > > - disable BH because the softirq might use kernel_fpu_begin() and then
> > >   set TIF_NEED_FPU_LOAD instead loading the FPU registers on completion.
> >
> > Is there a possible optimisation here for kernel threads?
> > Since there is no 'user FP state' the 'kernel FP state' can
> > be saved by a task switch or softirq.
>
> There is no such thing as "kernel FP state" that is saved.
>

I think that David was asking whether we could make kernel_fpu_begin()
regions sometimes be preemptible.  The answer is presumably yes, but I
think that should be a separate effort, and it should be justified
with improved performance above and beyond what we get with Jason's
simd_get() stuff.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 00/27] x86: load FPU registers on return to userland
  2019-04-04 14:26     ` Andy Lutomirski
@ 2019-04-04 14:31       ` Sebastian Andrzej Siewior
  2019-04-04 15:10       ` David Laight
  1 sibling, 0 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-04 14:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: David Laight, linux-kernel, x86, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-04 07:26:37 [-0700], Andy Lutomirski wrote:
> I think that David was asking whether we could make kernel_fpu_begin()
> regions sometimes be preemptible.  The answer is presumably yes, but I
> think that should be a separate effort, and it should be justified
> with improved performance above and beyond what we get with Jason's
> simd_get() stuff.

Last time I checked we had no very long SIMD loops unless you count $ALG
over 1 MiB of data or so. And in that case you could use the
kernel_fpu_begin()/end() combo after a complete SIMD loop. I would like
to see that at least while going from one page to the next.
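
The pattern in question, roughly as used by several crypto drivers (a
condensed sketch; simd_process() stands in for the actual SIMD routine):

	kernel_fpu_begin();
	while (len) {
		unsigned int chunk = min_t(size_t, len, PAGE_SIZE);

		simd_process(dst, src, chunk);
		src += chunk;
		dst += chunk;
		len -= chunk;

		if (len) {
			/* give preemption a chance between pages */
			kernel_fpu_end();
			kernel_fpu_begin();
		}
	}
	kernel_fpu_end();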

Sebastian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* RE: [PATCH v9 00/27] x86: load FPU registers on return to userland
  2019-04-04 14:26     ` Andy Lutomirski
  2019-04-04 14:31       ` Sebastian Andrzej Siewior
@ 2019-04-04 15:10       ` David Laight
  1 sibling, 0 replies; 83+ messages in thread
From: David Laight @ 2019-04-04 15:10 UTC (permalink / raw)
  To: 'Andy Lutomirski', Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

From: Sebastian Andrzej Siewior
> From: Andy Lutomirski [mailto:luto@kernel.org]
> Sent: 04 April 2019 15:27
> 
> On Thu, Apr 4, 2019 at 7:14 AM Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> >
> > On 2019-04-04 14:01:43 [+0000], David Laight wrote:
> > > From: Sebastian Andrzej Siewior
> > > > Sent: 03 April 2019 17:41
> > > ...
> > > > To access the FPU registers in kernel we need:
> > > > - disable preemption to avoid that the scheduler switches tasks. By
> > > >   doing so it would set TIF_NEED_FPU_LOAD and the FPU registers would be
> > > >   not valid.
> > > > - disable BH because the softirq might use kernel_fpu_begin() and then
> > > >   set TIF_NEED_FPU_LOAD instead loading the FPU registers on completion.
> > >
> > > Is there a possible optimisation here for kernel threads?
> > > Since there is no 'user FP state' the 'kernel FP state' can
> > > be saved by a task switch or softirq.
> >
> > There is no such thing as "kernel FP state" that is saved.
> >
> 
> I think that David was asking whether we could make kernel_fpu_begin()
> regions sometimes be preemptible.  The answer is presumably yes, but I
> think that should be a separate effort, and it should be justified
> with improved performance above and beyond what we get with Jason's
> simd_get() stuff.

Yep...

- Actually there are some loops that process more or less arbitrary
- amounts of data. But all of those somewhat manually break that up
- into page size chunks before checking if we've disabled preemption
- for too long, and then if so, do a end()begin() sequence before
- starting the next chunk. My simd_relax takes care of that, for
- example. Given that a long term purpose of this patchset is to
- obsolete the simd_get/put/relax API I've proposed, it seems like
- it might be nice to also do away with the manual relaxation
- requirement, if that's somehow possible.

The advantage of a 'relax' (or kernel_fpu_end()/begin() pair)
is that the kernel registers never need saving.
OTOH allowing arbitrary context switches is (probably) better
for latency. 

As well as kernel threads being able to use the task's 'normal' fpu
save area, it ought to be possible to pass an 'fpu save area'
to kernel_fpu_begin() so that a non-kernel thread can be pre-empted
while using the fpu.

Actually you could (probably) just pass the address of a location
where the address of a kmalloc()ed save area would be written, were
one needed, and allocate the structure lazily the first time it is
needed (although kmalloc() in that code path might be hard).
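
A hypothetical shape of such an API (every name below is invented for
illustration; nothing like this exists in the tree):

	/* caller-provided slot, allocated lazily on first preemption */
	struct fpregs_state *kfpu_save = NULL;

	kernel_fpu_begin_caller_save(&kfpu_save);
	/* ... preemptible SIMD work; a context switch would save the
	 * kernel FPU registers into *kfpu_save ... */
	kernel_fpu_end_caller_save(&kfpu_save);
	kfree(kfpu_save);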

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig()
  2019-04-03 16:41 ` [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig() Sebastian Andrzej Siewior
@ 2019-04-08 17:05   ` Thomas Gleixner
  2019-04-08 20:02     ` Sebastian Andrzej Siewior
  2019-04-12 17:17   ` Borislav Petkov
  2019-04-13 21:02   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2 siblings, 1 reply; 83+ messages in thread
From: Thomas Gleixner @ 2019-04-08 17:05 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Wed, 3 Apr 2019, Sebastian Andrzej Siewior wrote:

> The previous commits refactor the restoration of the FPU registers so
> that they can be loaded from in-kernel memory. This overhead can be
> avoided if the load can be performed without a pagefault.
> 
> Attempt to restore the FPU registers directly from userland by invoking
> copy_user_to_fpregs_zeroing(). If that fails, try the slowpath which can
> handle pagefaults.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/x86/kernel/fpu/signal.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> index a5b086ec426a5..f20e1d1fffa29 100644
> --- a/arch/x86/kernel/fpu/signal.c
> +++ b/arch/x86/kernel/fpu/signal.c
> @@ -242,10 +242,10 @@ sanitize_restored_xstate(union fpregs_state *state,
>  /*
>   * Restore the extended state if present. Otherwise, restore the FP/SSE state.
>   */
> -static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
> +static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
>  {
>  	if (use_xsave()) {
> -		if ((unsigned long)buf % 64 || fx_only) {
> +		if (fx_only) {

This change is weird and not mentioned in the changelog....

>  			u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
>  			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
>  			return copy_user_to_fxregs(buf);
> @@ -327,7 +327,19 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
>  		if (ret)
>  			goto err_out;
>  		envp = &env;
> +	} else {
> +		fpregs_lock();
> +		pagefault_disable();
> +		ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
> +		pagefault_enable();
> +		if (!ret) {
> +			fpregs_mark_activate();
> +			fpregs_unlock();
> +			return 0;
> +		}
> +		fpregs_unlock();
>  	}
> +
>  	if (use_xsave() && !fx_only) {
>  		u64 init_bv = xfeatures_mask & ~xfeatures;
>  
> -- 
> 2.20.1
> 
> 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH v9 00/27] x86: load FPU registers on return to userland
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (27 preceding siblings ...)
  2019-04-04 14:01 ` [PATCH v9 00/27] x86: load FPU registers on return to userland David Laight
@ 2019-04-08 17:08 ` Thomas Gleixner
  2019-04-12 18:30 ` Borislav Petkov
  29 siblings, 0 replies; 83+ messages in thread
From: Thomas Gleixner @ 2019-04-08 17:08 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Wed, 3 Apr 2019, Sebastian Andrzej Siewior wrote:

> This is a refurbished series originally started by by Rik van Riel. The
> goal is load the FPU registers on return to userland and not on every
> context switch. By this optimisation we can:
> - avoid loading the registers if the task stays in kernel and does
>   not return to userland
> - make kernel_fpu_begin() cheaper: it only saves the registers on the
>   first invocation. The second invocation does not need save them again.
> 
> To access the FPU registers in kernel we need:
> - disable preemption to avoid that the scheduler switches tasks. By
>   doing so it would set TIF_NEED_FPU_LOAD and the FPU registers would be
>   not valid.
> - disable BH because the softirq might use kernel_fpu_begin() and then
>   set TIF_NEED_FPU_LOAD instead loading the FPU registers on completion.

So aside of that one hunk in 24/27 which is either wrong or needs some
information in the changelog I couldn't find anything disturbing.

With that addressed:

     Reviewed-by: Thomas Gleixner <tglx@linutronix.de>


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 19/27] x86/fpu: Update xstate's PKRU value on write_pkru()
  2019-04-03 16:41 ` [PATCH 19/27] x86/fpu: Update xstate's PKRU value on write_pkru() Sebastian Andrzej Siewior
@ 2019-04-08 18:14   ` Dave Hansen
  2019-04-08 20:03     ` Sebastian Andrzej Siewior
  2019-04-13 20:59   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  1 sibling, 1 reply; 83+ messages in thread
From: Dave Hansen @ 2019-04-08 18:14 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, linux-kernel
  Cc: x86, Andy Lutomirski, Paolo Bonzini, Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 4/3/19 9:41 AM, Sebastian Andrzej Siewior wrote:
> During the context switch the xstate is loaded which also includes the
> PKRU value.
> If xstate is restored on return to userland it is required that the
> PKRU value in xstate is the same as the one in the CPU.

All of the protection keys bits in here look ok to me.  Thanks for the
sustained work to get all those into shape.  I know it's a special
snowflake.

Although I paid much closer attention to the PK bits, feel free to add:

Reviewed-by: Dave Hansen <dave.hansen@intel.com>

for the series.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig()
  2019-04-08 17:05   ` Thomas Gleixner
@ 2019-04-08 20:02     ` Sebastian Andrzej Siewior
  2019-04-09  7:27       ` Thomas Gleixner
  0 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-08 20:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-08 19:05:56 [+0200], Thomas Gleixner wrote:
> > diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> > index a5b086ec426a5..f20e1d1fffa29 100644
> > --- a/arch/x86/kernel/fpu/signal.c
> > +++ b/arch/x86/kernel/fpu/signal.c
> > @@ -242,10 +242,10 @@ sanitize_restored_xstate(union fpregs_state *state,
> >  /*
> >   * Restore the extended state if present. Otherwise, restore the FP/SSE state.
> >   */
> > -static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
> > +static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
> >  {
> >  	if (use_xsave()) {
> > -		if ((unsigned long)buf % 64 || fx_only) {
> > +		if (fx_only) {
> 
> This change is weird and not mentioned in the changelog....

if you scroll up there is this:
|          * to be loaded again on return to userland (overriding last_cpu avoids
|          * the optimisation).
|          */
|         set_thread_flag(TIF_NEED_FPU_LOAD);
|         __fpu_invalidate_fpregs_state(fpu);
| 
|         if ((unsigned long)buf_fx % 64)
|                 fx_only = 1;
…
|                 ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
|                 pagefault_enable();


so I just removed that part here, because the alignment check is already
done earlier. Is it still weird, and should it be mentioned in the changelog?

Sebastian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 19/27] x86/fpu: Update xstate's PKRU value on write_pkru()
  2019-04-08 18:14   ` Dave Hansen
@ 2019-04-08 20:03     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-08 20:03 UTC (permalink / raw)
  To: Dave Hansen
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-08 11:14:28 [-0700], Dave Hansen wrote:
> On 4/3/19 9:41 AM, Sebastian Andrzej Siewior wrote:
> > During the context switch the xstate is loaded which also includes the
> > PKRU value.
> > If xstate is restored on return to userland it is required that the
> > PKRU value in xstate is the same as the one in the CPU.
> 
> All of the protection keys bits in here look ok to me.  Thanks for the
> sustained work to get all those into shape.  I know it's a special
> snowflake.

> Although I paid much closer attention to the PK bits, feel free to add:
> 
> Reviewed-by: Dave Hansen <dave.hansen@intel.com>
> 
> for the series.

Thank you.

Sebastian

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig()
  2019-04-08 20:02     ` Sebastian Andrzej Siewior
@ 2019-04-09  7:27       ` Thomas Gleixner
  0 siblings, 0 replies; 83+ messages in thread
From: Thomas Gleixner @ 2019-04-09  7:27 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen


On Mon, 8 Apr 2019, Sebastian Andrzej Siewior wrote:

> On 2019-04-08 19:05:56 [+0200], Thomas Gleixner wrote:
> > > diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> > > index a5b086ec426a5..f20e1d1fffa29 100644
> > > --- a/arch/x86/kernel/fpu/signal.c
> > > +++ b/arch/x86/kernel/fpu/signal.c
> > > @@ -242,10 +242,10 @@ sanitize_restored_xstate(union fpregs_state *state,
> > >  /*
> > >   * Restore the extended state if present. Otherwise, restore the FP/SSE state.
> > >   */
> > > -static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
> > > +static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
> > >  {
> > >  	if (use_xsave()) {
> > > -		if ((unsigned long)buf % 64 || fx_only) {
> > > +		if (fx_only) {
> > 
> > This change is weird and not mentioned in the changelog....
> 
> if you scroll up there is this:
> |          * to be loaded again on return to userland (overriding last_cpu avoids
> |          * the optimisation).
> |          */
> |         set_thread_flag(TIF_NEED_FPU_LOAD);
> |         __fpu_invalidate_fpregs_state(fpu);
> | 
> |         if ((unsigned long)buf_fx % 64)
> |                 fx_only = 1;
> …
> |                 ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
> |                 pagefault_enable();
> 
> 
> so I just removed that part because it was already done earlier.
> Is it still weird and should be mentioned in the changelog?

Bah. Staring at this for too long. All good ...

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions
  2019-04-03 16:41 ` [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions Sebastian Andrzej Siewior
@ 2019-04-10 16:36   ` Borislav Petkov
  2019-04-10 16:52     ` Borislav Petkov
  2019-04-13 20:54   ` [tip:x86/fpu] x86/pkeys: Provide *pkru() helpers tip-bot for Sebastian Andrzej Siewior
  1 sibling, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2019-04-10 16:36 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen, Dave Hansen

On Wed, Apr 03, 2019 at 06:41:41PM +0200, Sebastian Andrzej Siewior wrote:
> Dave Hansen has asked for __read_pkru() and __write_pkru() to be symmetrical.
> As part of the series __write_pkru() will read back the value and only write it
> if it is different.
> In order to make both functions symmetrical, move the function containing only
> the opcode into a function with an _ins() suffix. __write_pkru() will just invoke
> __write_pkru_ins() but in a follow-up patch will also read back the value.
> 
> Suggested-by: Dave Hansen <dave.hansen@intel.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/x86/include/asm/pgtable.h       |  2 +-
>  arch/x86/include/asm/special_insns.h | 12 +++++++++---
>  arch/x86/kvm/vmx/vmx.c               |  2 +-
>  3 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 2779ace16d23f..64333e5222cd9 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -127,7 +127,7 @@ static inline int pte_dirty(pte_t pte)
>  static inline u32 read_pkru(void)
>  {
>  	if (boot_cpu_has(X86_FEATURE_OSPKE))
> -		return __read_pkru();
> +		return __read_pkru_ins();
>  	return 0;
>  }
>  
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index 43c029cdc3fe8..27328606ff687 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -92,7 +92,7 @@ static inline void native_write_cr8(unsigned long val)
>  #endif
>  
>  #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
> -static inline u32 __read_pkru(void)
> +static inline u32 __read_pkru_ins(void)
>  {
>  	u32 ecx = 0;
>  	u32 edx, pkru;
> @@ -107,7 +107,7 @@ static inline u32 __read_pkru(void)
>  	return pkru;
>  }
>  
> -static inline void __write_pkru(u32 pkru)
> +static inline void __write_pkru_ins(u32 pkru)
>  {
>  	u32 ecx = 0, edx = 0;
> 

Well, this is going in the wrong direction. The proper thing to do would
be to have:

rdpkru()
wrpkru()

which only do the inline asm with the respective opcodes. Just like
rdtsc(), clflush(), etc.

IOW, the functions which correspond to the instructions should be called
just like the respective instructions they implement in asm.

Then, all the helpers around them should have different names to denote
what they do. The current rdpkru() which does the assertion checking,
for example, should be renamed to something like read_pkey_user() or so.
Ditto for the current wrpkru().
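
A minimal sketch of that split (read_pkey_user() is a made-up name
here, not an existing helper):

/* Raw insn wrapper, named after the instruction, like rdtsc(): */
static inline u32 rdpkru(void)
{
	u32 ecx = 0;
	u32 edx, pkru;

	/* RDPKRU: reads PKRU into EAX; requires ECX=0, clears EDX. */
	asm volatile(".byte 0x0f,0x01,0xee\n\t"
		     : "=a" (pkru), "=d" (edx)
		     : "c" (ecx));
	return pkru;
}

/* Checked helper with a descriptive name, built on the raw wrapper: */
static inline u32 read_pkey_user(void)
{
	if (!boot_cpu_has(X86_FEATURE_OSPKE))
		return 0;
	return rdpkru();
}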

Makes sense?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions
  2019-04-10 16:36   ` Borislav Petkov
@ 2019-04-10 16:52     ` Borislav Petkov
  2019-04-10 21:25       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2019-04-10 16:52 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Dave Hansen
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Wed, Apr 10, 2019 at 06:36:15PM +0200, Borislav Petkov wrote:
> Well, this is going in the wrong direction. The proper thing to do would
> be to have:
> 
> rdpkru()
> wrpkru()
> 
> which only do the inline asm with the respective opcodes. Just like
> rdtsc(), clflush(), etc.

IOW, like this:

---
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2779ace16d23..e2948ce1376c 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -127,14 +127,14 @@ static inline int pte_dirty(pte_t pte)
 static inline u32 read_pkru(void)
 {
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		return __read_pkru();
+		return rdpkru();
 	return 0;
 }
 
 static inline void write_pkru(u32 pkru)
 {
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		__write_pkru(pkru);
+		wrpkru(pkru);
 }
 
 static inline int pte_young(pte_t pte)
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 43c029cdc3fe..396ff572d66a 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -92,7 +92,7 @@ static inline void native_write_cr8(unsigned long val)
 #endif
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-static inline u32 __read_pkru(void)
+static inline u32 rdpkru(void)
 {
 	u32 ecx = 0;
 	u32 edx, pkru;
@@ -107,7 +107,7 @@ static inline u32 __read_pkru(void)
 	return pkru;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void wrpkru(u32 pkru)
 {
 	u32 ecx = 0, edx = 0;
 
@@ -119,12 +119,12 @@ static inline void __write_pkru(u32 pkru)
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
 #else
-static inline u32 __read_pkru(void)
+static inline u32 rdpkru(void)
 {
 	return 0;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void wrpkru(u32 pkru)
 {
 }
 #endif
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ab432a930ae8..85260aa2a920 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6413,7 +6413,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	if (static_cpu_has(X86_FEATURE_PKU) &&
 	    kvm_read_cr4_bits(vcpu, X86_CR4_PKE) &&
 	    vcpu->arch.pkru != vmx->host_pkru)
-		__write_pkru(vcpu->arch.pkru);
+		wrpkru(vcpu->arch.pkru);
 
 	pt_guest_enter(vmx);
 
@@ -6501,9 +6501,9 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	if (static_cpu_has(X86_FEATURE_PKU) &&
 	    kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) {
-		vcpu->arch.pkru = __read_pkru();
+		vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vmx->host_pkru)
-			__write_pkru(vmx->host_pkru);
+			wrpkru(vmx->host_pkru);
 	}
 
 	vmx->nested.nested_run_pending = 0;

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions
  2019-04-10 16:52     ` Borislav Petkov
@ 2019-04-10 21:25       ` Sebastian Andrzej Siewior
  2019-04-10 21:29         ` Dave Hansen
  0 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-10 21:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dave Hansen, linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-10 18:52:41 [+0200], Borislav Petkov wrote:
> On Wed, Apr 10, 2019 at 06:36:15PM +0200, Borislav Petkov wrote:
> > Well, this is going in the wrong direction. The proper thing to do would
> > be to have:
> > 
> > rdpkru()
> > wrpkru()
> > 
> > which only do the inline asm with the respective opcodes. Just like
> > rdtsc(), clflush(), etc.
> 
> IOW, like this:
> 
> ---
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 2779ace16d23..e2948ce1376c 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -127,14 +127,14 @@ static inline int pte_dirty(pte_t pte)
>  static inline u32 read_pkru(void)
>  {
>  	if (boot_cpu_has(X86_FEATURE_OSPKE))
> -		return __read_pkru();
> +		return rdpkru();
>  	return 0;
>  }
>  
>  static inline void write_pkru(u32 pkru)
>  {
>  	if (boot_cpu_has(X86_FEATURE_OSPKE))
> -		__write_pkru(pkru);
> +		wrpkru(pkru);

I think if this is a simple

   's@__write_pkru_ins@wrpkru@g'
   's@__read_pkru_ins@rdpkru@g'

then it should work just fine and match what Dave asked for.

>  }
>  
>  static inline int pte_young(pte_t pte)

Sebastian


* Re: [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions
  2019-04-10 21:25       ` Sebastian Andrzej Siewior
@ 2019-04-10 21:29         ` Dave Hansen
  2019-04-11 13:24           ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Dave Hansen @ 2019-04-10 21:29 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Borislav Petkov
  Cc: Dave Hansen, linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel

On 4/10/19 2:25 PM, Sebastian Andrzej Siewior wrote:
>>  static inline void write_pkru(u32 pkru)
>>  {
>>  	if (boot_cpu_has(X86_FEATURE_OSPKE))
>> -		__write_pkru(pkru);
>> +		wrpkru(pkru);
> I think if this is a simple
> 
>    's@__write_pkru_ins@wrpkru@g'
>    's@__read_pkru_ins@rdpkru@g'
> 
> then it should work just fine and match what Dave asked for.

I'm fine with it either way, fwiw.



* Re: [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions
  2019-04-10 21:29         ` Dave Hansen
@ 2019-04-11 13:24           ` Borislav Petkov
  0 siblings, 0 replies; 83+ messages in thread
From: Borislav Petkov @ 2019-04-11 13:24 UTC (permalink / raw)
  To: Dave Hansen, Sebastian Andrzej Siewior
  Cc: Dave Hansen, linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel

On Wed, Apr 10, 2019 at 02:29:26PM -0700, Dave Hansen wrote:
> On 4/10/19 2:25 PM, Sebastian Andrzej Siewior wrote:
> >>  static inline void write_pkru(u32 pkru)
> >>  {
> >>  	if (boot_cpu_has(X86_FEATURE_OSPKE))
> >> -		__write_pkru(pkru);
> >> +		wrpkru(pkru);
> > I think if this is a simple
> > 
> >    's@__write_pkru_ins@wrpkru@g'
> >    's@__read_pkru_ins@rdpkru@g'
> > 
> > then it should work just fine and match what Dave asked for.
> 
> I'm fine with it either way, fwiw.

Final version:

---
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Wed, 3 Apr 2019 18:41:41 +0200
Subject: [PATCH] x86/pkru: Provide *pkru() helpers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Dave Hansen asked for __read_pkru() and __write_pkru() to be
symmetrical.

As part of the series __write_pkru() will read back the value and only
write it if it is different.

In order to make both functions symmetrical, move the function
containing only the opcode asm into a function named after the
instruction itself.

__write_pkru() will just invoke wrpkru() but in a follow-up patch will
also read back the value.

 [ bp: Convert asm opcode wrapper names to rd/wrpkru(). ]

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-13-bigeasy@linutronix.de
---
 arch/x86/include/asm/pgtable.h       |  2 +-
 arch/x86/include/asm/special_insns.h | 12 +++++++++---
 arch/x86/kvm/vmx/vmx.c               |  2 +-
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2779ace16d23..e8875ca75623 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -127,7 +127,7 @@ static inline int pte_dirty(pte_t pte)
 static inline u32 read_pkru(void)
 {
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		return __read_pkru();
+		return rdpkru();
 	return 0;
 }
 
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 43c029cdc3fe..34897e2b52c9 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -92,7 +92,7 @@ static inline void native_write_cr8(unsigned long val)
 #endif
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-static inline u32 __read_pkru(void)
+static inline u32 rdpkru(void)
 {
 	u32 ecx = 0;
 	u32 edx, pkru;
@@ -107,7 +107,7 @@ static inline u32 __read_pkru(void)
 	return pkru;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void wrpkru(u32 pkru)
 {
 	u32 ecx = 0, edx = 0;
 
@@ -118,8 +118,14 @@ static inline void __write_pkru(u32 pkru)
 	asm volatile(".byte 0x0f,0x01,0xef\n\t"
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
+
+static inline void __write_pkru(u32 pkru)
+{
+	wrpkru(pkru);
+}
+
 #else
-static inline u32 __read_pkru(void)
+static inline u32 rdpkru(void)
 {
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ab432a930ae8..dd1e1eea4f0b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6501,7 +6501,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	if (static_cpu_has(X86_FEATURE_PKU) &&
 	    kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) {
-		vcpu->arch.pkru = __read_pkru();
+		vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vmx->host_pkru)
 			__write_pkru(vmx->host_pkru);
 	}
-- 
2.21.0


-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-03 16:41 ` [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace Sebastian Andrzej Siewior
@ 2019-04-12 14:36   ` Borislav Petkov
  2019-04-12 15:24     ` Sebastian Andrzej Siewior
  2019-04-13 21:02   ` [tip:x86/fpu] " tip-bot for Rik van Riel
  1 sibling, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2019-04-12 14:36 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Wed, Apr 03, 2019 at 06:41:52PM +0200, Sebastian Andrzej Siewior wrote:
> @@ -226,10 +236,9 @@ static void fpu__initialize(struct fpu *fpu)
>  {
>  	WARN_ON_FPU(fpu != &current->thread.fpu);
>  
> +	set_thread_flag(TIF_NEED_FPU_LOAD);
>  	fpstate_init(&fpu->state);
>  	trace_x86_fpu_init_state(fpu);
> -
> -	trace_x86_fpu_activate_state(fpu);

That is called nowhere after this patch.

Shouldn't it be called below, before fpregs_activate() because
fpregs_activate() does trace_x86_fpu_regs_activated()?

>  /*
> @@ -308,6 +317,8 @@ void fpu__drop(struct fpu *fpu)
>   */
>  static inline void copy_init_fpstate_to_fpregs(void)
>  {
> +	fpregs_lock();
> +
>  	if (use_xsave())
>  		copy_kernel_to_xregs(&init_fpstate.xsave, -1);
>  	else if (static_cpu_has(X86_FEATURE_FXSR))
> @@ -317,6 +328,9 @@ static inline void copy_init_fpstate_to_fpregs(void)
>  
>  	if (boot_cpu_has(X86_FEATURE_OSPKE))
>  		copy_init_pkru_to_fpregs();
> +
> +	fpregs_mark_activate();
> +	fpregs_unlock();
>  }
>  
>  /*
> @@ -339,6 +353,45 @@ void fpu__clear(struct fpu *fpu)
>  		copy_init_fpstate_to_fpregs();
>  }
>  
> +/*
> + * Load FPU context before returning to userspace.
> + */
> +void switch_fpu_return(void)
> +{
> +	if (!static_cpu_has(X86_FEATURE_FPU))
> +		return;
> +
> +	__fpregs_load_activate();
> +}
> +EXPORT_SYMBOL_GPL(switch_fpu_return);
> +
> +#ifdef CONFIG_X86_DEBUG_FPU
> +/*
> + * If current FPU state according to its tracking (loaded FPU ctx on this CPU)
> + * is not valid then we must have TIF_NEED_FPU_LOAD set so the context is loaded on
> + * return to userland.
> + */
> +void fpregs_assert_state_consistent(void)
> +{
> +       struct fpu *fpu = &current->thread.fpu;
> +
> +       if (test_thread_flag(TIF_NEED_FPU_LOAD))
> +               return;
> +       WARN_ON_FPU(!fpregs_state_valid(fpu, smp_processor_id()));
> +}
> +EXPORT_SYMBOL_GPL(fpregs_assert_state_consistent);
> +#endif
> +
> +void fpregs_mark_activate(void)
> +{
> +	struct fpu *fpu = &current->thread.fpu;
> +

<--- here?

> +	fpregs_activate(fpu);
> +	fpu->last_cpu = smp_processor_id();
> +	clear_thread_flag(TIF_NEED_FPU_LOAD);
> +}
> +EXPORT_SYMBOL_GPL(fpregs_mark_activate);

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-12 14:36   ` Borislav Petkov
@ 2019-04-12 15:24     ` Sebastian Andrzej Siewior
  2019-04-12 16:22       ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-12 15:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-12 16:36:15 [+0200], Borislav Petkov wrote:
> On Wed, Apr 03, 2019 at 06:41:52PM +0200, Sebastian Andrzej Siewior wrote:
> > @@ -226,10 +236,9 @@ static void fpu__initialize(struct fpu *fpu)
> >  {
> >  	WARN_ON_FPU(fpu != &current->thread.fpu);
> >  
> > +	set_thread_flag(TIF_NEED_FPU_LOAD);
> >  	fpstate_init(&fpu->state);
> >  	trace_x86_fpu_init_state(fpu);
> > -
> > -	trace_x86_fpu_activate_state(fpu);
> 
> That is called nowhere after this patch.

Isn't it called from fpu__clear()?

> Shouldn't it be called below, before fpregs_activate() because
> fpregs_activate() does trace_x86_fpu_regs_activated()?

Why? fpu__initialize() wipes the FPU state and starts from zero.
fpregs_mark_activate() on the other hand marks that this FPU context is
currently active.

Sebastian


* Re: [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-12 15:24     ` Sebastian Andrzej Siewior
@ 2019-04-12 16:22       ` Borislav Petkov
  2019-04-12 16:37         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2019-04-12 16:22 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Fri, Apr 12, 2019 at 05:24:37PM +0200, Sebastian Andrzej Siewior wrote:
> Isn't it called from fpu__clear()?

$ git grep trace_x86_fpu_activate_state
$

all 23 patches applied. Grepping the later patches doesn't give
trace_x86_fpu_activate_state() either.

> > Shouldn't it be called below, before fpregs_activate() because
> > fpregs_activate() does trace_x86_fpu_regs_activated()?
> 
> Why? fpu__initialize() wipes the FPU state and starts from zero.
> fpregs_mark_activate() on the other hand marks this FPU context is
> currently active.

Well, then the naming still needs adjusting.

"trace_x86_fpu_activate_state" reads to me like the state is being
activated here, at the call site. And fpregs_mark_activate() marks which
*fpu is the active one.

Hmm.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-12 16:22       ` Borislav Petkov
@ 2019-04-12 16:37         ` Sebastian Andrzej Siewior
  2019-04-12 16:48           ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-12 16:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-12 18:22:13 [+0200], Borislav Petkov wrote:
> On Fri, Apr 12, 2019 at 05:24:37PM +0200, Sebastian Andrzej Siewior wrote:
> > Isn't it called from fpu__clear()?
> 
> $ git grep trace_x86_fpu_activate_state
> $
> 
> all 23 patches applied. Grepping the later patches doesn't give
> trace_x86_fpu_activate_state() either.
> 
> > > Shouldn't it be called below, before fpregs_activate() because
> > > fpregs_activate() does trace_x86_fpu_regs_activated()?
> > 
> > Why? fpu__initialize() wipes the FPU state and starts from zero.
> > fpregs_mark_activate() on the other hand marks this FPU context is
> > currently active.
> 
> Well, then the naming still needs adjusting.
> 
> "trace_x86_fpu_activate_state" reads to me like the state is being
> activated here, at the call site. And fpregs_mark_activate() marks which
> *fpu is the active one.

bah. You are referring to trace_x86_fpu_activate_state. I parsed this as
fpu__initialize(). Sorry for that.

trace_x86_fpu_activate_state is unused and we should do something about
it.  Adding it to fpregs_mark_activate() seems to make sense.
We also have this:
 fpregs_mark_activate()
   fpregs_activate()
      trace_x86_fpu_regs_activated()

(as you mentioned) so we would always record both trace points.
Therefore I would suggest removing it.
Maybe we could add a new one to __fpregs_load_activate() for the case where
we avoid loading the registers because of fpregs_state_valid(). This might
make sense.

Sebastian


* Re: [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-12 16:37         ` Sebastian Andrzej Siewior
@ 2019-04-12 16:48           ` Borislav Petkov
  2019-04-12 17:19             ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2019-04-12 16:48 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Fri, Apr 12, 2019 at 06:37:41PM +0200, Sebastian Andrzej Siewior wrote:
> (as you mentioned) so we would always record both trace points.
> Therefore I would suggest to remove it.

Remove which one?

Recording both TPs seems to make sense, unless there's not a whole
lotta sense in having:

fpregs_mark_activate
|-> trace_x86_fpu_activate_state		<-- TP1
|-> fpregs_activate
    |-> trace_x86_fpu_regs_activated		<-- TP2


Yeah, looks like the two are too much and too close for no good
reason. There's nothing particularly spectacular in-between in
fpregs_activate().

> Maybe we could add a new one to __fpregs_load_activate() one in case we
> avoid loading registers because of fpregs_state_valid(). This might make
> sense.

But that's only the switch_fpu_return() path. Is fpregs_mark_activate()
going to use only the trace_x86_fpu_regs_activated() one? Note the
"d" at the end.

  [ Btw, those two names need adjusting too: who came up with such
    close, confusing names?! ]

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig()
  2019-04-03 16:41 ` [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig() Sebastian Andrzej Siewior
  2019-04-08 17:05   ` Thomas Gleixner
@ 2019-04-12 17:17   ` Borislav Petkov
  2019-04-12 17:27     ` Sebastian Andrzej Siewior
  2019-04-13 21:02   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
  2 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2019-04-12 17:17 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Wed, Apr 03, 2019 at 06:41:53PM +0200, Sebastian Andrzej Siewior wrote:
> The previous commits refactor the restoration of the FPU registers so
> that they can be loaded from in-kernel memory. This overhead can be
> avoided if the load can be performed without a pagefault.
> 
> Attempt to restore FPU registers by invoking
> copy_user_to_fpregs_zeroing(). If it fails try the slowpath which can handle
> pagefaults.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  arch/x86/kernel/fpu/signal.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> index a5b086ec426a5..f20e1d1fffa29 100644
> --- a/arch/x86/kernel/fpu/signal.c
> +++ b/arch/x86/kernel/fpu/signal.c
> @@ -242,10 +242,10 @@ sanitize_restored_xstate(union fpregs_state *state,
>  /*
>   * Restore the extended state if present. Otherwise, restore the FP/SSE state.
>   */
> -static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
> +static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
>  {
>  	if (use_xsave()) {
> -		if ((unsigned long)buf % 64 || fx_only) {
> +		if (fx_only) {
>  			u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
>  			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
>  			return copy_user_to_fxregs(buf);
> @@ -327,7 +327,19 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
>  		if (ret)
>  			goto err_out;
>  		envp = &env;
> +	} else {

I've added here:

+               /*
+                * Attempt to restore the FPU registers directly from user
+                * memory. For that to succeed, the user accesses cannot cause
+                * page faults. If they do, fall back to the slow path below,
+                * going through the kernel buffer.
+                */

so that it is clear what's happening.

This function is doing a gazillion things again ;-\
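
For orientation, the resulting flow in __fpu__restore_sig() is roughly
(a sketch combining the quoted hunks and the comment above, not the
exact committed code):

	} else {
		/*
		 * Fastpath: load the registers directly from the user
		 * buffer with page faults disabled. A fault fails the
		 * copy and we fall back to the slowpath below, which
		 * goes through the kernel buffer with faults allowed.
		 */
		fpregs_lock();
		pagefault_disable();
		ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
		pagefault_enable();
		if (!ret) {
			fpregs_mark_activate();
			fpregs_unlock();
			return 0;
		}
		fpregs_unlock();
	}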

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-12 16:48           ` Borislav Petkov
@ 2019-04-12 17:19             ` Sebastian Andrzej Siewior
  2019-04-12 17:29               ` Borislav Petkov
  0 siblings, 1 reply; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-12 17:19 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-12 18:48:28 [+0200], Borislav Petkov wrote:
> On Fri, Apr 12, 2019 at 06:37:41PM +0200, Sebastian Andrzej Siewior wrote:
> > (as you mentioned) so we would always record both trace points.
> > Therefore I would suggest to remove it.
> 
> Remove which one?

remove x86_fpu_activate_state.

> Recording both TPs seems to make sense unless it doesn't make a whole
> lotta sense to have:
> 
> fpregs_mark_activate
> |-> trace_x86_fpu_activate_state		<-- TP1
> |-> fpregs_activate
>     |-> trace_x86_fpu_regs_activated		<-- TP2

> Yeah, looks like the two are too much and too close for no good
> reason. There's nothing particularly spectacular in-between in
> fpregs_activate().

correct.

> > Maybe we could add a new one to __fpregs_load_activate() one in case we
> > avoid loading registers because of fpregs_state_valid(). This might make
> > sense.
> 
> But that's only the switch_fpu_return() path. Is fpregs_mark_activate()
> is going to use only the trace_x86_fpu_regs_activated() one? Note the
> "d" at the end.

You could have a tracepoint for the case where we return to userland and
don't load the FPU registers because they appear to be valid.
It was just an idea.

We could also rip all tracepoints out, rethink the situation and add new
ones based on current code.
- on schedule out, we may need to save registers (depending on
  TIF_NEED_FPU_LOAD) which is new. Before the series we always did.

- on schedule in do nothing but set that TIF bit. This is probably
  boring.

- on return to userland we should load the registers but can avoid it if
  we assume that they are valid for the current task
  (fpregs_state_valid()); see the tracepoint sketch after this list.

- in kernel_fpu_begin() we may trash the task's FPU state (by saving its
  registers or by resetting fpu_fpregs_owner_ctx).

Those might be interesting.
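
As a concrete sketch of the "avoided load" item above (all names are
invented for illustration, nothing like this exists in the series):

TRACE_EVENT(x86_fpu_regs_still_valid,
	TP_PROTO(struct fpu *fpu),
	TP_ARGS(fpu),

	TP_STRUCT__entry(
		__field(struct fpu *,	fpu)
		__field(unsigned int,	last_cpu)
	),

	TP_fast_assign(
		__entry->fpu	  = fpu;
		__entry->last_cpu = fpu->last_cpu;
	),

	TP_printk("x86/fpu: %p last_cpu: %u",
		  __entry->fpu, __entry->last_cpu)
);

It would fire in __fpregs_load_activate() right before the early return
taken when fpregs_state_valid() is true.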

Currently we have:
  "x86/fpu: %p load: %d xfeatures: %llx xcomp_bv: %llx"

and we have to find out what happens based on where that TP was
recorded. Also I'm not sure if the recorded xfeatures change over time.
I think they do not…

>   [ Btw, those two names need adjusting too: who came up with such close,
>      confusing names?!
>      ]

you mean trace_x86_fpu_activate_state and trace_x86_fpu_regs_activated?
They were added in d1898b733619b ("x86/fpu: Add tracepoints to dump FPU
state at key points") and we wouldn't have any otherwise.

Sebastian


* Re: [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig()
  2019-04-12 17:17   ` Borislav Petkov
@ 2019-04-12 17:27     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-12 17:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-12 19:17:59 [+0200], Borislav Petkov wrote:
> > @@ -327,7 +327,19 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
> >  		if (ret)
> >  			goto err_out;
> >  		envp = &env;
> > +	} else {
> 
> I've added here:

I would suggest adding it in the last patch, once we have the complete
fast path. That one extends the other fastpath (in case you want to add
a comment there, too).

> +               /*
/**

> +                * Attempt to restore the FPU registers directly from user
> +                * memory. For that to succeed, the user accesses cannot cause

I would say "user access" because it is always a single instruction.

> +                * page faults. If they do, fall back to the slow path below,
> +                * going through the kernel buffer.

"with the enabled page fault handler".

> +                */
> 
> so that it is clear what's happening.
> 
> This function is doing gazillion things again ;-\

It is the old code, just with the page fault handler disabled. But I
understand.

Sebastian


* Re: [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-12 17:19             ` Sebastian Andrzej Siewior
@ 2019-04-12 17:29               ` Borislav Petkov
  2019-04-15  9:14                 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2019-04-12 17:29 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Fri, Apr 12, 2019 at 07:19:55PM +0200, Sebastian Andrzej Siewior wrote:
> remove x86_fpu_activate_state.

Ok, zapping it as part of this patch.

> We could also rip all trcepoints out, rethink the situation and add new
> ones based on current code.

Well, since this changes FPU regs handling considerably, I think the
only correct step would be to adjust the tracepoints. BUT(!) we should
not forget we're not supposed to break luserspace.

> - on schedule out, we may need to save registers (depending on
>   TIF_NEED_FPU_LOAD) which is new. Before the series we always did.

That makes sense.

> - on schedule in do nothing but set that TIF bit. This is probably
>   boring.

Yah.

> - on return to userland we should load the registers but can avoid it if
>   we assume that they are valid for the current task
>   (fpregs_state_valid())

That is interesting info.

> - in kernel_fpu_begin() we may trash the task's FPU state (by saving its
>   registers or by resetting fpu_fpregs_owner_ctx).

Do we care?

You mean, in case you have workloads which might involve a lot of
in-kernel FPU use, which would punish task context switches?

> Those might be interesting.
> 
> Currently we have:
>   "x86/fpu: %p load: %d xfeatures: %llx xcomp_bv: %llx"
> 
> and we have to find out what happens based on where that TP was
> recorded. Also I'm not sure if the recorded xfeatures change over time.
> I think they do not…

Good question.

> you mean trace_x86_fpu_activate_state and trace_x86_fpu_regs_activated?
> They were added in d1898b733619b ("x86/fpu: Add tracepoints to dump FPU
> state at key points") and we wouldn't have any otherwise.

I guess it made sense then...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: [PATCH v9 00/27] x86: load FPU registers on return to userland
  2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
                   ` (28 preceding siblings ...)
  2019-04-08 17:08 ` Thomas Gleixner
@ 2019-04-12 18:30 ` Borislav Petkov
  2019-04-15  8:58   ` Sebastian Andrzej Siewior
  29 siblings, 1 reply; 83+ messages in thread
From: Borislav Petkov @ 2019-04-12 18:30 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On Wed, Apr 03, 2019 at 06:41:29PM +0200, Sebastian Andrzej Siewior wrote:
> v7…v8:
> Rebased on top of v5.1-rc1. And then:

Ok, all applied and pushed here:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=tip-x86-fpu

Pls have a quick look if all is still ok and not fat-fingered by me. :)

Will let the 0day bot chew on it and I'll run it through my boxes too.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* [tip:x86/fpu] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig()
  2019-04-03 16:41 ` [PATCH 01/27] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig() Sebastian Andrzej Siewior
@ 2019-04-13 20:46   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: riel, linux-kernel, hpa, jannh, dave.hansen, mingo, mingo, x86,
	tglx, pbonzini, bigeasy, Jason, luto, rkrcmar, kvm, bp

Commit-ID:  39ea9baffda91df8bfee9b45610242a3191ea1ec
Gitweb:     https://git.kernel.org/tip/39ea9baffda91df8bfee9b45610242a3191ea1ec
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:30 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Tue, 9 Apr 2019 19:27:29 +0200

x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig()

This is a preparation for the removal of the ->initialized member in the
fpu struct.

__fpu__restore_sig() is deactivating the FPU via fpu__drop() and then
manually setting ->initialized, followed by fpu__restore(). The result is
that it is possible to manipulate fpu->state and the state of registers
won't be saved/restored on a context switch which would overwrite
fpu->state:

fpu__drop(fpu):
  ...
  fpu->initialized = 0;
  preempt_enable();

  <--- context switch

Don't access the fpu->state while the content is read from user space
and examined/sanitized. Use a temporary kmalloc() buffer for the
preparation of the FPU registers and once the state is considered okay,
load it. Should something go wrong, return with an error and without
altering the original FPU registers.

The removal of fpu__initialize() is a nop because fpu->initialized is
already set for the user task.

 [ bp: Massage a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-2-bigeasy@linutronix.de
---
 arch/x86/include/asm/fpu/signal.h |  2 +-
 arch/x86/kernel/fpu/regset.c      |  5 ++---
 arch/x86/kernel/fpu/signal.c      | 40 +++++++++++++++------------------------
 3 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
index 44bbc39a57b3..7fb516b6893a 100644
--- a/arch/x86/include/asm/fpu/signal.h
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -22,7 +22,7 @@ int ia32_setup_frame(int sig, struct ksignal *ksig,
 
 extern void convert_from_fxsr(struct user_i387_ia32_struct *env,
 			      struct task_struct *tsk);
-extern void convert_to_fxsr(struct task_struct *tsk,
+extern void convert_to_fxsr(struct fxregs_state *fxsave,
 			    const struct user_i387_ia32_struct *env);
 
 unsigned long
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index bc02f5144b95..5dbc099178a8 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -269,11 +269,10 @@ convert_from_fxsr(struct user_i387_ia32_struct *env, struct task_struct *tsk)
 		memcpy(&to[i], &from[i], sizeof(to[0]));
 }
 
-void convert_to_fxsr(struct task_struct *tsk,
+void convert_to_fxsr(struct fxregs_state *fxsave,
 		     const struct user_i387_ia32_struct *env)
 
 {
-	struct fxregs_state *fxsave = &tsk->thread.fpu.state.fxsave;
 	struct _fpreg *from = (struct _fpreg *) &env->st_space[0];
 	struct _fpxreg *to = (struct _fpxreg *) &fxsave->st_space[0];
 	int i;
@@ -350,7 +349,7 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 
 	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &env, 0, -1);
 	if (!ret)
-		convert_to_fxsr(target, &env);
+		convert_to_fxsr(&target->thread.fpu.state.fxsave, &env);
 
 	/*
 	 * update the header bit in the xsave header, indicating the
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 55b80de13ea5..7296a9bb78e7 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -207,11 +207,11 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 }
 
 static inline void
-sanitize_restored_xstate(struct task_struct *tsk,
+sanitize_restored_xstate(union fpregs_state *state,
 			 struct user_i387_ia32_struct *ia32_env,
 			 u64 xfeatures, int fx_only)
 {
-	struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
+	struct xregs_state *xsave = &state->xsave;
 	struct xstate_header *header = &xsave->header;
 
 	if (use_xsave()) {
@@ -238,7 +238,7 @@ sanitize_restored_xstate(struct task_struct *tsk,
 		 */
 		xsave->i387.mxcsr &= mxcsr_feature_mask;
 
-		convert_to_fxsr(tsk, ia32_env);
+		convert_to_fxsr(&state->fxsave, ia32_env);
 	}
 }
 
@@ -284,8 +284,6 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	if (!access_ok(buf, size))
 		return -EACCES;
 
-	fpu__initialize(fpu);
-
 	if (!static_cpu_has(X86_FEATURE_FPU))
 		return fpregs_soft_set(current, NULL,
 				       0, sizeof(struct user_i387_ia32_struct),
@@ -315,40 +313,32 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		 * header. Validate and sanitize the copied state.
 		 */
 		struct user_i387_ia32_struct env;
+		union fpregs_state *state;
 		int err = 0;
+		void *tmp;
 
-		/*
-		 * Drop the current fpu which clears fpu->initialized. This ensures
-		 * that any context-switch during the copy of the new state,
-		 * avoids the intermediate state from getting restored/saved.
-		 * Thus avoiding the new restored state from getting corrupted.
-		 * We will be ready to restore/save the state only after
-		 * fpu->initialized is again set.
-		 */
-		fpu__drop(fpu);
+		tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
+		if (!tmp)
+			return -ENOMEM;
+		state = PTR_ALIGN(tmp, 64);
 
 		if (using_compacted_format()) {
-			err = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
+			err = copy_user_to_xstate(&state->xsave, buf_fx);
 		} else {
-			err = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);
+			err = __copy_from_user(&state->xsave, buf_fx, state_size);
 
 			if (!err && state_size > offsetof(struct xregs_state, header))
-				err = validate_xstate_header(&fpu->state.xsave.header);
+				err = validate_xstate_header(&state->xsave.header);
 		}
 
 		if (err || __copy_from_user(&env, buf, sizeof(env))) {
-			fpstate_init(&fpu->state);
-			trace_x86_fpu_init_state(fpu);
 			err = -1;
 		} else {
-			sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
+			sanitize_restored_xstate(state, &env, xfeatures, fx_only);
+			copy_kernel_to_fpregs(state);
 		}
 
-		local_bh_disable();
-		fpu->initialized = 1;
-		fpu__restore(fpu);
-		local_bh_enable();
-
+		kfree(tmp);
 		return err;
 	} else {
 		/*


* [tip:x86/fpu] x86/fpu: Remove fpu__restore()
  2019-04-03 16:41 ` [PATCH 02/27] x86/fpu: Remove fpu__restore() Sebastian Andrzej Siewior
@ 2019-04-13 20:47   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:47 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jroedel, bp, Babu.Moger, jannh, linux-kernel, mingo, mingo,
	bigeasy, dave.hansen, x86, dima, pbonzini, nstange, luto,
	chang.seok.bae, aubrey.li, corbet, rkrcmar, tglx, hpa, Jason,
	riel, kvm

Commit-ID:  6dd677a044e606fd343e31c2108b13d74aec1ca5
Gitweb:     https://git.kernel.org/tip/6dd677a044e606fd343e31c2108b13d74aec1ca5
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:31 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Tue, 9 Apr 2019 19:27:42 +0200

x86/fpu: Remove fpu__restore()

There are no users of fpu__restore() so it is time to remove it. The
comment regarding fpu__restore() and TS bit is stale since commit

  b3b0870ef3ffe ("i387: do not preload FPU state at task switch time")

and has had no meaning since then.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: linux-doc@vger.kernel.org
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-3-bigeasy@linutronix.de
---
 Documentation/preempt-locking.txt   |  1 -
 arch/x86/include/asm/fpu/internal.h |  1 -
 arch/x86/kernel/fpu/core.c          | 24 ------------------------
 arch/x86/kernel/process_32.c        |  4 +---
 arch/x86/kernel/process_64.c        |  4 +---
 5 files changed, 2 insertions(+), 32 deletions(-)

diff --git a/Documentation/preempt-locking.txt b/Documentation/preempt-locking.txt
index 509f5a422d57..dce336134e54 100644
--- a/Documentation/preempt-locking.txt
+++ b/Documentation/preempt-locking.txt
@@ -52,7 +52,6 @@ preemption must be disabled around such regions.
 
 Note, some FPU functions are already explicitly preempt safe.  For example,
 kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.
-However, fpu__restore() must be called with preemption disabled.
 
 
 RULE #3: Lock acquire and release must be performed by same task
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index fb04a3ded7dd..75a1d5f712ee 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -28,7 +28,6 @@ extern void fpu__initialize(struct fpu *fpu);
 extern void fpu__prepare_read(struct fpu *fpu);
 extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
-extern void fpu__restore(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
 extern int  fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2e5003fef51a..1d3ae7988f7f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -303,30 +303,6 @@ void fpu__prepare_write(struct fpu *fpu)
 	}
 }
 
-/*
- * 'fpu__restore()' is called to copy FPU registers from
- * the FPU fpstate to the live hw registers and to activate
- * access to the hardware registers, so that FPU instructions
- * can be used afterwards.
- *
- * Must be called with kernel preemption disabled (for example
- * with local interrupts disabled, as it is in the case of
- * do_device_not_available()).
- */
-void fpu__restore(struct fpu *fpu)
-{
-	fpu__initialize(fpu);
-
-	/* Avoid __kernel_fpu_begin() right after fpregs_activate() */
-	kernel_fpu_disable();
-	trace_x86_fpu_before_restore(fpu);
-	fpregs_activate(fpu);
-	copy_kernel_to_fpregs(&fpu->state);
-	trace_x86_fpu_after_restore(fpu);
-	kernel_fpu_enable();
-}
-EXPORT_SYMBOL_GPL(fpu__restore);
-
 /*
  * Drops current FPU state: deactivates the fpregs and
  * the fpstate. NOTE: it still leaves previous contents
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index e471d8e6f0b2..7888a41a03cd 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -267,9 +267,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/*
 	 * Leave lazy mode, flushing any hypercalls made here.
 	 * This must be done before restoring TLS segments so
-	 * the GDT and LDT are properly updated, and must be
-	 * done before fpu__restore(), so the TS bit is up
-	 * to date.
+	 * the GDT and LDT are properly updated.
 	 */
 	arch_end_context_switch(next_p);
 
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 6a62f4af9fcf..e1983b3a16c4 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -538,9 +538,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/*
 	 * Leave lazy mode, flushing any hypercalls made here.  This
 	 * must be done after loading TLS entries in the GDT but before
-	 * loading segments that might reference them, and and it must
-	 * be done before fpu__restore(), so the TS bit is up to
-	 * date.
+	 * loading segments that might reference them.
 	 */
 	arch_end_context_switch(next_p);
 


* [tip:x86/fpu] x86/fpu: Remove preempt_disable() in fpu__clear()
  2019-04-03 16:41 ` [PATCH 03/27] x86/fpu: Remove preempt_disable() in fpu__clear() Sebastian Andrzej Siewior
@ 2019-04-13 20:48   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, mingo, luto, x86, tglx, dave.hansen, nstange, bigeasy,
	linux-kernel, hpa, bp, rkrcmar, Jason, pbonzini, riel, kvm

Commit-ID:  60e528d6ce3f60a058bbb64f8acb2a07f84b172a
Gitweb:     https://git.kernel.org/tip/60e528d6ce3f60a058bbb64f8acb2a07f84b172a
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:32 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Tue, 9 Apr 2019 19:27:46 +0200

x86/fpu: Remove preempt_disable() in fpu__clear()

The preempt_disable() section was introduced in commit

  a10b6a16cdad8 ("x86/fpu: Make the fpu state change in fpu__clear() scheduler-atomic")

and it was said to be temporary.

fpu__initialize() initializes the FPU struct to its initial value and
then sets ->initialized to 1. The last part is the important one.
The content of the state does not matter because it gets set via
copy_init_fpstate_to_fpregs().

A preemption here has little meaning because the registers will always be
set to the same content after copy_init_fpstate_to_fpregs(). A softirq
with a kernel_fpu_begin() could also force saving the FPU's registers
after fpu__initialize() without changing the outcome here.

Remove the preempt_disable() section in fpu__clear(), preemption here
does not hurt.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-4-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/core.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1d3ae7988f7f..1940319268ae 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -366,11 +366,9 @@ void fpu__clear(struct fpu *fpu)
 	 * Make sure fpstate is cleared and initialized.
 	 */
 	if (static_cpu_has(X86_FEATURE_FPU)) {
-		preempt_disable();
 		fpu__initialize(fpu);
 		user_fpu_begin();
 		copy_init_fpstate_to_fpregs();
-		preempt_enable();
 	}
 }
 


* [tip:x86/fpu] x86/fpu: Always init the state in fpu__clear()
  2019-04-03 16:41 ` [PATCH 04/27] x86/fpu: Always init the `state' " Sebastian Andrzej Siewior
@ 2019-04-13 20:48   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:48 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, bp, kvm, aubrey.li, linux-kernel, x86, tglx, mingo, rkrcmar,
	luto, bigeasy, pbonzini, jannh, mingo, Jason, nstange, riel,
	dave.hansen, billm

Commit-ID:  88f5260a3bf9bfb276b5b4aac2e81587e425a1d7
Gitweb:     https://git.kernel.org/tip/88f5260a3bf9bfb276b5b4aac2e81587e425a1d7
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:33 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Tue, 9 Apr 2019 19:28:06 +0200

x86/fpu: Always init the state in fpu__clear()

fpu__clear() only initializes the state if the CPU has FPU support.
This initialization is also required for FPU-less systems and takes
place in math_emulate(). Since fpu__initialize() only performs the
initialization if ->initialized is zero it does not matter that it
is invoked each time an opcode is emulated. It makes the removal of
->initialized easier if the struct is also initialized in the FPU-less
case at the same time.

Move fpu__initialize() before the FPU feature check so it is also
performed in the FPU-less case too.

 [ bp: Massage a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Bill Metzenthen <billm@melbpc.org.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-5-bigeasy@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 1 -
 arch/x86/kernel/fpu/core.c          | 5 ++---
 arch/x86/math-emu/fpu_entry.c       | 3 ---
 3 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 75a1d5f712ee..70ecb7c032cb 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -24,7 +24,6 @@
 /*
  * High level FPU state handling functions:
  */
-extern void fpu__initialize(struct fpu *fpu);
 extern void fpu__prepare_read(struct fpu *fpu);
 extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1940319268ae..e43296854e37 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -223,7 +223,7 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
  * Activate the current task's in-memory FPU context,
  * if it has not been used before:
  */
-void fpu__initialize(struct fpu *fpu)
+static void fpu__initialize(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
@@ -236,7 +236,6 @@ void fpu__initialize(struct fpu *fpu)
 		fpu->initialized = 1;
 	}
 }
-EXPORT_SYMBOL_GPL(fpu__initialize);
 
 /*
  * This function must be called before we read a task's fpstate.
@@ -365,8 +364,8 @@ void fpu__clear(struct fpu *fpu)
 	/*
 	 * Make sure fpstate is cleared and initialized.
 	 */
+	fpu__initialize(fpu);
 	if (static_cpu_has(X86_FEATURE_FPU)) {
-		fpu__initialize(fpu);
 		user_fpu_begin();
 		copy_init_fpstate_to_fpregs();
 	}
diff --git a/arch/x86/math-emu/fpu_entry.c b/arch/x86/math-emu/fpu_entry.c
index 9e2ba7e667f6..a873da6b46d6 100644
--- a/arch/x86/math-emu/fpu_entry.c
+++ b/arch/x86/math-emu/fpu_entry.c
@@ -113,9 +113,6 @@ void math_emulate(struct math_emu_info *info)
 	unsigned long code_base = 0;
 	unsigned long code_limit = 0;	/* Initialized to stop compiler warnings */
 	struct desc_struct code_descriptor;
-	struct fpu *fpu = &current->thread.fpu;
-
-	fpu__initialize(fpu);
 
 #ifdef RE_ENTRANT_CHECKING
 	if (emulating) {


* [tip:x86/fpu] x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe()
  2019-04-03 16:41 ` [PATCH 05/27] x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
@ 2019-04-13 20:49   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jason, mingo, tglx, pbonzini, dave.hansen, rkrcmar, riel, bp,
	jannh, kvm, bigeasy, mingo, x86, linux-kernel, luto, hpa

Commit-ID:  fbcc9e0c37ba3186c41b5eb1aee2f7f3b711bc1e
Gitweb:     https://git.kernel.org/tip/fbcc9e0c37ba3186c41b5eb1aee2f7f3b711bc1e
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:34 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Tue, 9 Apr 2019 20:48:11 +0200

x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe()

With lazy-FPU support, the variable (now named ->initialized) was set to
true if the CPU's FPU registers were holding a valid state for the
active process. If it was set to false then the
FPU state was saved in fpu->state and the FPU was deactivated.

With lazy-FPU gone, ->initialized is always true for user threads and
kernel threads never call this function so ->initialized is always true
in copy_fpstate_to_sigframe().

The using_compacted_format() check is also a leftover from the lazy-FPU
time. In the

  ->initialized == false

case copy_to_user() would copy the compacted buffer while userland would
expect the non-compacted format instead. So in order to save the FPU
state in the non-compacted form it issues XSAVE to save the *current*
FPU state.

If the FPU is not enabled, the attempt raises the FPU trap, the trap
restores the FPU contents and re-enables the FPU and XSAVE is invoked
again and succeeds.

*This* no longer works since commit

  bef8b6da9522 ("x86/fpu: Handle #NM without FPU emulation as an error")

Remove the check for ->initialized because it is always true and remove
the false condition. Update the comment to reflect that the state is
always live.

 [ bp: Massage. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-6-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 35 ++++++++---------------------------
 1 file changed, 8 insertions(+), 27 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 7296a9bb78e7..c1a5999affa0 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -144,9 +144,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  *	buf == buf_fx for 64-bit frames and 32-bit fsave frame.
  *	buf != buf_fx for 32-bit frames with fxstate.
  *
- * If the fpu, extended register state is live, save the state directly
- * to the user frame pointed by the aligned pointer 'buf_fx'. Otherwise,
- * copy the thread's fpu state to the user frame starting at 'buf_fx'.
+ * Save the state directly to the user frame pointed by the aligned pointer
+ * 'buf_fx'.
  *
  * If this is a 32-bit frame with fxstate, put a fsave header before
  * the aligned state at 'buf_fx'.
@@ -157,7 +156,6 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
 int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 {
 	struct fpu *fpu = &current->thread.fpu;
-	struct xregs_state *xsave = &fpu->state.xsave;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
 
@@ -172,29 +170,12 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	if (fpu->initialized || using_compacted_format()) {
-		/* Save the live register state to the user directly. */
-		if (copy_fpregs_to_sigframe(buf_fx))
-			return -1;
-		/* Update the thread's fxstate to save the fsave header. */
-		if (ia32_fxstate)
-			copy_fxregs_to_kernel(fpu);
-	} else {
-		/*
-		 * It is a *bug* if kernel uses compacted-format for xsave
-		 * area and we copy it out directly to a signal frame. It
-		 * should have been handled above by saving the registers
-		 * directly.
-		 */
-		if (boot_cpu_has(X86_FEATURE_XSAVES)) {
-			WARN_ONCE(1, "x86/fpu: saving compacted-format xsave area to a signal frame!\n");
-			return -1;
-		}
-
-		fpstate_sanitize_xstate(fpu);
-		if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
-			return -1;
-	}
+	/* Save the live registers state to the user frame directly. */
+	if (copy_fpregs_to_sigframe(buf_fx))
+		return -1;
+	/* Update the thread's fxstate to save the fsave header. */
+	if (ia32_fxstate)
+		copy_fxregs_to_kernel(fpu);
 
 	/* Save the fsave header for the 32-bit frames. */
 	if ((ia32_fxstate || !use_fxsr()) && save_fsave_header(tsk, buf))

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()
  2019-04-03 16:41 ` [PATCH 06/27] x86/fpu: Don't save fxregs for ia32 frames " Sebastian Andrzej Siewior
@ 2019-04-13 20:50   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, kvm, jannh, bigeasy, rkrcmar, hpa, mingo, luto,
	riel, Jason, tglx, pbonzini, x86, bp, mingo, dave.hansen

Commit-ID:  39388e80f9b0c3788bfb6efe3054bdce0c3ead45
Gitweb:     https://git.kernel.org/tip/39388e80f9b0c3788bfb6efe3054bdce0c3ead45
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:35 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Wed, 10 Apr 2019 14:46:35 +0200

x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()

In commit

  72a671ced66db ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels")

the 32bit and 64bit paths of the signal delivery code were merged.

The 32bit version:

  int save_i387_xstate_ia32(void __user *buf)
  …
         if (cpu_has_xsave)
                 return save_i387_xsave(fp);
         if (cpu_has_fxsr)
                 return save_i387_fxsave(fp);

The 64bit version:

  int save_i387_xstate(void __user *buf)
  …
         if (user_has_fpu()) {
                 if (use_xsave())
                         err = xsave_user(buf);
                 else
                         err = fxsave_user(buf);

                 if (unlikely(err)) {
                         __clear_user(buf, xstate_size);
                         return err;

The merge:

  int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
  …
         if (user_has_fpu()) {
                 /* Save the live register state to the user directly. */
                 if (save_user_xstate(buf_fx))
                         return -1;
                 /* Update the thread's fxstate to save the fsave header. */
                 if (ia32_fxstate)
                         fpu_fxsave(&tsk->thread.fpu);

I don't think that we needed to save the FPU registers to ->thread.fpu
because the registers were already stored in buf_fx. Today the state
will be restored from buf_fx after the signal has been handled (I assume
that this was also the case with lazy-FPU).

Since commit

  66463db4fc560 ("x86, fpu: shift drop_init_fpu() from save_xstate_sig() to handle_signal()")

it is ensured that the signal handler starts with a clear/fresh set of
FPU registers, which means that the previous store is futile.

Remove the copy_fxregs_to_kernel() call because task's FPU state is
cleared later in handle_signal() via fpu__clear().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-7-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index c1a5999affa0..34989d2a8893 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -155,7 +155,6 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  */
 int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 {
-	struct fpu *fpu = &current->thread.fpu;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
 
@@ -173,9 +172,6 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 	/* Save the live registers state to the user frame directly. */
 	if (copy_fpregs_to_sigframe(buf_fx))
 		return -1;
-	/* Update the thread's fxstate to save the fsave header. */
-	if (ia32_fxstate)
-		copy_fxregs_to_kernel(fpu);
 
 	/* Save the fsave header for the 32-bit frames. */
 	if ((ia32_fxstate || !use_fxsr()) && save_fsave_header(tsk, buf))

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/fpu: Remove fpu->initialized
  2019-04-03 16:41 ` [PATCH 07/27] x86/fpu: Remove fpu->initialized Sebastian Andrzej Siewior
@ 2019-04-13 20:50   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:50 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: rkrcmar, mingo, mhiramat, bigeasy, jannh, dima, jroedel, kvm,
	will.deacon, mingo, linux-kernel, luto, Babu.Moger,
	sergey.senozhatsky, nstange, riel, peterz, Jason, hpa,
	mathieu.desnoyers, dave.hansen, x86, tglx, aubrey.li, pbonzini,
	bp, chang.seok.bae

Commit-ID:  2722146eb78451b30e4717a267a3a2b44e4ad317
Gitweb:     https://git.kernel.org/tip/2722146eb78451b30e4717a267a3a2b44e4ad317
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:36 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Wed, 10 Apr 2019 15:42:40 +0200

x86/fpu: Remove fpu->initialized

The struct fpu.initialized member is always set to one for user tasks
and zero for kernel tasks. This avoids saving/restoring the FPU
registers for kernel threads.

The ->initialized = 0 case for user tasks has been removed in previous
changes, for instance by doing an explicit, unconditional init at fork()
time for FPU-less systems, which was otherwise delayed until the first
opcode had to be emulated.

The context switch code (switch_fpu_prepare() + switch_fpu_finish())
can't unconditionally save/restore registers for kernel threads. Not
only would it slow down the switch, but it would also load a zeroed
xcomp_bv for XSAVES.

For kernel_fpu_begin() (+end) the situation is similar: EFI with runtime
services uses it before alternatives_patched is true, which means the
function is now invoked that early; this was not the case before.

For those two cases, use current->mm to distinguish between user and
kernel thread. For kernel_fpu_begin() skip save/restore of the FPU
registers.
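
As a condensed, hedged sketch of that distinction (names as in the
hunks below; kernel context assumed, not standalone code):

  static void __kernel_fpu_begin(void)
  {
          struct fpu *fpu = &current->thread.fpu;

          kernel_fpu_disable();

          /* Only user tasks have FPU register state worth preserving. */
          if (current->mm)
                  copy_fpregs_to_fpstate(fpu);
  }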

During the context switch into a kernel thread don't do anything. There
is no reason to save the FPU state of a kernel thread.

The reordering in __switch_to() is important because the current()
pointer needs to be valid before switch_fpu_finish() is invoked, so that
the ->mm of the new task is seen instead of the old one's.
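
A minimal sketch of the required ordering (comments are mine; the
process_32.c/process_64.c hunks below contain the real change):

  /* Make current() return the new task first ... */
  this_cpu_write(current_task, next_p);
  /* ... so that switch_fpu_finish() observes the new task's ->mm. */
  switch_fpu_finish(next_fpu, cpu);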

N.B.: fpu__save() doesn't need to check ->mm because it is called by
user tasks only.

 [ bp: Massage. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-8-bigeasy@linutronix.de
---
 arch/x86/ia32/ia32_signal.c         | 17 ++++-----
 arch/x86/include/asm/fpu/internal.h | 18 +++++-----
 arch/x86/include/asm/fpu/types.h    |  9 -----
 arch/x86/include/asm/trace/fpu.h    |  5 +--
 arch/x86/kernel/fpu/core.c          | 70 +++++++++++--------------------------
 arch/x86/kernel/fpu/init.c          |  2 --
 arch/x86/kernel/fpu/regset.c        | 19 +++-------
 arch/x86/kernel/fpu/xstate.c        |  2 --
 arch/x86/kernel/process_32.c        |  4 +--
 arch/x86/kernel/process_64.c        |  4 +--
 arch/x86/kernel/signal.c            | 17 ++++-----
 arch/x86/mm/pkeys.c                 |  7 +---
 12 files changed, 53 insertions(+), 121 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 321fe5f5d0e9..6eeb3249f22f 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -216,8 +216,7 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 				 size_t frame_size,
 				 void __user **fpstate)
 {
-	struct fpu *fpu = &current->thread.fpu;
-	unsigned long sp;
+	unsigned long sp, fx_aligned, math_size;
 
 	/* Default to using normal stack */
 	sp = regs->sp;
@@ -231,15 +230,11 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 		 ksig->ka.sa.sa_restorer)
 		sp = (unsigned long) ksig->ka.sa.sa_restorer;
 
-	if (fpu->initialized) {
-		unsigned long fx_aligned, math_size;
-
-		sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size);
-		*fpstate = (struct _fpstate_32 __user *) sp;
-		if (copy_fpstate_to_sigframe(*fpstate, (void __user *)fx_aligned,
-				    math_size) < 0)
-			return (void __user *) -1L;
-	}
+	sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size);
+	*fpstate = (struct _fpstate_32 __user *) sp;
+	if (copy_fpstate_to_sigframe(*fpstate, (void __user *)fx_aligned,
+				     math_size) < 0)
+		return (void __user *) -1L;
 
 	sp -= frame_size;
 	/* Align the stack pointer according to the i386 ABI,
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 70ecb7c032cb..04042eacc852 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -494,11 +494,14 @@ static inline void fpregs_activate(struct fpu *fpu)
  *
  *  - switch_fpu_finish() restores the new state as
  *    necessary.
+ *
+ * The FPU context is only stored/restored for a user task and
+ * ->mm is used to distinguish between kernel and user threads.
  */
 static inline void
 switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
-	if (static_cpu_has(X86_FEATURE_FPU) && old_fpu->initialized) {
+	if (static_cpu_has(X86_FEATURE_FPU) && current->mm) {
 		if (!copy_fpregs_to_fpstate(old_fpu))
 			old_fpu->last_cpu = -1;
 		else
@@ -506,8 +509,7 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 
 		/* But leave fpu_fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
-	} else
-		old_fpu->last_cpu = -1;
+	}
 }
 
 /*
@@ -520,12 +522,12 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 {
-	bool preload = static_cpu_has(X86_FEATURE_FPU) &&
-		       new_fpu->initialized;
+	if (static_cpu_has(X86_FEATURE_FPU)) {
+		if (!fpregs_state_valid(new_fpu, cpu)) {
+			if (current->mm)
+				copy_kernel_to_fpregs(&new_fpu->state);
+		}
 
-	if (preload) {
-		if (!fpregs_state_valid(new_fpu, cpu))
-			copy_kernel_to_fpregs(&new_fpu->state);
 		fpregs_activate(new_fpu);
 	}
 }
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 2e32e178e064..f098f6cab94b 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -293,15 +293,6 @@ struct fpu {
 	 */
 	unsigned int			last_cpu;
 
-	/*
-	 * @initialized:
-	 *
-	 * This flag indicates whether this context is initialized: if the task
-	 * is not running then we can restore from this context, if the task
-	 * is running then we should save into this context.
-	 */
-	unsigned char			initialized;
-
 	/*
 	 * @avx512_timestamp:
 	 *
diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index 069c04be1507..bd65f6ba950f 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -13,22 +13,19 @@ DECLARE_EVENT_CLASS(x86_fpu,
 
 	TP_STRUCT__entry(
 		__field(struct fpu *, fpu)
-		__field(bool, initialized)
 		__field(u64, xfeatures)
 		__field(u64, xcomp_bv)
 		),
 
 	TP_fast_assign(
 		__entry->fpu		= fpu;
-		__entry->initialized	= fpu->initialized;
 		if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
 			__entry->xfeatures = fpu->state.xsave.header.xfeatures;
 			__entry->xcomp_bv  = fpu->state.xsave.header.xcomp_bv;
 		}
 	),
-	TP_printk("x86/fpu: %p initialized: %d xfeatures: %llx xcomp_bv: %llx",
+	TP_printk("x86/fpu: %p xfeatures: %llx xcomp_bv: %llx",
 			__entry->fpu,
-			__entry->initialized,
 			__entry->xfeatures,
 			__entry->xcomp_bv
 	)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index e43296854e37..97e27de2b7c0 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -101,7 +101,7 @@ static void __kernel_fpu_begin(void)
 
 	kernel_fpu_disable();
 
-	if (fpu->initialized) {
+	if (current->mm) {
 		/*
 		 * Ignore return value -- we don't care if reg state
 		 * is clobbered.
@@ -116,7 +116,7 @@ static void __kernel_fpu_end(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
-	if (fpu->initialized)
+	if (current->mm)
 		copy_kernel_to_fpregs(&fpu->state);
 
 	kernel_fpu_enable();
@@ -147,11 +147,10 @@ void fpu__save(struct fpu *fpu)
 
 	preempt_disable();
 	trace_x86_fpu_before_save(fpu);
-	if (fpu->initialized) {
-		if (!copy_fpregs_to_fpstate(fpu)) {
-			copy_kernel_to_fpregs(&fpu->state);
-		}
-	}
+
+	if (!copy_fpregs_to_fpstate(fpu))
+		copy_kernel_to_fpregs(&fpu->state);
+
 	trace_x86_fpu_after_save(fpu);
 	preempt_enable();
 }
@@ -190,7 +189,7 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
 	dst_fpu->last_cpu = -1;
 
-	if (!src_fpu->initialized || !static_cpu_has(X86_FEATURE_FPU))
+	if (!static_cpu_has(X86_FEATURE_FPU))
 		return 0;
 
 	WARN_ON_FPU(src_fpu != &current->thread.fpu);
@@ -227,14 +226,10 @@ static void fpu__initialize(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
-	if (!fpu->initialized) {
-		fpstate_init(&fpu->state);
-		trace_x86_fpu_init_state(fpu);
+	fpstate_init(&fpu->state);
+	trace_x86_fpu_init_state(fpu);
 
-		trace_x86_fpu_activate_state(fpu);
-		/* Safe to do for the current task: */
-		fpu->initialized = 1;
-	}
+	trace_x86_fpu_activate_state(fpu);
 }
 
 /*
@@ -247,32 +242,20 @@ static void fpu__initialize(struct fpu *fpu)
  *
  * - or it's called for stopped tasks (ptrace), in which case the
  *   registers were already saved by the context-switch code when
- *   the task scheduled out - we only have to initialize the registers
- *   if they've never been initialized.
+ *   the task scheduled out.
  *
  * If the task has used the FPU before then save it.
  */
 void fpu__prepare_read(struct fpu *fpu)
 {
-	if (fpu == &current->thread.fpu) {
+	if (fpu == &current->thread.fpu)
 		fpu__save(fpu);
-	} else {
-		if (!fpu->initialized) {
-			fpstate_init(&fpu->state);
-			trace_x86_fpu_init_state(fpu);
-
-			trace_x86_fpu_activate_state(fpu);
-			/* Safe to do for current and for stopped child tasks: */
-			fpu->initialized = 1;
-		}
-	}
 }
 
 /*
  * This function must be called before we write a task's fpstate.
  *
- * If the task has used the FPU before then invalidate any cached FPU registers.
- * If the task has not used the FPU before then initialize its fpstate.
+ * Invalidate any cached FPU registers.
  *
  * After this function call, after registers in the fpstate are
  * modified and the child task has woken up, the child task will
@@ -289,17 +272,8 @@ void fpu__prepare_write(struct fpu *fpu)
 	 */
 	WARN_ON_FPU(fpu == &current->thread.fpu);
 
-	if (fpu->initialized) {
-		/* Invalidate any cached state: */
-		__fpu_invalidate_fpregs_state(fpu);
-	} else {
-		fpstate_init(&fpu->state);
-		trace_x86_fpu_init_state(fpu);
-
-		trace_x86_fpu_activate_state(fpu);
-		/* Safe to do for stopped child tasks: */
-		fpu->initialized = 1;
-	}
+	/* Invalidate any cached state: */
+	__fpu_invalidate_fpregs_state(fpu);
 }
 
 /*
@@ -316,17 +290,13 @@ void fpu__drop(struct fpu *fpu)
 	preempt_disable();
 
 	if (fpu == &current->thread.fpu) {
-		if (fpu->initialized) {
-			/* Ignore delayed exceptions from user space */
-			asm volatile("1: fwait\n"
-				     "2:\n"
-				     _ASM_EXTABLE(1b, 2b));
-			fpregs_deactivate(fpu);
-		}
+		/* Ignore delayed exceptions from user space */
+		asm volatile("1: fwait\n"
+			     "2:\n"
+			     _ASM_EXTABLE(1b, 2b));
+		fpregs_deactivate(fpu);
 	}
 
-	fpu->initialized = 0;
-
 	trace_x86_fpu_dropped(fpu);
 
 	preempt_enable();
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 6abd83572b01..20d8fa7124c7 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -239,8 +239,6 @@ static void __init fpu__init_system_ctx_switch(void)
 
 	WARN_ON_FPU(!on_boot_cpu);
 	on_boot_cpu = 0;
-
-	WARN_ON_FPU(current->thread.fpu.initialized);
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 5dbc099178a8..d652b939ccfb 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -15,16 +15,12 @@
  */
 int regset_fpregs_active(struct task_struct *target, const struct user_regset *regset)
 {
-	struct fpu *target_fpu = &target->thread.fpu;
-
-	return target_fpu->initialized ? regset->n : 0;
+	return regset->n;
 }
 
 int regset_xregset_fpregs_active(struct task_struct *target, const struct user_regset *regset)
 {
-	struct fpu *target_fpu = &target->thread.fpu;
-
-	if (boot_cpu_has(X86_FEATURE_FXSR) && target_fpu->initialized)
+	if (boot_cpu_has(X86_FEATURE_FXSR))
 		return regset->n;
 	else
 		return 0;
@@ -370,16 +366,9 @@ int fpregs_set(struct task_struct *target, const struct user_regset *regset,
 int dump_fpu(struct pt_regs *regs, struct user_i387_struct *ufpu)
 {
 	struct task_struct *tsk = current;
-	struct fpu *fpu = &tsk->thread.fpu;
-	int fpvalid;
-
-	fpvalid = fpu->initialized;
-	if (fpvalid)
-		fpvalid = !fpregs_get(tsk, NULL,
-				      0, sizeof(struct user_i387_ia32_struct),
-				      ufpu, NULL);
 
-	return fpvalid;
+	return !fpregs_get(tsk, NULL, 0, sizeof(struct user_i387_ia32_struct),
+			   ufpu, NULL);
 }
 EXPORT_SYMBOL(dump_fpu);
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index d7432c2b1051..8bfcc5b9e04b 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -892,8 +892,6 @@ const void *get_xsave_field_ptr(int xsave_state)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
-	if (!fpu->initialized)
-		return NULL;
 	/*
 	 * fpu__save() takes the CPU's xstate registers
 	 * and saves them off to the 'fpu memory buffer.
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 7888a41a03cd..77d9eb43ccac 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -288,10 +288,10 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	if (prev->gs | next->gs)
 		lazy_load_gs(next->gs);
 
-	switch_fpu_finish(next_fpu, cpu);
-
 	this_cpu_write(current_task, next_p);
 
+	switch_fpu_finish(next_fpu, cpu);
+
 	/* Load the Intel cache allocation PQR MSR. */
 	resctrl_sched_in();
 
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e1983b3a16c4..ffea7c557963 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -566,14 +566,14 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	x86_fsgsbase_load(prev, next);
 
-	switch_fpu_finish(next_fpu, cpu);
-
 	/*
 	 * Switch the PDA and FPU contexts.
 	 */
 	this_cpu_write(current_task, next_p);
 	this_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
 
+	switch_fpu_finish(next_fpu, cpu);
+
 	/* Reload sp0. */
 	update_task_stack(next_p);
 
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index b419e1a1a0ce..87b327c6cb10 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -246,7 +246,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	unsigned long sp = regs->sp;
 	unsigned long buf_fx = 0;
 	int onsigstack = on_sig_stack(sp);
-	struct fpu *fpu = &current->thread.fpu;
+	int ret;
 
 	/* redzone */
 	if (IS_ENABLED(CONFIG_X86_64))
@@ -265,11 +265,9 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 		sp = (unsigned long) ka->sa.sa_restorer;
 	}
 
-	if (fpu->initialized) {
-		sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
-					  &buf_fx, &math_size);
-		*fpstate = (void __user *)sp;
-	}
+	sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
+				  &buf_fx, &math_size);
+	*fpstate = (void __user *)sp;
 
 	sp = align_sigframe(sp - frame_size);
 
@@ -281,8 +279,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 		return (void __user *)-1L;
 
 	/* save i387 and extended state */
-	if (fpu->initialized &&
-	    copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size) < 0)
+	ret = copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size);
+	if (ret < 0)
 		return (void __user *)-1L;
 
 	return (void __user *)sp;
@@ -763,8 +761,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 		/*
 		 * Ensure the signal handler starts with the new fpu state.
 		 */
-		if (fpu->initialized)
-			fpu__clear(fpu);
+		fpu__clear(fpu);
 	}
 	signal_setup_done(failed, ksig, stepping);
 }
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 047a77f6a10c..05bb9a44eb1c 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -39,17 +39,12 @@ int __execute_only_pkey(struct mm_struct *mm)
 	 * dance to set PKRU if we do not need to.  Check it
 	 * first and assume that if the execute-only pkey is
 	 * write-disabled that we do not have to set it
-	 * ourselves.  We need preempt off so that nobody
-	 * can make fpregs inactive.
+	 * ourselves.
 	 */
-	preempt_disable();
 	if (!need_to_set_mm_pkey &&
-	    current->thread.fpu.initialized &&
 	    !__pkru_allows_read(read_pkru(), execute_only_pkey)) {
-		preempt_enable();
 		return execute_only_pkey;
 	}
-	preempt_enable();
 
 	/*
 	 * Set up PKRU so that it denies access for everything

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/fpu: Remove user_fpu_begin()
  2019-04-03 16:41 ` [PATCH 08/27] x86/fpu: Remove user_fpu_begin() Sebastian Andrzej Siewior
@ 2019-04-13 20:51   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:51 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, mingo, bp, nstange, Jason, x86, riel, pbonzini,
	jannh, luto, dave.hansen, mingo, bigeasy, hpa, aubrey.li, tglx,
	rkrcmar, kvm

Commit-ID:  0169f53e0d97bb675075506810494bd86b8c934e
Gitweb:     https://git.kernel.org/tip/0169f53e0d97bb675075506810494bd86b8c934e
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:37 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Wed, 10 Apr 2019 15:58:44 +0200

x86/fpu: Remove user_fpu_begin()

user_fpu_begin() sets fpu_fpregs_owner_ctx to the task's fpu struct.
This is always the case since there is no lazy FPU anymore.

fpu_fpregs_owner_ctx is used during context switch to decide if it needs
to load the saved registers or if the currently loaded registers are
valid. It could be skipped during a

  taskA -> kernel thread -> taskA

switch, because the switch to the kernel thread would not alter the
CPU's FPU state.

Since this field is always updated during context switch and
never invalidated, setting it manually (in user context) makes no
difference. A kernel thread within a kernel_fpu_begin() block could
set fpu_fpregs_owner_ctx to NULL, but a kernel thread does not use
user_fpu_begin().

This is a leftover from the lazy-FPU time.

Remove user_fpu_begin(), it does not change fpu_fpregs_owner_ctx's
content.
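
For reference, the validity test that makes this skip possible looks
roughly like the following (a sketch from memory, not part of this
patch):

  static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
  {
          return fpu == this_cpu_read(fpu_fpregs_owner_ctx) &&
                 cpu == fpu->last_cpu;
  }

In the taskA -> kernel thread -> taskA case, fpu_fpregs_owner_ctx still
points at taskA's fpu and (provided the save left the registers intact)
last_cpu still matches, so the reload can be skipped.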

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-9-bigeasy@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 17 -----------------
 arch/x86/kernel/fpu/core.c          |  4 +---
 arch/x86/kernel/fpu/signal.c        |  1 -
 3 files changed, 1 insertion(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 04042eacc852..54f70cae2f15 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -532,23 +532,6 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 	}
 }
 
-/*
- * Needs to be preemption-safe.
- *
- * NOTE! user_fpu_begin() must be used only immediately before restoring
- * the save state. It does not do any saving/restoring on its own. In
- * lazy FPU mode, it is just an optimization to avoid a #NM exception,
- * the task can lose the FPU right after preempt_enable().
- */
-static inline void user_fpu_begin(void)
-{
-	struct fpu *fpu = &current->thread.fpu;
-
-	preempt_disable();
-	fpregs_activate(fpu);
-	preempt_enable();
-}
-
 /*
  * MXCSR and XCR definitions:
  */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 97e27de2b7c0..739ca3ae2bdc 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -335,10 +335,8 @@ void fpu__clear(struct fpu *fpu)
 	 * Make sure fpstate is cleared and initialized.
 	 */
 	fpu__initialize(fpu);
-	if (static_cpu_has(X86_FEATURE_FPU)) {
-		user_fpu_begin();
+	if (static_cpu_has(X86_FEATURE_FPU))
 		copy_init_fpstate_to_fpregs();
-	}
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 34989d2a8893..155f4552413e 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -322,7 +322,6 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		 * For 64-bit frames and 32-bit fsave frames, restore the user
 		 * state to the registers directly (with exceptions handled).
 		 */
-		user_fpu_begin();
 		if (copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only)) {
 			fpu__clear(fpu);
 			return -1;

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/fpu: Add an __fpregs_load_activate() internal helper
  2019-04-03 16:41 ` [PATCH 09/27] x86/fpu: Add (__)make_fpregs_active helpers Sebastian Andrzej Siewior
@ 2019-04-13 20:52   ` tip-bot for Rik van Riel
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Rik van Riel @ 2019-04-13 20:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Jason, aubrey.li, luto, dave.hansen, rkrcmar, jannh, x86,
	pbonzini, mingo, riel, bp, tglx, kvm, bigeasy, hpa, linux-kernel,
	mingo

Commit-ID:  4ee91519e1dccc175665fe24bb20a47c6053575c
Gitweb:     https://git.kernel.org/tip/4ee91519e1dccc175665fe24bb20a47c6053575c
Author:     Rik van Riel <riel@surriel.com>
AuthorDate: Wed, 3 Apr 2019 18:41:38 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Wed, 10 Apr 2019 16:23:14 +0200

x86/fpu: Add an __fpregs_load_activate() internal helper

Add a helper function that ensures the floating point registers for the
current task are active. Use with preemption disabled.

While at it, add fpregs_lock/unlock() helpers too, to be used in later
patches.
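
A hedged usage sketch (for now the helpers only control preemption, as
the hunk below shows; later patches in the series build on them):

  fpregs_lock();
  /* ... current's FPU registers/fpstate may be accessed safely ... */
  fpregs_unlock();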

 [ bp: Add a comment about its intended usage. ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-10-bigeasy@linutronix.de
---
 arch/x86/include/asm/fpu/api.h      | 11 +++++++++++
 arch/x86/include/asm/fpu/internal.h | 22 ++++++++++++++--------
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index b56d504af654..73e684160f35 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -10,6 +10,7 @@
 
 #ifndef _ASM_X86_FPU_API_H
 #define _ASM_X86_FPU_API_H
+#include <linux/preempt.h>
 
 /*
  * Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
@@ -22,6 +23,16 @@ extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
 extern bool irq_fpu_usable(void);
 
+static inline void fpregs_lock(void)
+{
+	preempt_disable();
+}
+
+static inline void fpregs_unlock(void)
+{
+	preempt_enable();
+}
+
 /*
  * Query the presence of one or more xfeatures. Works on any legacy CPU as well.
  *
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 54f70cae2f15..3e0c2c496f2d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -484,6 +484,18 @@ static inline void fpregs_activate(struct fpu *fpu)
 	trace_x86_fpu_regs_activated(fpu);
 }
 
+/*
+ * Internal helper, do not use directly. Use switch_fpu_return() instead.
+ */
+static inline void __fpregs_load_activate(struct fpu *fpu, int cpu)
+{
+	if (!fpregs_state_valid(fpu, cpu)) {
+		if (current->mm)
+			copy_kernel_to_fpregs(&fpu->state);
+		fpregs_activate(fpu);
+	}
+}
+
 /*
  * FPU state switching for scheduling.
  *
@@ -522,14 +534,8 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 {
-	if (static_cpu_has(X86_FEATURE_FPU)) {
-		if (!fpregs_state_valid(new_fpu, cpu)) {
-			if (current->mm)
-				copy_kernel_to_fpregs(&new_fpu->state);
-		}
-
-		fpregs_activate(new_fpu);
-	}
+	if (static_cpu_has(X86_FEATURE_FPU))
+		__fpregs_load_activate(new_fpu, cpu);
 }
 
 /*

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/fpu: Make __raw_xsave_addr() use a feature number instead of mask
  2019-04-03 16:41 ` [PATCH 10/27] x86/fpu: Make __raw_xsave_addr() use feature number instead of mask Sebastian Andrzej Siewior
@ 2019-04-13 20:52   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: riel, hpa, dave.hansen, tglx, pbonzini, mingo, x86, rkrcmar,
	Jason, sergey.senozhatsky, bigeasy, bp, kvm, mingo, linux-kernel,
	luto

Commit-ID:  07baeb04f37c952981d63359ff840118ce8f5434
Gitweb:     https://git.kernel.org/tip/07baeb04f37c952981d63359ff840118ce8f5434
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:39 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Wed, 10 Apr 2019 16:33:45 +0200

x86/fpu: Make __raw_xsave_addr() use a feature number instead of mask

Most users of __raw_xsave_addr() use a feature number, shift it to a
mask and then __raw_xsave_addr() shifts it back to the feature number.

Make __raw_xsave_addr() use the feature number as an argument.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-11-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/xstate.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 8bfcc5b9e04b..4f7f3c5d0d0c 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -805,20 +805,18 @@ void fpu__resume_cpu(void)
 }
 
 /*
- * Given an xstate feature mask, calculate where in the xsave
+ * Given an xstate feature nr, calculate where in the xsave
  * buffer the state is.  Callers should ensure that the buffer
  * is valid.
  */
-static void *__raw_xsave_addr(struct xregs_state *xsave, int xstate_feature_mask)
+static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
 {
-	int feature_nr = fls64(xstate_feature_mask) - 1;
-
-	if (!xfeature_enabled(feature_nr)) {
+	if (!xfeature_enabled(xfeature_nr)) {
 		WARN_ON_FPU(1);
 		return NULL;
 	}
 
-	return (void *)xsave + xstate_comp_offsets[feature_nr];
+	return (void *)xsave + xstate_comp_offsets[xfeature_nr];
 }
 /*
  * Given the xsave area and a state inside, this function returns the
@@ -840,6 +838,7 @@ static void *__raw_xsave_addr(struct xregs_state *xsave, int xstate_feature_mask
  */
 void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 {
+	int xfeature_nr;
 	/*
 	 * Do we even *have* xsave state?
 	 */
@@ -867,7 +866,8 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 	if (!(xsave->header.xfeatures & xstate_feature))
 		return NULL;
 
-	return __raw_xsave_addr(xsave, xstate_feature);
+	xfeature_nr = fls64(xstate_feature) - 1;
+	return __raw_xsave_addr(xsave, xfeature_nr);
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
 
@@ -1014,7 +1014,7 @@ int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int of
 		 * Copy only in-use xstates:
 		 */
 		if ((header.xfeatures >> i) & 1) {
-			void *src = __raw_xsave_addr(xsave, 1 << i);
+			void *src = __raw_xsave_addr(xsave, i);
 
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
@@ -1100,7 +1100,7 @@ int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned i
 		 * Copy only in-use xstates:
 		 */
 		if ((header.xfeatures >> i) & 1) {
-			void *src = __raw_xsave_addr(xsave, 1 << i);
+			void *src = __raw_xsave_addr(xsave, i);
 
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
@@ -1157,7 +1157,7 @@ int copy_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf)
 		u64 mask = ((u64)1 << i);
 
 		if (hdr.xfeatures & mask) {
-			void *dst = __raw_xsave_addr(xsave, 1 << i);
+			void *dst = __raw_xsave_addr(xsave, i);
 
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];
@@ -1211,7 +1211,7 @@ int copy_user_to_xstate(struct xregs_state *xsave, const void __user *ubuf)
 		u64 mask = ((u64)1 << i);
 
 		if (hdr.xfeatures & mask) {
-			void *dst = __raw_xsave_addr(xsave, 1 << i);
+			void *dst = __raw_xsave_addr(xsave, i);
 
 			offset = xstate_offsets[i];
 			size = xstate_sizes[i];

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/fpu: Use a feature number instead of mask in two more helpers
  2019-04-03 16:41 ` [PATCH 11/27] x86/fpu: Make get_xsave_field_ptr() and get_xsave_addr() use " Sebastian Andrzej Siewior
@ 2019-04-13 20:53   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: x86, peterz, Siarhei.Liakh, mingo, rkrcmar, luto, bigeasy,
	mhiramat, bp, kvm, riel, hpa, sergey.senozhatsky, dave.hansen,
	dave.hansen, linux-kernel, mingo, Jason, pbonzini, jannh, tglx,
	linux, ebiederm

Commit-ID:  abd16d68d65229e5acafdadc32704239131bf2ea
Gitweb:     https://git.kernel.org/tip/abd16d68d65229e5acafdadc32704239131bf2ea
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:40 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Wed, 10 Apr 2019 18:20:27 +0200

x86/fpu: Use a feature number instead of mask in two more helpers

After changing the argument of __raw_xsave_addr() from a mask to a
number, Dave suggested checking whether it makes sense to do the same
for get_xsave_addr(). As it turns out, it does.

Only get_xsave_addr() needs the mask to check if the requested feature
is part of what is supported/saved and then uses the number again. The
shift operation is cheaper compared to fls64() (find last bit set).
Also, the feature number uses less opcode space compared to the mask. :)
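
The two conversions are cheap to demonstrate in a standalone userspace
sketch (fls64() is modelled with a compiler builtin here, and feature
number 9, PKRU, is only an example):

  #include <stdio.h>
  #include <stdint.h>

  /* Userspace model of the kernel's fls64(): last set bit, 1-based. */
  static int fls64(uint64_t x)
  {
          return x ? 64 - __builtin_clzll(x) : 0;
  }

  int main(void)
  {
          int xfeature_nr = 9;                 /* e.g. the PKRU component */
          uint64_t mask = 1ULL << xfeature_nr; /* number -> mask: one shift */
          int nr = fls64(mask) - 1;            /* mask -> number: fls64() */

          printf("nr=%d mask=%#llx back=%d\n",
                 xfeature_nr, (unsigned long long)mask, nr);
          return 0;
  }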

Make the get_xsave_addr() argument an xfeature number instead of a mask
and fix up its callers.

Furthermore, use xfeature_nr and xfeature_mask consistently.

This results in the following changes to the kvm code:

  feature -> xfeature_mask
  index -> xfeature_nr

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Siarhei Liakh <Siarhei.Liakh@concurrent-rt.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-12-bigeasy@linutronix.de
---
 arch/x86/include/asm/fpu/xstate.h |  4 ++--
 arch/x86/kernel/fpu/xstate.c      | 22 ++++++++++------------
 arch/x86/kernel/traps.c           |  2 +-
 arch/x86/kvm/x86.c                | 28 ++++++++++++++--------------
 arch/x86/mm/mpx.c                 |  6 +++---
 5 files changed, 30 insertions(+), 32 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 48581988d78c..fbe41f808e5d 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -46,8 +46,8 @@ extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
 void fpu__xstate_clear_all_cpu_caps(void);
-void *get_xsave_addr(struct xregs_state *xsave, int xstate);
-const void *get_xsave_field_ptr(int xstate_field);
+void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
+const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
 int copy_xstate_to_user(void __user *ubuf, struct xregs_state *xsave, unsigned int offset, unsigned int size);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 4f7f3c5d0d0c..9c459fd1d38e 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -830,15 +830,14 @@ static void *__raw_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
  *
  * Inputs:
  *	xstate: the thread's storage area for all FPU data
- *	xstate_feature: state which is defined in xsave.h (e.g.
- *	XFEATURE_MASK_FP, XFEATURE_MASK_SSE, etc...)
+ *	xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
+ *	XFEATURE_SSE, etc...)
  * Output:
  *	address of the state in the xsave area, or NULL if the
  *	field is not present in the xsave buffer.
  */
-void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
+void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
 {
-	int xfeature_nr;
 	/*
 	 * Do we even *have* xsave state?
 	 */
@@ -850,11 +849,11 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 	 * have not enabled.  Remember that pcntxt_mask is
 	 * what we write to the XCR0 register.
 	 */
-	WARN_ONCE(!(xfeatures_mask & xstate_feature),
+	WARN_ONCE(!(xfeatures_mask & BIT_ULL(xfeature_nr)),
 		  "get of unsupported state");
 	/*
 	 * This assumes the last 'xsave*' instruction to
-	 * have requested that 'xstate_feature' be saved.
+	 * have requested that 'xfeature_nr' be saved.
 	 * If it did not, we might be seeing and old value
 	 * of the field in the buffer.
 	 *
@@ -863,10 +862,9 @@ void *get_xsave_addr(struct xregs_state *xsave, int xstate_feature)
 	 * or because the "init optimization" caused it
 	 * to not be saved.
 	 */
-	if (!(xsave->header.xfeatures & xstate_feature))
+	if (!(xsave->header.xfeatures & BIT_ULL(xfeature_nr)))
 		return NULL;
 
-	xfeature_nr = fls64(xstate_feature) - 1;
 	return __raw_xsave_addr(xsave, xfeature_nr);
 }
 EXPORT_SYMBOL_GPL(get_xsave_addr);
@@ -882,13 +880,13 @@ EXPORT_SYMBOL_GPL(get_xsave_addr);
  * Note that this only works on the current task.
  *
  * Inputs:
- *	@xsave_state: state which is defined in xsave.h (e.g. XFEATURE_MASK_FP,
- *	XFEATURE_MASK_SSE, etc...)
+ *	@xfeature_nr: state which is defined in xsave.h (e.g. XFEATURE_FP,
+ *	XFEATURE_SSE, etc...)
  * Output:
  *	address of the state in the xsave area or NULL if the state
  *	is not present or is in its 'init state'.
  */
-const void *get_xsave_field_ptr(int xsave_state)
+const void *get_xsave_field_ptr(int xfeature_nr)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
@@ -898,7 +896,7 @@ const void *get_xsave_field_ptr(int xsave_state)
 	 */
 	fpu__save(fpu);
 
-	return get_xsave_addr(&fpu->state.xsave, xsave_state);
+	return get_xsave_addr(&fpu->state.xsave, xfeature_nr);
 }
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index d26f9e9c3d83..8b6d03e55d2f 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -456,7 +456,7 @@ dotraplinkage void do_bounds(struct pt_regs *regs, long error_code)
 	 * which is all zeros which indicates MPX was not
 	 * responsible for the exception.
 	 */
-	bndcsr = get_xsave_field_ptr(XFEATURE_MASK_BNDCSR);
+	bndcsr = get_xsave_field_ptr(XFEATURE_BNDCSR);
 	if (!bndcsr)
 		goto exit_trap;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 099b851dabaf..8022e7769b3a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3674,15 +3674,15 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
-		u64 feature = valid & -valid;
-		int index = fls64(feature) - 1;
-		void *src = get_xsave_addr(xsave, feature);
+		u64 xfeature_mask = valid & -valid;
+		int xfeature_nr = fls64(xfeature_mask) - 1;
+		void *src = get_xsave_addr(xsave, xfeature_nr);
 
 		if (src) {
 			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, index,
+			cpuid_count(XSTATE_CPUID, xfeature_nr,
 				    &size, &offset, &ecx, &edx);
-			if (feature == XFEATURE_MASK_PKRU)
+			if (xfeature_nr == XFEATURE_PKRU)
 				memcpy(dest + offset, &vcpu->arch.pkru,
 				       sizeof(vcpu->arch.pkru));
 			else
@@ -3690,7 +3690,7 @@ static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
 
 		}
 
-		valid -= feature;
+		valid -= xfeature_mask;
 	}
 }
 
@@ -3717,22 +3717,22 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
 	 */
 	valid = xstate_bv & ~XFEATURE_MASK_FPSSE;
 	while (valid) {
-		u64 feature = valid & -valid;
-		int index = fls64(feature) - 1;
-		void *dest = get_xsave_addr(xsave, feature);
+		u64 xfeature_mask = valid & -valid;
+		int xfeature_nr = fls64(xfeature_mask) - 1;
+		void *dest = get_xsave_addr(xsave, xfeature_nr);
 
 		if (dest) {
 			u32 size, offset, ecx, edx;
-			cpuid_count(XSTATE_CPUID, index,
+			cpuid_count(XSTATE_CPUID, xfeature_nr,
 				    &size, &offset, &ecx, &edx);
-			if (feature == XFEATURE_MASK_PKRU)
+			if (xfeature_nr == XFEATURE_PKRU)
 				memcpy(&vcpu->arch.pkru, src + offset,
 				       sizeof(vcpu->arch.pkru));
 			else
 				memcpy(dest, src + offset, size);
 		}
 
-		valid -= feature;
+		valid -= xfeature_mask;
 	}
 }
 
@@ -8850,11 +8850,11 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		if (init_event)
 			kvm_put_guest_fpu(vcpu);
 		mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu->state.xsave,
-					XFEATURE_MASK_BNDREGS);
+					XFEATURE_BNDREGS);
 		if (mpx_state_buffer)
 			memset(mpx_state_buffer, 0, sizeof(struct mpx_bndreg_state));
 		mpx_state_buffer = get_xsave_addr(&vcpu->arch.guest_fpu->state.xsave,
-					XFEATURE_MASK_BNDCSR);
+					XFEATURE_BNDCSR);
 		if (mpx_state_buffer)
 			memset(mpx_state_buffer, 0, sizeof(struct mpx_bndcsr));
 		if (init_event)
diff --git a/arch/x86/mm/mpx.c b/arch/x86/mm/mpx.c
index c805db6236b4..59726aaf4671 100644
--- a/arch/x86/mm/mpx.c
+++ b/arch/x86/mm/mpx.c
@@ -142,7 +142,7 @@ int mpx_fault_info(struct mpx_fault_info *info, struct pt_regs *regs)
 		goto err_out;
 	}
 	/* get bndregs field from current task's xsave area */
-	bndregs = get_xsave_field_ptr(XFEATURE_MASK_BNDREGS);
+	bndregs = get_xsave_field_ptr(XFEATURE_BNDREGS);
 	if (!bndregs) {
 		err = -EINVAL;
 		goto err_out;
@@ -190,7 +190,7 @@ static __user void *mpx_get_bounds_dir(void)
 	 * The bounds directory pointer is stored in a register
 	 * only accessible if we first do an xsave.
 	 */
-	bndcsr = get_xsave_field_ptr(XFEATURE_MASK_BNDCSR);
+	bndcsr = get_xsave_field_ptr(XFEATURE_BNDCSR);
 	if (!bndcsr)
 		return MPX_INVALID_BOUNDS_DIR;
 
@@ -376,7 +376,7 @@ static int do_mpx_bt_fault(void)
 	const struct mpx_bndcsr *bndcsr;
 	struct mm_struct *mm = current->mm;
 
-	bndcsr = get_xsave_field_ptr(XFEATURE_MASK_BNDCSR);
+	bndcsr = get_xsave_field_ptr(XFEATURE_BNDCSR);
 	if (!bndcsr)
 		return -EINVAL;
 	/*

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/pkeys: Provide *pkru() helpers
  2019-04-03 16:41 ` [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions Sebastian Andrzej Siewior
  2019-04-10 16:36   ` Borislav Petkov
@ 2019-04-13 20:54   ` tip-bot for Sebastian Andrzej Siewior
  1 sibling, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:54 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jroedel, jgross, mhocko, bp, rkrcmar, Jason, tglx,
	kirill.shutemov, hpa, bigeasy, luto, dave.hansen, ak, riel, x86,
	mingo, pbonzini, mingo, kvm, linux-kernel

Commit-ID:  c806e88734b9e9aea260bf2261c129aa23968fca
Gitweb:     https://git.kernel.org/tip/c806e88734b9e9aea260bf2261c129aa23968fca
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:41 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 15:40:58 +0200

x86/pkeys: Provide *pkru() helpers

Dave Hansen asked for __read_pkru() and __write_pkru() to be
symmetrical.

As part of the series __write_pkru() will read back the value and only
write it if it is different.

In order to make both functions symmetrical, move the function
containing only the opcode asm into a function named after the
instruction itself.

__write_pkru() will just invoke wrpkru() but in a follow-up patch will
also read back the value.

 [ bp: Convert asm opcode wrapper names to rd/wrpkru(). ]

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-13-bigeasy@linutronix.de
---
 arch/x86/include/asm/pgtable.h       |  2 +-
 arch/x86/include/asm/special_insns.h | 12 +++++++++---
 arch/x86/kvm/vmx/vmx.c               |  2 +-
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2779ace16d23..e8875ca75623 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -127,7 +127,7 @@ static inline int pte_dirty(pte_t pte)
 static inline u32 read_pkru(void)
 {
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		return __read_pkru();
+		return rdpkru();
 	return 0;
 }
 
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 43c029cdc3fe..34897e2b52c9 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -92,7 +92,7 @@ static inline void native_write_cr8(unsigned long val)
 #endif
 
 #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
-static inline u32 __read_pkru(void)
+static inline u32 rdpkru(void)
 {
 	u32 ecx = 0;
 	u32 edx, pkru;
@@ -107,7 +107,7 @@ static inline u32 __read_pkru(void)
 	return pkru;
 }
 
-static inline void __write_pkru(u32 pkru)
+static inline void wrpkru(u32 pkru)
 {
 	u32 ecx = 0, edx = 0;
 
@@ -118,8 +118,14 @@ static inline void __write_pkru(u32 pkru)
 	asm volatile(".byte 0x0f,0x01,0xef\n\t"
 		     : : "a" (pkru), "c"(ecx), "d"(edx));
 }
+
+static inline void __write_pkru(u32 pkru)
+{
+	wrpkru(pkru);
+}
+
 #else
-static inline u32 __read_pkru(void)
+static inline u32 rdpkru(void)
 {
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ab432a930ae8..dd1e1eea4f0b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6501,7 +6501,7 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	if (static_cpu_has(X86_FEATURE_PKU) &&
 	    kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) {
-		vcpu->arch.pkru = __read_pkru();
+		vcpu->arch.pkru = rdpkru();
 		if (vcpu->arch.pkru != vmx->host_pkru)
 			__write_pkru(vmx->host_pkru);
 	}

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/fpu: Only write PKRU if it is different from current
  2019-04-03 16:41 ` [PATCH 13/27] x86/fpu: Only write PKRU if it is different from current Sebastian Andrzej Siewior
@ 2019-04-13 20:55   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, riel, jgross, linux-kernel, bigeasy, tglx, dave.hansen, hpa,
	luto, Jason, pbonzini, mingo, rkrcmar, x86, kvm, mingo

Commit-ID:  577ff465f5a6a5a0866d75a033844810baca20a0
Gitweb:     https://git.kernel.org/tip/577ff465f5a6a5a0866d75a033844810baca20a0
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:42 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 15:41:05 +0200

x86/fpu: Only write PKRU if it is different from current

According to Dave Hansen, WRPKRU is more expensive than RDPKRU. It has
a higher cycle cost and it's also practically a (light) speculation
barrier.

As an optimisation read the current PKRU value and only write the new
one if it is different.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-14-bigeasy@linutronix.de
---
 arch/x86/include/asm/special_insns.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 34897e2b52c9..0a3c4cab39db 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -121,6 +121,13 @@ static inline void wrpkru(u32 pkru)
 
 static inline void __write_pkru(u32 pkru)
 {
+	/*
+	 * WRPKRU is relatively expensive compared to RDPKRU.
+	 * Avoid WRPKRU when it would not change the value.
+	 */
+	if (pkru == rdpkru())
+		return;
+
 	wrpkru(pkru);
 }
 

^ permalink raw reply	[flat|nested] 83+ messages in thread

* [tip:x86/fpu] x86/pkeys: Don't check if PKRU is zero before writing it
  2019-04-03 16:41 ` [PATCH 14/27] x86/pkeys: Don't check if PKRU is zero before writting it Sebastian Andrzej Siewior
@ 2019-04-13 20:55   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:55 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: riel, bigeasy, peterz, linux-kernel, hpa, rkrcmar, luto,
	dave.hansen, x86, mingo, kvm, tglx, mingo, Jason, bp, pbonzini

Commit-ID:  0556cbdc2fbcb3068e5b924a8b3d5386ae0dd27d
Gitweb:     https://git.kernel.org/tip/0556cbdc2fbcb3068e5b924a8b3d5386ae0dd27d
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:43 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 15:41:05 +0200

x86/pkeys: Don't check if PKRU is zero before writing it

__write_pkru() now checks whether the new value is the same as the
current one and skips the write in that case. That check is more general
than only skipping the write when both the current and the new value are
zero, so benefit from it.

Remove the zero check of PKRU; __write_pkru() provides such a check now.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-15-bigeasy@linutronix.de
---
 arch/x86/mm/pkeys.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 05bb9a44eb1c..50f65fc1b9a3 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -142,13 +142,6 @@ u32 init_pkru_value = PKRU_AD_KEY( 1) | PKRU_AD_KEY( 2) | PKRU_AD_KEY( 3) |
 void copy_init_pkru_to_fpregs(void)
 {
 	u32 init_pkru_value_snapshot = READ_ONCE(init_pkru_value);
-	/*
-	 * Any write to PKRU takes it out of the XSAVE 'init
-	 * state' which increases context switch cost.  Avoid
-	 * writing 0 when PKRU was already 0.
-	 */
-	if (!init_pkru_value_snapshot && !read_pkru())
-		return;
 	/*
 	 * Override the PKRU state that came from 'init_fpstate'
 	 * with the baseline from the process.


* [tip:x86/fpu] x86/fpu: Eager switch PKRU state
  2019-04-03 16:41 ` [PATCH 15/27] x86/fpu: Eager switch PKRU state Sebastian Andrzej Siewior
@ 2019-04-13 20:56   ` tip-bot for Rik van Riel
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Rik van Riel @ 2019-04-13 20:56 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: kirill.shutemov, kvm, bigeasy, x86, jgross, luto, rkrcmar, mingo,
	jroedel, Jason, bp, aubrey.li, jannh, mingo, tglx, dave.hansen,
	hpa, peterz, mhocko, pbonzini, ak, linux-kernel, riel

Commit-ID:  0cecca9d03c964abbd2b7927d0670eb70db4ebf2
Gitweb:     https://git.kernel.org/tip/0cecca9d03c964abbd2b7927d0670eb70db4ebf2
Author:     Rik van Riel <riel@surriel.com>
AuthorDate: Wed, 3 Apr 2019 18:41:44 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 15:57:10 +0200

x86/fpu: Eager switch PKRU state

While most of a task's FPU state is only needed in user space, the
protection keys need to be in place immediately after a context switch.

The reason is that any access to userspace memory while running in
kernel mode also needs to abide by the memory permissions specified in
the protection keys.
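
In practice this means that even a plain read() into a pkey-protected
buffer must fail with EFAULT, because the kernel's copy_to_user() has to
honour PKRU. A minimal user-space sketch (illustrative, not part of this
series; assumes a pkeys-capable CPU and glibc's pkey wrappers):

    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
            /* Illustrative only: error handling omitted, pkeys support
             * and glibc >= 2.27 assumed. */
            char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                             MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
            int fd = open("/dev/zero", O_RDONLY);
            int pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS);

            pkey_mprotect(buf, 4096, PROT_READ | PROT_WRITE, pkey);

            /* copy_to_user() in the read() path runs in kernel mode but
             * must still honour PKRU: */
            if (read(fd, buf, 16) < 0 && errno == EFAULT)
                    printf("read() into pkey-protected buffer: EFAULT\n");
            return 0;
    }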

The "eager switch" is a preparation for loading the FPU state on return
to userland. Instead of decoupling PKRU state from xstate, update PKRU
within xstate on write operations by the kernel.

For user tasks, PKRU is always read from the xsave area; this should not
change anything because the PKRU value was loaded as part of the FPU
restore.

For kernel threads the default "init_pkru_value" will be written. Before
this commit, the kernel thread would end up with a random value which it
inherited from the previous user task.

 [ bigeasy: save pkru to xstate, no cache, don't use __raw_xsave_addr() ]

 [ bp: update commit message, sort headers properly in asm/fpu/xstate.h ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-16-bigeasy@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 24 ++++++++++++++++++++++--
 arch/x86/include/asm/fpu/xstate.h   |  4 +++-
 arch/x86/include/asm/pgtable.h      |  6 ++++++
 arch/x86/mm/pkeys.c                 |  1 -
 4 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 3e0c2c496f2d..6eb4a0b1ad0e 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -14,6 +14,7 @@
 #include <linux/compat.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 
 #include <asm/user.h>
 #include <asm/fpu/api.h>
@@ -534,8 +535,27 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 {
-	if (static_cpu_has(X86_FEATURE_FPU))
-		__fpregs_load_activate(new_fpu, cpu);
+	u32 pkru_val = init_pkru_value;
+	struct pkru_state *pk;
+
+	if (!static_cpu_has(X86_FEATURE_FPU))
+		return;
+
+	__fpregs_load_activate(new_fpu, cpu);
+
+	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
+		return;
+
+	/*
+	 * PKRU state is switched eagerly because it needs to be valid before we
+	 * return to userland e.g. for a copy_to_user() operation.
+	 */
+	if (current->mm) {
+		pk = get_xsave_addr(&new_fpu->state.xsave, XFEATURE_PKRU);
+		if (pk)
+			pkru_val = pk->pkru;
+	}
+	__write_pkru(pkru_val);
 }
 
 /*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index fbe41f808e5d..7e42b285c856 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -2,9 +2,11 @@
 #ifndef __ASM_X86_XSAVE_H
 #define __ASM_X86_XSAVE_H
 
+#include <linux/uaccess.h>
 #include <linux/types.h>
+
 #include <asm/processor.h>
-#include <linux/uaccess.h>
+#include <asm/user.h>
 
 /* Bit 63 of XCR0 is reserved for future expansion */
 #define XFEATURE_MASK_EXTEND	(~(XFEATURE_MASK_FPSSE | (1ULL << 63)))
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index e8875ca75623..9beb371b1adf 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1355,6 +1355,12 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 #define PKRU_WD_BIT 0x2
 #define PKRU_BITS_PER_PKEY 2
 
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+extern u32 init_pkru_value;
+#else
+#define init_pkru_value	0
+#endif
+
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
 {
 	int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 50f65fc1b9a3..2ecbf4155f98 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -126,7 +126,6 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot, int pkey
  * in the process's lifetime will not accidentally get access
  * to data which is pkey-protected later on.
  */
-static
 u32 init_pkru_value = PKRU_AD_KEY( 1) | PKRU_AD_KEY( 2) | PKRU_AD_KEY( 3) |
 		      PKRU_AD_KEY( 4) | PKRU_AD_KEY( 5) | PKRU_AD_KEY( 6) |
 		      PKRU_AD_KEY( 7) | PKRU_AD_KEY( 8) | PKRU_AD_KEY( 9) |


* [tip:x86/fpu] x86/entry: Add TIF_NEED_FPU_LOAD
  2019-04-03 16:41 ` [PATCH 16/27] x86/entry: Add TIF_NEED_FPU_LOAD Sebastian Andrzej Siewior
@ 2019-04-13 20:57   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: pbonzini, Jason, linux-kernel, mingo, rkrcmar, tim.c.chen,
	aubrey.li, luto, riel, tglx, hpa, dave.hansen, jannh, x86, kvm,
	konrad.wilk, bigeasy, bp, mingo

Commit-ID:  383c252545edcc708128e2028a2318b05c45ede4
Gitweb:     https://git.kernel.org/tip/383c252545edcc708128e2028a2318b05c45ede4
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:45 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 16:21:51 +0200

x86/entry: Add TIF_NEED_FPU_LOAD

Add TIF_NEED_FPU_LOAD. This flag is used for loading the FPU registers
before returning to userland. It must not be set on systems without an
FPU.

If this flag is cleared, the CPU's FPU registers hold the latest,
up-to-date content of the current task's (current()) FPU registers.
The in-memory copy (union fpregs_state) is not valid.

If this flag is set, then the CPU's FPU registers may hold random
values (except for PKRU) and the content of the FPU registers must be
loaded again on return to userland.

Introduce it now as a preparatory change before adding the main feature.
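
As a sketch of the invariant this flag encodes (using only helpers that
exist in this series), code that wants a valid in-memory copy of the
registers first creates one under fpregs_lock(); this is the same
pattern a later patch applies in copy_fpstate_to_sigframe():

    fpregs_lock();
    if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
            /* The CPU registers are the authoritative copy: save them. */
            copy_fpregs_to_fpstate(&current->thread.fpu);
            set_thread_flag(TIF_NEED_FPU_LOAD);
    }
    /* fpu->state is now valid and may be read safely. */
    fpregs_unlock();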

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-17-bigeasy@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 8 ++++++++
 arch/x86/include/asm/thread_info.h  | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 6eb4a0b1ad0e..da75d7b3e37d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -508,6 +508,14 @@ static inline void __fpregs_load_activate(struct fpu *fpu, int cpu)
  *  - switch_fpu_finish() restores the new state as
  *    necessary.
  *
+ * If TIF_NEED_FPU_LOAD is cleared then the CPU's FPU registers
+ * are saved in the current thread's FPU register state.
+ *
+ * If TIF_NEED_FPU_LOAD is set then CPU's FPU registers may not
+ * hold current()'s FPU registers. It is required to load the
+ * registers before returning to userland or using the content
+ * otherwise.
+ *
  * The FPU context is only stored/restored for a user task and
  * ->mm is used to distinguish between kernel and user threads.
  */
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index e0eccbcb8447..f9453536f9bb 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -88,6 +88,7 @@ struct thread_info {
 #define TIF_USER_RETURN_NOTIFY	11	/* notify kernel of userspace return */
 #define TIF_UPROBE		12	/* breakpointed or singlestepping */
 #define TIF_PATCH_PENDING	13	/* pending live patching update */
+#define TIF_NEED_FPU_LOAD	14	/* load FPU on return to userspace */
 #define TIF_NOCPUID		15	/* CPUID is not accessible in userland */
 #define TIF_NOTSC		16	/* TSC is not accessible in userland */
 #define TIF_IA32		17	/* IA32 compatibility process */
@@ -117,6 +118,7 @@ struct thread_info {
 #define _TIF_USER_RETURN_NOTIFY	(1 << TIF_USER_RETURN_NOTIFY)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
 #define _TIF_PATCH_PENDING	(1 << TIF_PATCH_PENDING)
+#define _TIF_NEED_FPU_LOAD	(1 << TIF_NEED_FPU_LOAD)
 #define _TIF_NOCPUID		(1 << TIF_NOCPUID)
 #define _TIF_NOTSC		(1 << TIF_NOTSC)
 #define _TIF_IA32		(1 << TIF_IA32)


* [tip:x86/fpu] x86/fpu: Always store the registers in copy_fpstate_to_sigframe()
  2019-04-03 16:41 ` [PATCH 17/27] x86/fpu: Always store the registers in copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
@ 2019-04-13 20:57   ` tip-bot for Rik van Riel
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Rik van Riel @ 2019-04-13 20:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jannh, riel, dave.hansen, x86, hpa, kvm, pbonzini, bigeasy,
	linux-kernel, mingo, rkrcmar, mingo, bp, Jason, luto, tglx

Commit-ID:  69277c98f5eef0d9839699b7825923c3985f665f
Gitweb:     https://git.kernel.org/tip/69277c98f5eef0d9839699b7825923c3985f665f
Author:     Rik van Riel <riel@surriel.com>
AuthorDate: Wed, 3 Apr 2019 18:41:46 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 18:08:57 +0200

x86/fpu: Always store the registers in copy_fpstate_to_sigframe()

copy_fpstate_to_sigframe() stores the registers directly to user space.
This is okay because the FPU registers are valid and saving them
directly avoids saving them into kernel memory and making a copy.

However, this cannot be done anymore if the FPU registers are going
to be restored on return to userland: the FPU registers could be
invalidated in the middle of the save operation, so the save must be
done with preemption / BH disabled.

Save the FPU registers to the task's FPU struct and copy them to the
user memory later on.

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-18-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 155f4552413e..8f23f5237218 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -144,8 +144,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  *	buf == buf_fx for 64-bit frames and 32-bit fsave frame.
  *	buf != buf_fx for 32-bit frames with fxstate.
  *
- * Save the state directly to the user frame pointed by the aligned pointer
- * 'buf_fx'.
+ * Save the state to task's fpu->state and then copy it to the user frame
+ * pointed to by the aligned pointer 'buf_fx'.
  *
  * If this is a 32-bit frame with fxstate, put a fsave header before
  * the aligned state at 'buf_fx'.
@@ -155,6 +155,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  */
 int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 {
+	struct fpu *fpu = &current->thread.fpu;
+	struct xregs_state *xsave = &fpu->state.xsave;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
 
@@ -169,9 +171,16 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	/* Save the live registers state to the user frame directly. */
-	if (copy_fpregs_to_sigframe(buf_fx))
-		return -1;
+	copy_fpregs_to_fpstate(fpu);
+
+	if (using_compacted_format()) {
+		if (copy_xstate_to_user(buf_fx, xsave, 0, size))
+			return -1;
+	} else {
+		fpstate_sanitize_xstate(fpu);
+		if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
+			return -1;
+	}
 
 	/* Save the fsave header for the 32-bit frames. */
 	if ((ia32_fxstate || !use_fxsr()) && save_fsave_header(tsk, buf))


* [tip:x86/fpu] x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD
  2019-04-03 16:41 ` [PATCH 18/27] x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD Sebastian Andrzej Siewior
@ 2019-04-13 20:58   ` tip-bot for Rik van Riel
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Rik van Riel @ 2019-04-13 20:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bigeasy, hpa, dave.hansen, tglx, kvm, bp, rkrcmar, pbonzini,
	luto, mingo, mingo, x86, linux-kernel, Jason, riel, jannh

Commit-ID:  a352a3b7b7920212ee4c45a41500c66826318e92
Gitweb:     https://git.kernel.org/tip/a352a3b7b7920212ee4c45a41500c66826318e92
Author:     Rik van Riel <riel@surriel.com>
AuthorDate: Wed, 3 Apr 2019 18:41:47 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 18:20:04 +0200

x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD

The FPU registers need to be saved only if TIF_NEED_FPU_LOAD is not set.
Otherwise this has already been done and the save can be skipped.

 [ bp: Massage a bit. ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-19-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 8f23f5237218..9b9dfdc96285 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -171,7 +171,17 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	copy_fpregs_to_fpstate(fpu);
+	/*
+	 * If we do not need to load the FPU registers at return to userspace
+	 * then the CPU has the current state and we need to save it. Otherwise,
+	 * it has already been done and we can skip it.
+	 */
+	fpregs_lock();
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+		copy_fpregs_to_fpstate(fpu);
+		set_thread_flag(TIF_NEED_FPU_LOAD);
+	}
+	fpregs_unlock();
 
 	if (using_compacted_format()) {
 		if (copy_xstate_to_user(buf_fx, xsave, 0, size))


* [tip:x86/fpu] x86/fpu: Update xstate's PKRU value on write_pkru()
  2019-04-03 16:41 ` [PATCH 19/27] x86/fpu: Update xstate's PKRU value on write_pkru() Sebastian Andrzej Siewior
  2019-04-08 18:14   ` Dave Hansen
@ 2019-04-13 20:59   ` tip-bot for Sebastian Andrzej Siewior
  1 sibling, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 20:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: riel, x86, jgross, linux-kernel, bp, mhocko, ak, dave.hansen,
	rkrcmar, Jason, tglx, mingo, pbonzini, luto, bigeasy, mingo, kvm,
	jroedel, kirill.shutemov, hpa

Commit-ID:  0d714dba162620fd8b9f5b3104a487e041353c4d
Gitweb:     https://git.kernel.org/tip/0d714dba162620fd8b9f5b3104a487e041353c4d
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:48 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 20:33:29 +0200

x86/fpu: Update xstate's PKRU value on write_pkru()

During the context switch the xstate is loaded, which also includes the
PKRU value.

If the xstate is restored on return to userland, the PKRU value in the
xstate must match the one in the CPU.

Save the PKRU value in the xstate when it is modified.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-20-bigeasy@linutronix.de
---
 arch/x86/include/asm/pgtable.h | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 9beb371b1adf..5cfbbb6d458d 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -23,6 +23,8 @@
 
 #ifndef __ASSEMBLY__
 #include <asm/x86_init.h>
+#include <asm/fpu/xstate.h>
+#include <asm/fpu/api.h>
 
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -133,8 +135,23 @@ static inline u32 read_pkru(void)
 
 static inline void write_pkru(u32 pkru)
 {
-	if (boot_cpu_has(X86_FEATURE_OSPKE))
-		__write_pkru(pkru);
+	struct pkru_state *pk;
+
+	if (!boot_cpu_has(X86_FEATURE_OSPKE))
+		return;
+
+	pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
+
+	/*
+	 * The PKRU value in xstate needs to be in sync with the value that is
+	 * written to the CPU. The FPU restore on return to userland would
+	 * otherwise load the previous value again.
+	 */
+	fpregs_lock();
+	if (pk)
+		pk->pkru = pkru;
+	__write_pkru(pkru);
+	fpregs_unlock();
 }
 
 static inline int pte_young(pte_t pte)


* [tip:x86/fpu] x86/fpu: Inline copy_user_to_fpregs_zeroing()
  2019-04-03 16:41 ` [PATCH 20/27] x86/fpu: Inline copy_user_to_fpregs_zeroing() Sebastian Andrzej Siewior
@ 2019-04-13 21:00   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 21:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, pbonzini, Jason, luto, mingo, x86, dave.hansen, tglx, mingo,
	kvm, linux-kernel, riel, jannh, bp, rkrcmar, bigeasy

Commit-ID:  e0d3602f933367881bddfff310a744e6e61c284c
Gitweb:     https://git.kernel.org/tip/e0d3602f933367881bddfff310a744e6e61c284c
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:49 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Thu, 11 Apr 2019 20:45:20 +0200

x86/fpu: Inline copy_user_to_fpregs_zeroing()

Start refactoring __fpu__restore_sig() by inlining
copy_user_to_fpregs_zeroing(). The original function remains and will be
used to restore from userland memory if possible.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-21-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 9b9dfdc96285..c2ff43fbbd07 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -337,11 +337,29 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		kfree(tmp);
 		return err;
 	} else {
+		int ret;
+
 		/*
 		 * For 64-bit frames and 32-bit fsave frames, restore the user
 		 * state to the registers directly (with exceptions handled).
 		 */
-		if (copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only)) {
+		if (use_xsave()) {
+			if ((unsigned long)buf_fx % 64 || fx_only) {
+				u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
+				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				ret = copy_user_to_fxregs(buf_fx);
+			} else {
+				u64 init_bv = xfeatures_mask & ~xfeatures;
+				if (unlikely(init_bv))
+					copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+				ret = copy_user_to_xregs(buf_fx, xfeatures);
+			}
+		} else if (use_fxsr()) {
+			ret = copy_user_to_fxregs(buf_fx);
+		} else
+			ret = copy_user_to_fregs(buf_fx);
+
+		if (ret) {
 			fpu__clear(fpu);
 			return -1;
 		}


* [tip:x86/fpu] x86/fpu: Restore from kernel memory on the 64-bit path too
  2019-04-03 16:41 ` [PATCH 21/27] x86/fpu: Let __fpu__restore_sig() restore the !32bit+fxsr frame from kernel memory Sebastian Andrzej Siewior
@ 2019-04-13 21:00   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 21:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dave.hansen, luto, x86, jannh, tglx, hpa, mingo, Jason,
	aubrey.li, kvm, riel, linux-kernel, bigeasy, bp, mingo, rkrcmar,
	pbonzini

Commit-ID:  926b21f37b072ae4c117052de45a975c6d468fec
Gitweb:     https://git.kernel.org/tip/926b21f37b072ae4c117052de45a975c6d468fec
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:50 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 12 Apr 2019 15:02:41 +0200

x86/fpu: Restore from kernel memory on the 64-bit path too

The 64-bit case (both 64-bit and 32-bit frames) loads the new state from
user memory.

However, doing this is not desired if the FPU state is going to be
restored on return to userland: it would be required to disable
preemption in order to avoid a context switch which would set
TIF_NEED_FPU_LOAD. If this happens before the restore operation then the
loaded registers would become volatile.

Furthermore, disabling preemption while accessing user memory requires
disabling the pagefault handler. An error during FXRSTOR would then
mean either that a page fault occurred (and it would have to be retried
with the pagefault handler enabled) or that a #GP occurred because the
xstate is bogus (after all, the signal handler can modify it).
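
The "signal handler can modify it" part is easy to trigger from user
space. A hedged sketch (illustrative only; the x86-64 glibc ucontext
layout is an assumption):

    #include <signal.h>
    #include <string.h>
    #include <ucontext.h>

    static void handler(int sig, siginfo_t *si, void *ctx)
    {
            ucontext_t *uc = ctx;

            /* The FPU save area lives on the user signal frame, so the
             * handler may scribble over it; sigreturn then has to restore
             * bogus state. (Layout assumption: glibc on x86-64.) */
            memset(uc->uc_mcontext.fpregs, 0xff, 64);
    }

    int main(void)
    {
            struct sigaction sa = { .sa_sigaction = handler,
                                    .sa_flags = SA_SIGINFO };

            sigaction(SIGUSR1, &sa, NULL);
            raise(SIGUSR1); /* restoring the corrupted state now fails */
            return 0;
    }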

In order to avoid that mess, copy the FPU state from userland, validate
it and then load it. The copy_kernel_…() helpers are basically just
like the old helpers except that they operate on kernel memory and the
fault handler just sets the error value and the caller handles it.

copy_user_to_fpregs_zeroing() and its helpers remain and will be used
later for a fastpath optimisation.

 [ bp: Clarify commit message. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-22-bigeasy@linutronix.de
---
 arch/x86/include/asm/fpu/internal.h | 43 +++++++++++++++++++++++++
 arch/x86/kernel/fpu/signal.c        | 62 +++++++++++++++++++++++++++++--------
 2 files changed, 92 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index da75d7b3e37d..2cf04fbcba5d 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -121,6 +121,21 @@ extern void fpstate_sanitize_xstate(struct fpu *fpu);
 	err;								\
 })
 
+#define kernel_insn_err(insn, output, input...)				\
+({									\
+	int err;							\
+	asm volatile("1:" #insn "\n\t"					\
+		     "2:\n"						\
+		     ".section .fixup,\"ax\"\n"				\
+		     "3:  movl $-1,%[err]\n"				\
+		     "    jmp  2b\n"					\
+		     ".previous\n"					\
+		     _ASM_EXTABLE(1b, 3b)				\
+		     : [err] "=r" (err), output				\
+		     : "0"(0), input);					\
+	err;								\
+})
+
 #define kernel_insn(insn, output, input...)				\
 	asm volatile("1:" #insn "\n\t"					\
 		     "2:\n"						\
@@ -149,6 +164,14 @@ static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
 		kernel_insn(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
+static inline int copy_kernel_to_fxregs_err(struct fxregs_state *fx)
+{
+	if (IS_ENABLED(CONFIG_X86_32))
+		return kernel_insn_err(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
+	else
+		return kernel_insn_err(fxrstorq %[fx], "=m" (*fx), [fx] "m" (*fx));
+}
+
 static inline int copy_user_to_fxregs(struct fxregs_state __user *fx)
 {
 	if (IS_ENABLED(CONFIG_X86_32))
@@ -162,6 +185,11 @@ static inline void copy_kernel_to_fregs(struct fregs_state *fx)
 	kernel_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
 }
 
+static inline int copy_kernel_to_fregs_err(struct fregs_state *fx)
+{
+	return kernel_insn_err(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
+}
+
 static inline int copy_user_to_fregs(struct fregs_state __user *fx)
 {
 	return user_insn(frstor %[fx], "=m" (*fx), [fx] "m" (*fx));
@@ -361,6 +389,21 @@ static inline int copy_user_to_xregs(struct xregs_state __user *buf, u64 mask)
 	return err;
 }
 
+/*
+ * Restore xstate from kernel space xsave area, return an error code instead of
+ * an exception.
+ */
+static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
+{
+	u32 lmask = mask;
+	u32 hmask = mask >> 32;
+	int err;
+
+	XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
+
+	return err;
+}
+
 /*
  * These must be called with preempt disabled. Returns
  * 'true' if the FPU state is still intact and we can
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index c2ff43fbbd07..9ea1eaa4c9b1 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -234,7 +234,8 @@ sanitize_restored_xstate(union fpregs_state *state,
 		 */
 		xsave->i387.mxcsr &= mxcsr_feature_mask;
 
-		convert_to_fxsr(&state->fxsave, ia32_env);
+		if (ia32_env)
+			convert_to_fxsr(&state->fxsave, ia32_env);
 	}
 }
 
@@ -337,28 +338,63 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		kfree(tmp);
 		return err;
 	} else {
+		union fpregs_state *state;
+		void *tmp;
 		int ret;
 
+		tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
+		if (!tmp)
+			return -ENOMEM;
+		state = PTR_ALIGN(tmp, 64);
+
 		/*
 		 * For 64-bit frames and 32-bit fsave frames, restore the user
 		 * state to the registers directly (with exceptions handled).
 		 */
-		if (use_xsave()) {
-			if ((unsigned long)buf_fx % 64 || fx_only) {
-				u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-				ret = copy_user_to_fxregs(buf_fx);
+		if ((unsigned long)buf_fx % 64)
+			fx_only = 1;
+
+		if (use_xsave() && !fx_only) {
+			u64 init_bv = xfeatures_mask & ~xfeatures;
+
+			if (using_compacted_format()) {
+				ret = copy_user_to_xstate(&state->xsave, buf_fx);
 			} else {
-				u64 init_bv = xfeatures_mask & ~xfeatures;
-				if (unlikely(init_bv))
-					copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-				ret = copy_user_to_xregs(buf_fx, xfeatures);
+				ret = __copy_from_user(&state->xsave, buf_fx, state_size);
+
+				if (!ret && state_size > offsetof(struct xregs_state, header))
+					ret = validate_xstate_header(&state->xsave.header);
 			}
+			if (ret)
+				goto err_out;
+
+			sanitize_restored_xstate(state, NULL, xfeatures, fx_only);
+
+			if (unlikely(init_bv))
+				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			ret = copy_kernel_to_xregs_err(&state->xsave, xfeatures);
+
 		} else if (use_fxsr()) {
-			ret = copy_user_to_fxregs(buf_fx);
-		} else
-			ret = copy_user_to_fregs(buf_fx);
+			ret = __copy_from_user(&state->fxsave, buf_fx, state_size);
+			if (ret)
+				goto err_out;
 
+			if (use_xsave()) {
+				u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
+				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+			}
+			state->fxsave.mxcsr &= mxcsr_feature_mask;
+
+			ret = copy_kernel_to_fxregs_err(&state->fxsave);
+		} else {
+			ret = __copy_from_user(&state->fsave, buf_fx, state_size);
+			if (ret)
+				goto err_out;
+			ret = copy_kernel_to_fregs_err(&state->fsave);
+		}
+
+err_out:
+		kfree(tmp);
 		if (ret) {
 			fpu__clear(fpu);
 			return -1;


* [tip:x86/fpu] x86/fpu: Merge the two code paths in __fpu__restore_sig()
  2019-04-03 16:41 ` [PATCH 22/27] x86/fpu: Merge the two code paths in __fpu__restore_sig() Sebastian Andrzej Siewior
@ 2019-04-13 21:01   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 21:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: rkrcmar, hpa, mingo, bigeasy, tglx, dave.hansen, bp, pbonzini,
	luto, jannh, linux-kernel, kvm, x86, riel, Jason, mingo

Commit-ID:  c2ff9e9a3d9d6c019394a22989a228d02970a8b1
Gitweb:     https://git.kernel.org/tip/c2ff9e9a3d9d6c019394a22989a228d02970a8b1
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:51 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 12 Apr 2019 15:41:25 +0200

x86/fpu: Merge the two code paths in __fpu__restore_sig()

The ia32_fxstate case (32bit with fxsr) and the other case (64bit frames
or 32bit frames without fxsr) both restore from kernel memory and
sanitize the content.

The !ia32_fxstate version restores missing xstates from the "init state"
while the ia32_fxstate version doesn't and skips that step.

Merge the two code paths and keep the !ia32_fxstate one. Copy only the
user_i387_ia32_struct data structure in the ia32_fxstate case.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-23-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 139 +++++++++++++++++--------------------------
 1 file changed, 54 insertions(+), 85 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 9ea1eaa4c9b1..b13e86b29426 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -263,12 +263,17 @@ static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_
 
 static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 {
+	struct user_i387_ia32_struct *envp = NULL;
+	int state_size = fpu_kernel_xstate_size;
 	int ia32_fxstate = (buf != buf_fx);
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
-	int state_size = fpu_kernel_xstate_size;
+	struct user_i387_ia32_struct env;
+	union fpregs_state *state;
 	u64 xfeatures = 0;
 	int fx_only = 0;
+	int ret = 0;
+	void *tmp;
 
 	ia32_fxstate &= (IS_ENABLED(CONFIG_X86_32) ||
 			 IS_ENABLED(CONFIG_IA32_EMULATION));
@@ -303,105 +308,69 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		}
 	}
 
+	tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
+	if (!tmp)
+		return -ENOMEM;
+	state = PTR_ALIGN(tmp, 64);
+
+	if ((unsigned long)buf_fx % 64)
+		fx_only = 1;
+
+	/*
+	 * For 32-bit frames with fxstate, copy the fxstate so it can be
+	 * reconstructed later.
+	 */
 	if (ia32_fxstate) {
-		/*
-		 * For 32-bit frames with fxstate, copy the user state to the
-		 * thread's fpu state, reconstruct fxstate from the fsave
-		 * header. Validate and sanitize the copied state.
-		 */
-		struct user_i387_ia32_struct env;
-		union fpregs_state *state;
-		int err = 0;
-		void *tmp;
+		ret = __copy_from_user(&env, buf, sizeof(env));
+		if (ret)
+			goto err_out;
+		envp = &env;
+	}
 
-		tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
-		if (!tmp)
-			return -ENOMEM;
-		state = PTR_ALIGN(tmp, 64);
+	if (use_xsave() && !fx_only) {
+		u64 init_bv = xfeatures_mask & ~xfeatures;
 
 		if (using_compacted_format()) {
-			err = copy_user_to_xstate(&state->xsave, buf_fx);
+			ret = copy_user_to_xstate(&state->xsave, buf_fx);
 		} else {
-			err = __copy_from_user(&state->xsave, buf_fx, state_size);
+			ret = __copy_from_user(&state->xsave, buf_fx, state_size);
 
-			if (!err && state_size > offsetof(struct xregs_state, header))
-				err = validate_xstate_header(&state->xsave.header);
+			if (!ret && state_size > offsetof(struct xregs_state, header))
+				ret = validate_xstate_header(&state->xsave.header);
 		}
+		if (ret)
+			goto err_out;
 
-		if (err || __copy_from_user(&env, buf, sizeof(env))) {
-			err = -1;
-		} else {
-			sanitize_restored_xstate(state, &env, xfeatures, fx_only);
-			copy_kernel_to_fpregs(state);
-		}
-
-		kfree(tmp);
-		return err;
-	} else {
-		union fpregs_state *state;
-		void *tmp;
-		int ret;
-
-		tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
-		if (!tmp)
-			return -ENOMEM;
-		state = PTR_ALIGN(tmp, 64);
-
-		/*
-		 * For 64-bit frames and 32-bit fsave frames, restore the user
-		 * state to the registers directly (with exceptions handled).
-		 */
-		if ((unsigned long)buf_fx % 64)
-			fx_only = 1;
-
-		if (use_xsave() && !fx_only) {
-			u64 init_bv = xfeatures_mask & ~xfeatures;
-
-			if (using_compacted_format()) {
-				ret = copy_user_to_xstate(&state->xsave, buf_fx);
-			} else {
-				ret = __copy_from_user(&state->xsave, buf_fx, state_size);
-
-				if (!ret && state_size > offsetof(struct xregs_state, header))
-					ret = validate_xstate_header(&state->xsave.header);
-			}
-			if (ret)
-				goto err_out;
-
-			sanitize_restored_xstate(state, NULL, xfeatures, fx_only);
-
-			if (unlikely(init_bv))
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-			ret = copy_kernel_to_xregs_err(&state->xsave, xfeatures);
+		sanitize_restored_xstate(state, envp, xfeatures, fx_only);
 
-		} else if (use_fxsr()) {
-			ret = __copy_from_user(&state->fxsave, buf_fx, state_size);
-			if (ret)
-				goto err_out;
+		if (unlikely(init_bv))
+			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
+		ret = copy_kernel_to_xregs_err(&state->xsave, xfeatures);
 
-			if (use_xsave()) {
-				u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
-				copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-			}
-			state->fxsave.mxcsr &= mxcsr_feature_mask;
+	} else if (use_fxsr()) {
+		ret = __copy_from_user(&state->fxsave, buf_fx, state_size);
+		if (ret)
+			goto err_out;
 
-			ret = copy_kernel_to_fxregs_err(&state->fxsave);
-		} else {
-			ret = __copy_from_user(&state->fsave, buf_fx, state_size);
-			if (ret)
-				goto err_out;
-			ret = copy_kernel_to_fregs_err(&state->fsave);
+		sanitize_restored_xstate(state, envp, xfeatures, fx_only);
+		if (use_xsave()) {
+			u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
+			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
 		}
 
-err_out:
-		kfree(tmp);
-		if (ret) {
-			fpu__clear(fpu);
-			return -1;
-		}
+		ret = copy_kernel_to_fxregs_err(&state->fxsave);
+	} else {
+		ret = __copy_from_user(&state->fsave, buf_fx, state_size);
+		if (ret)
+			goto err_out;
+		ret = copy_kernel_to_fregs_err(&state->fsave);
 	}
 
-	return 0;
+err_out:
+	kfree(tmp);
+	if (ret)
+		fpu__clear(fpu);
+	return ret;
 }
 
 static inline int xstate_sigframe_size(void)


* [tip:x86/fpu] x86/fpu: Defer FPU state load until return to userspace
  2019-04-03 16:41 ` [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace Sebastian Andrzej Siewior
  2019-04-12 14:36   ` Borislav Petkov
@ 2019-04-13 21:02   ` tip-bot for Rik van Riel
  1 sibling, 0 replies; 83+ messages in thread
From: tip-bot for Rik van Riel @ 2019-04-13 21:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, hpa, Jason, mingo, longman, kvm, rkrcmar, bigeasy,
	nstange, bp, chang.seok.bae, linux-kernel, jroedel, wang.yi59,
	pbonzini, riel, jannh, x86, tglx, konrad.wilk, dima, aubrey.li,
	dave.hansen, tim.c.chen, Babu.Moger, luto

Commit-ID:  5f409e20b794565e2d60ad333e79334630a6c798
Gitweb:     https://git.kernel.org/tip/5f409e20b794565e2d60ad333e79334630a6c798
Author:     Rik van Riel <riel@surriel.com>
AuthorDate: Wed, 3 Apr 2019 18:41:52 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 12 Apr 2019 19:34:47 +0200

x86/fpu: Defer FPU state load until return to userspace

Defer loading of FPU state until return to userspace. This gives
the kernel the potential to skip loading FPU state for tasks that
stay in kernel mode, or for tasks that end up with repeated
invocations of kernel_fpu_begin() & kernel_fpu_end().
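
For example (a hedged sketch, not code from this patch), an in-kernel
user doing repeated SIMD work now pays for the state save only once:

    kernel_fpu_begin();     /* first call: saves the user FPU state and
                             * sets TIF_NEED_FPU_LOAD */
    /* ... use vector registers ... */
    kernel_fpu_end();

    kernel_fpu_begin();     /* later call: TIF_NEED_FPU_LOAD is already
                             * set, nothing to save */
    /* ... more vector work ... */
    kernel_fpu_end();       /* no restore here; the user state is loaded
                             * lazily on return to userspace */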

The fpregs_lock/unlock() section ensures that the registers remain
unchanged. Otherwise a context switch or a bottom half could save the
registers to its FPU context and the processor's FPU registers would
become random if modified at the same time.

KVM swaps the host/guest registers on the entry/exit path. This flow has
been kept as is. First it ensures that the registers are loaded and then
saves the current (host) state before it loads the guest's registers. The
swap is done at the very end with interrupts disabled so it should not
change anymore before the guest is entered. The read/save version seems
to be cheaper compared to memcpy() in a micro benchmark.
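
A hedged sketch of the guest-entry side of that flow (host_fpu and
guest_fpu are illustrative stand-ins for KVM's actual state pointers):

    fpregs_lock();
    if (test_thread_flag(TIF_NEED_FPU_LOAD))
            switch_fpu_return();              /* ensure registers are loaded */
    copy_fpregs_to_fpstate(host_fpu);         /* save the host state */
    copy_kernel_to_fpregs(&guest_fpu->state); /* load the guest registers */
    fpregs_mark_activate();
    fpregs_unlock();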

Each thread gets TIF_NEED_FPU_LOAD set as part of fork() / fpu__copy().
For kernel threads, this flag never gets cleared, which avoids saving /
restoring the FPU state for kernel threads and during in-kernel usage of
the FPU registers.

 [
   bp: Correct and update commit message and fix checkpatch warnings.
   s/register/registers/ where it is used in plural.
   minor comment corrections.
   remove unused trace_x86_fpu_activate_state() TP.
 ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Waiman Long <longman@redhat.com>
Cc: x86-ml <x86@kernel.org>
Cc: Yi Wang <wang.yi59@zte.com.cn>
Link: https://lkml.kernel.org/r/20190403164156.19645-24-bigeasy@linutronix.de
---
 arch/x86/entry/common.c             |  10 +++-
 arch/x86/include/asm/fpu/api.h      |  22 +++++++-
 arch/x86/include/asm/fpu/internal.h |  27 +++++----
 arch/x86/include/asm/trace/fpu.h    |  10 ++--
 arch/x86/kernel/fpu/core.c          | 106 +++++++++++++++++++++++++++---------
 arch/x86/kernel/fpu/signal.c        |  49 ++++++++++-------
 arch/x86/kernel/process.c           |   2 +-
 arch/x86/kernel/process_32.c        |   5 +-
 arch/x86/kernel/process_64.c        |   5 +-
 arch/x86/kvm/x86.c                  |  20 +++++--
 10 files changed, 184 insertions(+), 72 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 7bc105f47d21..51beb8d29123 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -25,12 +25,13 @@
 #include <linux/uprobes.h>
 #include <linux/livepatch.h>
 #include <linux/syscalls.h>
+#include <linux/uaccess.h>
 
 #include <asm/desc.h>
 #include <asm/traps.h>
 #include <asm/vdso.h>
-#include <linux/uaccess.h>
 #include <asm/cpufeature.h>
+#include <asm/fpu/api.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
@@ -196,6 +197,13 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 	if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
 		exit_to_usermode_loop(regs, cached_flags);
 
+	/* Reload ti->flags; we may have rescheduled above. */
+	cached_flags = READ_ONCE(ti->flags);
+
+	fpregs_assert_state_consistent();
+	if (unlikely(cached_flags & _TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+
 #ifdef CONFIG_COMPAT
 	/*
 	 * Compat syscalls set TS_COMPAT.  Make sure we clear it before
diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index 73e684160f35..b774c52e5411 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -10,7 +10,7 @@
 
 #ifndef _ASM_X86_FPU_API_H
 #define _ASM_X86_FPU_API_H
-#include <linux/preempt.h>
+#include <linux/bottom_half.h>
 
 /*
  * Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
@@ -22,17 +22,37 @@
 extern void kernel_fpu_begin(void);
 extern void kernel_fpu_end(void);
 extern bool irq_fpu_usable(void);
+extern void fpregs_mark_activate(void);
 
+/*
+ * Use fpregs_lock() while editing CPU's FPU registers or fpu->state.
+ * A context switch will (and softirq might) save CPU's FPU registers to
+ * fpu->state and set TIF_NEED_FPU_LOAD leaving CPU's FPU registers in
+ * a random state.
+ */
 static inline void fpregs_lock(void)
 {
 	preempt_disable();
+	local_bh_disable();
 }
 
 static inline void fpregs_unlock(void)
 {
+	local_bh_enable();
 	preempt_enable();
 }
 
+#ifdef CONFIG_X86_DEBUG_FPU
+extern void fpregs_assert_state_consistent(void);
+#else
+static inline void fpregs_assert_state_consistent(void) { }
+#endif
+
+/*
+ * Load the task FPU state before returning to userspace.
+ */
+extern void switch_fpu_return(void);
+
 /*
  * Query the presence of one or more xfeatures. Works on any legacy CPU as well.
  *
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 2cf04fbcba5d..0c8a5093647a 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -30,7 +30,7 @@ extern void fpu__prepare_write(struct fpu *fpu);
 extern void fpu__save(struct fpu *fpu);
 extern int  fpu__restore_sig(void __user *buf, int ia32_frame);
 extern void fpu__drop(struct fpu *fpu);
-extern int  fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu);
+extern int  fpu__copy(struct task_struct *dst, struct task_struct *src);
 extern void fpu__clear(struct fpu *fpu);
 extern int  fpu__exception_code(struct fpu *fpu, int trap_nr);
 extern int  dump_fpu(struct pt_regs *ptregs, struct user_i387_struct *fpstate);
@@ -531,13 +531,20 @@ static inline void fpregs_activate(struct fpu *fpu)
 /*
  * Internal helper, do not use directly. Use switch_fpu_return() instead.
  */
-static inline void __fpregs_load_activate(struct fpu *fpu, int cpu)
+static inline void __fpregs_load_activate(void)
 {
+	struct fpu *fpu = &current->thread.fpu;
+	int cpu = smp_processor_id();
+
+	if (WARN_ON_ONCE(current->mm == NULL))
+		return;
+
 	if (!fpregs_state_valid(fpu, cpu)) {
-		if (current->mm)
-			copy_kernel_to_fpregs(&fpu->state);
+		copy_kernel_to_fpregs(&fpu->state);
 		fpregs_activate(fpu);
+		fpu->last_cpu = cpu;
 	}
+	clear_thread_flag(TIF_NEED_FPU_LOAD);
 }
 
 /*
@@ -548,8 +555,8 @@ static inline void __fpregs_load_activate(struct fpu *fpu, int cpu)
  *  - switch_fpu_prepare() saves the old state.
  *    This is done within the context of the old process.
  *
- *  - switch_fpu_finish() restores the new state as
- *    necessary.
+ *  - switch_fpu_finish() sets TIF_NEED_FPU_LOAD; the floating point state
+ *    will get loaded on return to userspace, or when the kernel needs it.
  *
  * If TIF_NEED_FPU_LOAD is cleared then the CPU's FPU registers
  * are saved in the current thread's FPU register state.
@@ -581,10 +588,10 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
  */
 
 /*
- * Set up the userspace FPU context for the new task, if the task
- * has used the FPU.
+ * Load PKRU from the FPU context if available. Delay loading of the
+ * complete FPU state until the return to userland.
  */
-static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
+static inline void switch_fpu_finish(struct fpu *new_fpu)
 {
 	u32 pkru_val = init_pkru_value;
 	struct pkru_state *pk;
@@ -592,7 +599,7 @@ static inline void switch_fpu_finish(struct fpu *new_fpu, int cpu)
 	if (!static_cpu_has(X86_FEATURE_FPU))
 		return;
 
-	__fpregs_load_activate(new_fpu, cpu);
+	set_thread_flag(TIF_NEED_FPU_LOAD);
 
 	if (!cpu_feature_enabled(X86_FEATURE_OSPKE))
 		return;
diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index bd65f6ba950f..879b77792f94 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -13,19 +13,22 @@ DECLARE_EVENT_CLASS(x86_fpu,
 
 	TP_STRUCT__entry(
 		__field(struct fpu *, fpu)
+		__field(bool, load_fpu)
 		__field(u64, xfeatures)
 		__field(u64, xcomp_bv)
 		),
 
 	TP_fast_assign(
 		__entry->fpu		= fpu;
+		__entry->load_fpu	= test_thread_flag(TIF_NEED_FPU_LOAD);
 		if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
 			__entry->xfeatures = fpu->state.xsave.header.xfeatures;
 			__entry->xcomp_bv  = fpu->state.xsave.header.xcomp_bv;
 		}
 	),
-	TP_printk("x86/fpu: %p xfeatures: %llx xcomp_bv: %llx",
+	TP_printk("x86/fpu: %p load: %d xfeatures: %llx xcomp_bv: %llx",
 			__entry->fpu,
+			__entry->load_fpu,
 			__entry->xfeatures,
 			__entry->xcomp_bv
 	)
@@ -61,11 +64,6 @@ DEFINE_EVENT(x86_fpu, x86_fpu_regs_deactivated,
 	TP_ARGS(fpu)
 );
 
-DEFINE_EVENT(x86_fpu, x86_fpu_activate_state,
-	TP_PROTO(struct fpu *fpu),
-	TP_ARGS(fpu)
-);
-
 DEFINE_EVENT(x86_fpu, x86_fpu_init_state,
 	TP_PROTO(struct fpu *fpu),
 	TP_ARGS(fpu)
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 739ca3ae2bdc..ce243f76bdb7 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -102,23 +102,20 @@ static void __kernel_fpu_begin(void)
 	kernel_fpu_disable();
 
 	if (current->mm) {
-		/*
-		 * Ignore return value -- we don't care if reg state
-		 * is clobbered.
-		 */
-		copy_fpregs_to_fpstate(fpu);
-	} else {
-		__cpu_invalidate_fpregs_state();
+		if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+			set_thread_flag(TIF_NEED_FPU_LOAD);
+			/*
+			 * Ignore return value -- we don't care if reg state
+			 * is clobbered.
+			 */
+			copy_fpregs_to_fpstate(fpu);
+		}
 	}
+	__cpu_invalidate_fpregs_state();
 }
 
 static void __kernel_fpu_end(void)
 {
-	struct fpu *fpu = &current->thread.fpu;
-
-	if (current->mm)
-		copy_kernel_to_fpregs(&fpu->state);
-
 	kernel_fpu_enable();
 }
 
@@ -145,14 +142,17 @@ void fpu__save(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
-	preempt_disable();
+	fpregs_lock();
 	trace_x86_fpu_before_save(fpu);
 
-	if (!copy_fpregs_to_fpstate(fpu))
-		copy_kernel_to_fpregs(&fpu->state);
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
+		if (!copy_fpregs_to_fpstate(fpu)) {
+			copy_kernel_to_fpregs(&fpu->state);
+		}
+	}
 
 	trace_x86_fpu_after_save(fpu);
-	preempt_enable();
+	fpregs_unlock();
 }
 EXPORT_SYMBOL_GPL(fpu__save);
 
@@ -185,8 +185,11 @@ void fpstate_init(union fpregs_state *state)
 }
 EXPORT_SYMBOL_GPL(fpstate_init);
 
-int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
+int fpu__copy(struct task_struct *dst, struct task_struct *src)
 {
+	struct fpu *dst_fpu = &dst->thread.fpu;
+	struct fpu *src_fpu = &src->thread.fpu;
+
 	dst_fpu->last_cpu = -1;
 
 	if (!static_cpu_has(X86_FEATURE_FPU))
@@ -201,16 +204,23 @@ int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_size);
 
 	/*
-	 * Save current FPU registers directly into the child
-	 * FPU context, without any memory-to-memory copying.
+	 * If the FPU registers are not current just memcpy() the state.
+	 * Otherwise save current FPU registers directly into the child's FPU
+	 * context, without any memory-to-memory copying.
 	 *
 	 * ( The function 'fails' in the FNSAVE case, which destroys
-	 *   register contents so we have to copy them back. )
+	 *   register contents so we have to load them back. )
 	 */
-	if (!copy_fpregs_to_fpstate(dst_fpu)) {
-		memcpy(&src_fpu->state, &dst_fpu->state, fpu_kernel_xstate_size);
-		copy_kernel_to_fpregs(&src_fpu->state);
-	}
+	fpregs_lock();
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
+
+	else if (!copy_fpregs_to_fpstate(dst_fpu))
+		copy_kernel_to_fpregs(&dst_fpu->state);
+
+	fpregs_unlock();
+
+	set_tsk_thread_flag(dst, TIF_NEED_FPU_LOAD);
 
 	trace_x86_fpu_copy_src(src_fpu);
 	trace_x86_fpu_copy_dst(dst_fpu);
@@ -226,10 +236,9 @@ static void fpu__initialize(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
+	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpstate_init(&fpu->state);
 	trace_x86_fpu_init_state(fpu);
-
-	trace_x86_fpu_activate_state(fpu);
 }
 
 /*
@@ -308,6 +317,8 @@ void fpu__drop(struct fpu *fpu)
  */
 static inline void copy_init_fpstate_to_fpregs(void)
 {
+	fpregs_lock();
+
 	if (use_xsave())
 		copy_kernel_to_xregs(&init_fpstate.xsave, -1);
 	else if (static_cpu_has(X86_FEATURE_FXSR))
@@ -317,6 +328,9 @@ static inline void copy_init_fpstate_to_fpregs(void)
 
 	if (boot_cpu_has(X86_FEATURE_OSPKE))
 		copy_init_pkru_to_fpregs();
+
+	fpregs_mark_activate();
+	fpregs_unlock();
 }
 
 /*
@@ -339,6 +353,46 @@ void fpu__clear(struct fpu *fpu)
 		copy_init_fpstate_to_fpregs();
 }
 
+/*
+ * Load FPU context before returning to userspace.
+ */
+void switch_fpu_return(void)
+{
+	if (!static_cpu_has(X86_FEATURE_FPU))
+		return;
+
+	__fpregs_load_activate();
+}
+EXPORT_SYMBOL_GPL(switch_fpu_return);
+
+#ifdef CONFIG_X86_DEBUG_FPU
+/*
+ * If current FPU state according to its tracking (loaded FPU context on this
+ * CPU) is not valid then we must have TIF_NEED_FPU_LOAD set so the context is
+ * loaded on return to userland.
+ */
+void fpregs_assert_state_consistent(void)
+{
+	struct fpu *fpu = &current->thread.fpu;
+
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		return;
+
+	WARN_ON_FPU(!fpregs_state_valid(fpu, smp_processor_id()));
+}
+EXPORT_SYMBOL_GPL(fpregs_assert_state_consistent);
+#endif
+
+void fpregs_mark_activate(void)
+{
+	struct fpu *fpu = &current->thread.fpu;
+
+	fpregs_activate(fpu);
+	fpu->last_cpu = smp_processor_id();
+	clear_thread_flag(TIF_NEED_FPU_LOAD);
+}
+EXPORT_SYMBOL_GPL(fpregs_mark_activate);
+
 /*
  * x87 math exception handling:
  */
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index b13e86b29426..6df1f15e0cd5 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -269,11 +269,9 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 	struct task_struct *tsk = current;
 	struct fpu *fpu = &tsk->thread.fpu;
 	struct user_i387_ia32_struct env;
-	union fpregs_state *state;
 	u64 xfeatures = 0;
 	int fx_only = 0;
 	int ret = 0;
-	void *tmp;
 
 	ia32_fxstate &= (IS_ENABLED(CONFIG_X86_32) ||
 			 IS_ENABLED(CONFIG_IA32_EMULATION));
@@ -308,14 +306,18 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		}
 	}
 
-	tmp = kzalloc(sizeof(*state) + fpu_kernel_xstate_size + 64, GFP_KERNEL);
-	if (!tmp)
-		return -ENOMEM;
-	state = PTR_ALIGN(tmp, 64);
+	/*
+	 * The current state of the FPU registers does not matter. By setting
+	 * TIF_NEED_FPU_LOAD unconditionally it is ensured that our xstate
+	 * is not modified on context switch and that the xstate is considered
+	 * to be loaded again on return to userland (overriding last_cpu avoids
+	 * the optimisation).
+	 */
+	set_thread_flag(TIF_NEED_FPU_LOAD);
+	__fpu_invalidate_fpregs_state(fpu);
 
 	if ((unsigned long)buf_fx % 64)
 		fx_only = 1;
-
 	/*
 	 * For 32-bit frames with fxstate, copy the fxstate so it can be
 	 * reconstructed later.
@@ -331,43 +333,52 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		u64 init_bv = xfeatures_mask & ~xfeatures;
 
 		if (using_compacted_format()) {
-			ret = copy_user_to_xstate(&state->xsave, buf_fx);
+			ret = copy_user_to_xstate(&fpu->state.xsave, buf_fx);
 		} else {
-			ret = __copy_from_user(&state->xsave, buf_fx, state_size);
+			ret = __copy_from_user(&fpu->state.xsave, buf_fx, state_size);
 
 			if (!ret && state_size > offsetof(struct xregs_state, header))
-				ret = validate_xstate_header(&state->xsave.header);
+				ret = validate_xstate_header(&fpu->state.xsave.header);
 		}
 		if (ret)
 			goto err_out;
 
-		sanitize_restored_xstate(state, envp, xfeatures, fx_only);
+		sanitize_restored_xstate(&fpu->state, envp, xfeatures, fx_only);
 
+		fpregs_lock();
 		if (unlikely(init_bv))
 			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
-		ret = copy_kernel_to_xregs_err(&state->xsave, xfeatures);
+		ret = copy_kernel_to_xregs_err(&fpu->state.xsave, xfeatures);
 
 	} else if (use_fxsr()) {
-		ret = __copy_from_user(&state->fxsave, buf_fx, state_size);
-		if (ret)
+		ret = __copy_from_user(&fpu->state.fxsave, buf_fx, state_size);
+		if (ret) {
+			ret = -EFAULT;
 			goto err_out;
+		}
+
+		sanitize_restored_xstate(&fpu->state, envp, xfeatures, fx_only);
 
-		sanitize_restored_xstate(state, envp, xfeatures, fx_only);
+		fpregs_lock();
 		if (use_xsave()) {
 			u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
 			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
 		}
 
-		ret = copy_kernel_to_fxregs_err(&state->fxsave);
+		ret = copy_kernel_to_fxregs_err(&fpu->state.fxsave);
 	} else {
-		ret = __copy_from_user(&state->fsave, buf_fx, state_size);
+		ret = __copy_from_user(&fpu->state.fsave, buf_fx, state_size);
 		if (ret)
 			goto err_out;
-		ret = copy_kernel_to_fregs_err(&state->fsave);
+
+		fpregs_lock();
+		ret = copy_kernel_to_fregs_err(&fpu->state.fsave);
 	}
+	if (!ret)
+		fpregs_mark_activate();
+	fpregs_unlock();
 
 err_out:
-	kfree(tmp);
 	if (ret)
 		fpu__clear(fpu);
 	return ret;
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 58ac7be52c7a..b9b8e6eb74f2 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -101,7 +101,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	dst->thread.vm86 = NULL;
 #endif
 
-	return fpu__copy(&dst->thread.fpu, &src->thread.fpu);
+	return fpu__copy(dst, src);
 }
 
 /*
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 77d9eb43ccac..1bc47f3a4885 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -234,7 +234,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
-	switch_fpu_prepare(prev_fpu, cpu);
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_prepare(prev_fpu, cpu);
 
 	/*
 	 * Save away %gs. No need to save %fs, as it was saved on the
@@ -290,7 +291,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 
 	this_cpu_write(current_task, next_p);
 
-	switch_fpu_finish(next_fpu, cpu);
+	switch_fpu_finish(next_fpu);
 
 	/* Load the Intel cache allocation PQR MSR. */
 	resctrl_sched_in();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ffea7c557963..37b2ecef041e 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -520,7 +520,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_DEBUG_ENTRY) &&
 		     this_cpu_read(irq_count) != -1);
 
-	switch_fpu_prepare(prev_fpu, cpu);
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_prepare(prev_fpu, cpu);
 
 	/* We must save %fs and %gs before load_TLS() because
 	 * %fs and %gs may be cleared by load_TLS().
@@ -572,7 +573,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	this_cpu_write(current_task, next_p);
 	this_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
 
-	switch_fpu_finish(next_fpu, cpu);
+	switch_fpu_finish(next_fpu);
 
 	/* Reload sp0. */
 	update_task_stack(next_p);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8022e7769b3a..e340c3c0cba3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7877,6 +7877,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		wait_lapic_expire(vcpu);
 	guest_enter_irqoff();
 
+	fpregs_assert_state_consistent();
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		switch_fpu_return();
+
 	if (unlikely(vcpu->arch.switch_db_regs)) {
 		set_debugreg(0, 7);
 		set_debugreg(vcpu->arch.eff_db[0], 0);
@@ -8137,22 +8141,30 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
 /* Swap (qemu) user FPU context for the guest FPU context. */
 static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 {
-	preempt_disable();
+	fpregs_lock();
+
 	copy_fpregs_to_fpstate(&current->thread.fpu);
 	/* PKRU is separately restored in kvm_x86_ops->run.  */
 	__copy_kernel_to_fpregs(&vcpu->arch.guest_fpu->state,
 				~XFEATURE_MASK_PKRU);
-	preempt_enable();
+
+	fpregs_mark_activate();
+	fpregs_unlock();
+
 	trace_kvm_fpu(1);
 }
 
 /* When vcpu_run ends, restore user space FPU context. */
 static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 {
-	preempt_disable();
+	fpregs_lock();
+
 	copy_fpregs_to_fpstate(vcpu->arch.guest_fpu);
 	copy_kernel_to_fpregs(&current->thread.fpu.state);
-	preempt_enable();
+
+	fpregs_mark_activate();
+	fpregs_unlock();
+
 	++vcpu->stat.fpu_reload;
 	trace_kvm_fpu(0);
 }

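For reference, the fpregs_lock()/fpregs_unlock() pair used throughout the
hunks above is introduced elsewhere in the series and is not shown in this
excerpt. A minimal sketch of its assumed implementation — preemption and
softirqs are disabled so that neither a context switch nor a
kernel_fpu_begin() user in softirq context can invalidate the registers
inside the critical section:

	static inline void fpregs_lock(void)
	{
		preempt_disable();
		local_bh_disable();	/* softirqs may use kernel_fpu_begin() */
	}

	static inline void fpregs_unlock(void)
	{
		local_bh_enable();
		preempt_enable();
	}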

* [tip:x86/fpu] x86/fpu: Add a fastpath to __fpu__restore_sig()
  2019-04-03 16:41 ` [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig() Sebastian Andrzej Siewior
  2019-04-08 17:05   ` Thomas Gleixner
  2019-04-12 17:17   ` Borislav Petkov
@ 2019-04-13 21:02   ` tip-bot for Sebastian Andrzej Siewior
  2 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 21:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: pbonzini, mingo, x86, riel, tglx, bigeasy, Jason, dave.hansen,
	luto, bp, hpa, jannh, rkrcmar, mingo, kvm, linux-kernel

Commit-ID:  1d731e731c4cd7cbd3b1aa295f0932e7610da82f
Gitweb:     https://git.kernel.org/tip/1d731e731c4cd7cbd3b1aa295f0932e7610da82f
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:53 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 12 Apr 2019 20:04:49 +0200

x86/fpu: Add a fastpath to __fpu__restore_sig()

The previous commits refactored the restoration of the FPU registers so
that they can be loaded from in-kernel memory. That indirection adds
overhead which can be avoided if the registers are restored directly
from the user buffer, provided the access does not fault.

Attempt to restore the FPU registers by invoking
copy_user_to_fpregs_zeroing() with pagefaults disabled. If that fails,
fall back to the slowpath, which can handle pagefaults.
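
As general background (this is standard kernel behavior, not something
this patch introduces): between pagefault_disable() and pagefault_enable(),
a user-space access does not enter the pagefault handler. If the page is
not resident, the access simply fails and the caller has to fall back to a
path that is allowed to fault, roughly:

	pagefault_disable();
	/*
	 * Accesses in this window must not sleep: a non-resident page
	 * makes the copy fail (bytes left uncopied) instead of being
	 * faulted in.
	 */
	left = __copy_from_user(dst, ubuf, size);
	pagefault_enable();
	if (left)
		goto slowpath;		/* may fault and sleep */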

 [ bp: Add a comment over the fastpath to be able to find one's way
   around the function. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-25-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 6df1f15e0cd5..a1bd7be70206 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -242,10 +242,10 @@ sanitize_restored_xstate(union fpregs_state *state,
 /*
  * Restore the extended state if present. Otherwise, restore the FP/SSE state.
  */
-static inline int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
+static int copy_user_to_fpregs_zeroing(void __user *buf, u64 xbv, int fx_only)
 {
 	if (use_xsave()) {
-		if ((unsigned long)buf % 64 || fx_only) {
+		if (fx_only) {
 			u64 init_bv = xfeatures_mask & ~XFEATURE_MASK_FPSSE;
 			copy_kernel_to_xregs(&init_fpstate.xsave, init_bv);
 			return copy_user_to_fxregs(buf);
@@ -327,8 +327,27 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		if (ret)
 			goto err_out;
 		envp = &env;
+	} else {
+		/*
+		 * Attempt to restore the FPU registers directly from user
+		 * memory. For that to succeed, the user access cannot cause
+		 * page faults. If it does, fall back to the slow path below,
+		 * going through the kernel buffer with the enabled pagefault
+		 * handler.
+		 */
+		fpregs_lock();
+		pagefault_disable();
+		ret = copy_user_to_fpregs_zeroing(buf_fx, xfeatures, fx_only);
+		pagefault_enable();
+		if (!ret) {
+			fpregs_mark_activate();
+			fpregs_unlock();
+			return 0;
+		}
+		fpregs_unlock();
 	}
 
+
 	if (use_xsave() && !fx_only) {
 		u64 init_bv = xfeatures_mask & ~xfeatures;
 


* [tip:x86/fpu] x86/fpu: Add a fastpath to copy_fpstate_to_sigframe()
  2019-04-03 16:41 ` [PATCH 25/27] x86/fpu: Add a fastpath to copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
@ 2019-04-13 21:03   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 21:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jannh, Jason, bigeasy, tglx, rkrcmar, riel, kvm, dave.hansen,
	mingo, linux-kernel, x86, hpa, bp, luto, pbonzini, mingo

Commit-ID:  da2f32fb8dc7cbd9433cb2e990693734b30a2465
Gitweb:     https://git.kernel.org/tip/da2f32fb8dc7cbd9433cb2e990693734b30a2465
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:54 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 12 Apr 2019 20:05:36 +0200

x86/fpu: Add a fastpath to copy_fpstate_to_sigframe()

Try to save the FPU registers directly to the userland stack frame if
the CPU still holds the FPU registers for the current task. This has to
be done with the pagefault handler disabled because we can't fault
while the FPU registers are locked, so the operation might fail. If it
fails, try the slowpath, which can handle faults.
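
Condensed, the flow the diff below introduces looks roughly like this (a
simplified sketch; the 32-bit fsave header handling is omitted):

	fpregs_lock();
	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
		pagefault_disable();
		ret = copy_fpregs_to_sigframe(buf_fx); /* XSAVE straight to user */
		pagefault_enable();
		if (ret)				/* it faulted: */
			copy_fpregs_to_fpstate(fpu);	/* save to kernel buffer */
		set_thread_flag(TIF_NEED_FPU_LOAD);
	}
	fpregs_unlock();

	if (ret) {
		/* slowpath: copy fpu->state to the user frame instead */
	}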

 [ bp: Massage a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-26-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a1bd7be70206..3c3167576216 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -144,8 +144,10 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  *	buf == buf_fx for 64-bit frames and 32-bit fsave frame.
  *	buf != buf_fx for 32-bit frames with fxstate.
  *
- * Save the state to task's fpu->state and then copy it to the user frame
- * pointed to by the aligned pointer 'buf_fx'.
+ * Try to save it directly to the user frame with disabled page fault handler.
+ * If this fails then do the slow path where the FPU state is first saved to
+ * task's fpu->state and then copy it to the user frame pointed to by the
+ * aligned pointer 'buf_fx'.
  *
  * If this is a 32-bit frame with fxstate, put a fsave header before
  * the aligned state at 'buf_fx'.
@@ -159,6 +161,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 	struct xregs_state *xsave = &fpu->state.xsave;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
+	int ret = -EFAULT;
 
 	ia32_fxstate &= (IS_ENABLED(CONFIG_X86_32) ||
 			 IS_ENABLED(CONFIG_IA32_EMULATION));
@@ -173,23 +176,30 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 
 	/*
 	 * If we do not need to load the FPU registers at return to userspace
-	 * then the CPU has the current state and we need to save it. Otherwise,
-	 * it has already been done and we can skip it.
+	 * then the CPU has the current state. Try to save it directly to
+	 * userland's stack frame if it does not cause a pagefault. If it does,
+	 * try the slowpath.
 	 */
 	fpregs_lock();
 	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		copy_fpregs_to_fpstate(fpu);
+		pagefault_disable();
+		ret = copy_fpregs_to_sigframe(buf_fx);
+		pagefault_enable();
+		if (ret)
+			copy_fpregs_to_fpstate(fpu);
 		set_thread_flag(TIF_NEED_FPU_LOAD);
 	}
 	fpregs_unlock();
 
-	if (using_compacted_format()) {
-		if (copy_xstate_to_user(buf_fx, xsave, 0, size))
-			return -1;
-	} else {
-		fpstate_sanitize_xstate(fpu);
-		if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
-			return -1;
+	if (ret) {
+		if (using_compacted_format()) {
+			if (copy_xstate_to_user(buf_fx, xsave, 0, size))
+				return -1;
+		} else {
+			fpstate_sanitize_xstate(fpu);
+			if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
+				return -1;
+		}
 	}
 
 	/* Save the fsave header for the 32-bit frames. */


* [tip:x86/fpu] x86/fpu: Restore regs in copy_fpstate_to_sigframe() in order to use the fastpath
  2019-04-03 16:41 ` [PATCH 26/27] x86/fpu: Restore FPU register in copy_fpstate_to_sigframe() in order to use the fastpath Sebastian Andrzej Siewior
@ 2019-04-13 21:04   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 21:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: pbonzini, rkrcmar, x86, mingo, mingo, Jason, kvm, dave.hansen,
	riel, luto, bp, bigeasy, tglx, jannh, hpa, linux-kernel

Commit-ID:  06b251dff78704c7d122bd109384d970a7dbe94d
Gitweb:     https://git.kernel.org/tip/06b251dff78704c7d122bd109384d970a7dbe94d
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Fri, 12 Apr 2019 20:16:15 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 12 Apr 2019 20:16:15 +0200

x86/fpu: Restore regs in copy_fpstate_to_sigframe() in order to use the fastpath

If a task is scheduled out and then receives a signal, it won't be able
to take the fastpath because its registers are no longer live in the
CPU. The slowpath is more expensive than an XRSTOR + XSAVE sequence,
which usually succeeds.

Here are some clock_gettime() numbers from a bigger box with AVX512
during bootup:

- __fpregs_load_activate() takes 140ns - 350ns. If it was the most recent
  FPU context on the CPU then the optimisation in __fpregs_load_activate()
  will skip the load (which was disabled during the test).

- copy_fpregs_to_sigframe() takes 200ns - 450ns if it succeeds. On a
  pagefault it is 1.8us - 3us usually in the 2.6us area.

- The slowpath takes 1.5us - 6us. Usually in the 2.6us area.

My testcases (including lat_sig) take the fastpath without needing
__fpregs_load_activate(). I expect this to be the majority of cases.

Since the slowpath is in the >1us area, it makes sense to load the
registers and attempt to save them directly. The direct save may fail,
but that should only happen on the first invocation or after fork(),
while the stack page is still read-only (copy-on-write).
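
The resulting ordering (see the diff below) first makes the registers live
if necessary and then always attempts the direct save:

	fpregs_lock();
	if (test_thread_flag(TIF_NEED_FPU_LOAD))
		__fpregs_load_activate();	/* ~140ns-350ns */

	pagefault_disable();
	ret = copy_fpregs_to_sigframe(buf_fx);	/* ~200ns-450ns on success */
	pagefault_enable();
	if (ret && !test_thread_flag(TIF_NEED_FPU_LOAD))
		copy_fpregs_to_fpstate(fpu);
	set_thread_flag(TIF_NEED_FPU_LOAD);
	fpregs_unlock();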

 [ bp: Massage a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-27-bigeasy@linutronix.de
---
 arch/x86/kernel/fpu/signal.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 3c3167576216..7026f1c4e5e3 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -175,20 +175,21 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
 	/*
-	 * If we do not need to load the FPU registers at return to userspace
-	 * then the CPU has the current state. Try to save it directly to
-	 * userland's stack frame if it does not cause a pagefault. If it does,
-	 * try the slowpath.
+	 * Load the FPU registers if they are not valid for the current task.
+	 * With a valid FPU state we can attempt to save the state directly to
+	 * userland's stack frame which will likely succeed. If it does not, do
+	 * the slowpath.
 	 */
 	fpregs_lock();
-	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		pagefault_disable();
-		ret = copy_fpregs_to_sigframe(buf_fx);
-		pagefault_enable();
-		if (ret)
-			copy_fpregs_to_fpstate(fpu);
-		set_thread_flag(TIF_NEED_FPU_LOAD);
-	}
+	if (test_thread_flag(TIF_NEED_FPU_LOAD))
+		__fpregs_load_activate();
+
+	pagefault_disable();
+	ret = copy_fpregs_to_sigframe(buf_fx);
+	pagefault_enable();
+	if (ret && !test_thread_flag(TIF_NEED_FPU_LOAD))
+		copy_fpregs_to_fpstate(fpu);
+	set_thread_flag(TIF_NEED_FPU_LOAD);
 	fpregs_unlock();
 
 	if (ret) {


* [tip:x86/fpu] x86/pkeys: Add PKRU value to init_fpstate
  2019-04-03 16:41 ` [PATCH 27/27] x86/pkeys: add PKRU value to init_fpstate Sebastian Andrzej Siewior
@ 2019-04-13 21:05   ` tip-bot for Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: tip-bot for Sebastian Andrzej Siewior @ 2019-04-13 21:05 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, bp, hpa, ak, x86, riel, rkrcmar, luto, kvm, linux-kernel,
	Jason, bigeasy, linux, tglx, konrad.wilk, peterz, pbonzini,
	mingo, pasha.tatashin, chang.seok.bae, dave.hansen

Commit-ID:  a5eff7259790d5314eff10563d6e59d358cce482
Gitweb:     https://git.kernel.org/tip/a5eff7259790d5314eff10563d6e59d358cce482
Author:     Sebastian Andrzej Siewior <bigeasy@linutronix.de>
AuthorDate: Wed, 3 Apr 2019 18:41:56 +0200
Committer:  Borislav Petkov <bp@suse.de>
CommitDate: Fri, 12 Apr 2019 20:21:10 +0200

x86/pkeys: Add PKRU value to init_fpstate

The task's initial PKRU value is set partly via fpu__clear()/
copy_init_pkru_to_fpregs(). It is not part of init_fpstate.xsave;
instead it is set explicitly.

If the user removes the PKRU state from XSAVE in the signal handler,
then __fpu__restore_sig() will restore the missing bits from
`init_fpstate' and initialize the PKRU value to 0.

Add `init_pkru_value' to `init_fpstate' so that PKRU is set to the init
value in such a case.

In theory copy_init_pkru_to_fpregs() could then be removed, because
restoring PKRU at return-to-userland should be enough.
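
A purely hypothetical userland illustration of the scenario described above
(the layout assumptions here are mine: on x86-64, uc_mcontext.fpregs points
at the XSAVE area of the signal frame, the XSTATE_BV bitmap sits at byte
offset 512 of that area, and PKRU is state component 9):

	#include <signal.h>
	#include <stdint.h>
	#include <ucontext.h>

	static void handler(int sig, siginfo_t *si, void *uctx)
	{
		ucontext_t *uc = uctx;
		void *xsave = uc->uc_mcontext.fpregs;
		uint64_t *xstate_bv = (uint64_t *)((char *)xsave + 512);

		/*
		 * Drop PKRU (component 9) from the saved state; sigreturn
		 * then restores PKRU from init_fpstate.
		 */
		*xstate_bv &= ~(1ULL << 9);
	}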

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-28-bigeasy@linutronix.de
---
 arch/x86/kernel/cpu/common.c | 5 +++++
 arch/x86/mm/pkeys.c          | 6 ++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cb28e98a0659..352fa19e6311 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -372,6 +372,8 @@ static bool pku_disabled;
 
 static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 {
+	struct pkru_state *pk;
+
 	/* check the boot processor, plus compile options for PKU: */
 	if (!cpu_feature_enabled(X86_FEATURE_PKU))
 		return;
@@ -382,6 +384,9 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
 		return;
 
 	cr4_set_bits(X86_CR4_PKE);
+	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
+	if (pk)
+		pk->pkru = init_pkru_value;
 	/*
 	 * Seting X86_CR4_PKE will cause the X86_FEATURE_OSPKE
 	 * cpuid bit to be set.  We need to ensure that we
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 2ecbf4155f98..1dcfc91c8f0c 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -18,6 +18,7 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
+#include <asm/fpu/internal.h>		/* init_fpstate			*/
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -161,6 +162,7 @@ static ssize_t init_pkru_read_file(struct file *file, char __user *user_buf,
 static ssize_t init_pkru_write_file(struct file *file,
 		 const char __user *user_buf, size_t count, loff_t *ppos)
 {
+	struct pkru_state *pk;
 	char buf[32];
 	ssize_t len;
 	u32 new_init_pkru;
@@ -183,6 +185,10 @@ static ssize_t init_pkru_write_file(struct file *file,
 		return -EINVAL;
 
 	WRITE_ONCE(init_pkru_value, new_init_pkru);
+	pk = get_xsave_addr(&init_fpstate.xsave, XFEATURE_PKRU);
+	if (!pk)
+		return -EINVAL;
+	pk->pkru = new_init_pkru;
 	return count;
 }
 


* Re: [PATCH v9 00/27] x86: load FPU registers on return to userland
  2019-04-12 18:30 ` Borislav Petkov
@ 2019-04-15  8:58   ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-15  8:58 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-12 20:30:59 [+0200], Borislav Petkov wrote:
> On Wed, Apr 03, 2019 at 06:41:29PM +0200, Sebastian Andrzej Siewior wrote:
> > v7…v8:
> > Rebased on top of v5.1-rc1. And then:
> 
> Ok, all applied and pushed here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=tip-x86-fpu
> 
> Pls have a quick look if all is still ok and not fat-fingered by me. :)

So I compared my v9 against your HEAD (a5eff7259790d) and it all
looks sane.
Thank you for all the effort you put into it.

> Thx.

Sebastian


* Re: [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace
  2019-04-12 17:29               ` Borislav Petkov
@ 2019-04-15  9:14                 ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 83+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-04-15  9:14 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, x86, Andy Lutomirski, Paolo Bonzini,
	Radim Krčmář,
	kvm, Jason A. Donenfeld, Rik van Riel, Dave Hansen

On 2019-04-12 19:29:31 [+0200], Borislav Petkov wrote:
> > - in kernel_fpu_begin() we may trash the task's FPU state (by saving its
> >   registers or by resetting fpu_fpregs_owner_ctx).
> 
> Do we care?
> 
> You mean, in case you have workloads which might involve a lot of
> in-kernel FPU use which would punish task context switches?

Well, it depends on how you look at it. For instance:
- Does kernel_fpu_begin() save the task's FPU registers, or are they
  already stored so it is a nop?
- The task could avoid loading its registers on return to userland, but
  FPU usage in BH made that impossible.
- Is there any FPU usage in BH at all?

And there is this: my program usually works, but with this scheduling
pattern and FPU usage in the kernel it computes a wrong result.
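
To make the first question concrete, the behavior this series establishes
is roughly the following (a simplified sketch, not the exact kernel code):

	/* kernel_fpu_begin(), simplified */
	preempt_disable();
	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
		/* the registers are still live in the CPU: save them once */
		set_thread_flag(TIF_NEED_FPU_LOAD);
		copy_fpregs_to_fpstate(&current->thread.fpu);
	}
	/* a later kernel_fpu_begin() finds the flag set and skips the save */
	__cpu_invalidate_fpregs_state();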
 
Sebastian



Thread overview: 83+ messages
2019-04-03 16:41 [PATCH v9 00/27] x86: load FPU registers on return to userland Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 01/27] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig() Sebastian Andrzej Siewior
2019-04-13 20:46   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 02/27] x86/fpu: Remove fpu__restore() Sebastian Andrzej Siewior
2019-04-13 20:47   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 03/27] x86/fpu: Remove preempt_disable() in fpu__clear() Sebastian Andrzej Siewior
2019-04-13 20:48   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 04/27] x86/fpu: Always init the `state' " Sebastian Andrzej Siewior
2019-04-13 20:48   ` [tip:x86/fpu] x86/fpu: Always init the state " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 05/27] x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
2019-04-13 20:49   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 06/27] x86/fpu: Don't save fxregs for ia32 frames " Sebastian Andrzej Siewior
2019-04-13 20:50   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 07/27] x86/fpu: Remove fpu->initialized Sebastian Andrzej Siewior
2019-04-13 20:50   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 08/27] x86/fpu: Remove user_fpu_begin() Sebastian Andrzej Siewior
2019-04-13 20:51   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 09/27] x86/fpu: Add (__)make_fpregs_active helpers Sebastian Andrzej Siewior
2019-04-13 20:52   ` [tip:x86/fpu] x86/fpu: Add an __fpregs_load_activate() internal helper tip-bot for Rik van Riel
2019-04-03 16:41 ` [PATCH 10/27] x86/fpu: Make __raw_xsave_addr() use feature number instead of mask Sebastian Andrzej Siewior
2019-04-13 20:52   ` [tip:x86/fpu] x86/fpu: Make __raw_xsave_addr() use a " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 11/27] x86/fpu: Make get_xsave_field_ptr() and get_xsave_addr() use " Sebastian Andrzej Siewior
2019-04-13 20:53   ` [tip:x86/fpu] x86/fpu: Use a feature number instead of mask in two more helpers tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 12/27] x86/pkru: Provide .*_pkru_ins() functions Sebastian Andrzej Siewior
2019-04-10 16:36   ` Borislav Petkov
2019-04-10 16:52     ` Borislav Petkov
2019-04-10 21:25       ` Sebastian Andrzej Siewior
2019-04-10 21:29         ` Dave Hansen
2019-04-11 13:24           ` Borislav Petkov
2019-04-13 20:54   ` [tip:x86/fpu] x86/pkeys: Provide *pkru() helpers tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 13/27] x86/fpu: Only write PKRU if it is different from current Sebastian Andrzej Siewior
2019-04-13 20:55   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 14/27] x86/pkeys: Don't check if PKRU is zero before writting it Sebastian Andrzej Siewior
2019-04-13 20:55   ` [tip:x86/fpu] x86/pkeys: Don't check if PKRU is zero before writing it tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 15/27] x86/fpu: Eager switch PKRU state Sebastian Andrzej Siewior
2019-04-13 20:56   ` [tip:x86/fpu] " tip-bot for Rik van Riel
2019-04-03 16:41 ` [PATCH 16/27] x86/entry: Add TIF_NEED_FPU_LOAD Sebastian Andrzej Siewior
2019-04-13 20:57   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 17/27] x86/fpu: Always store the registers in copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
2019-04-13 20:57   ` [tip:x86/fpu] " tip-bot for Rik van Riel
2019-04-03 16:41 ` [PATCH 18/27] x86/fpu: Prepare copy_fpstate_to_sigframe() for TIF_NEED_FPU_LOAD Sebastian Andrzej Siewior
2019-04-13 20:58   ` [tip:x86/fpu] " tip-bot for Rik van Riel
2019-04-03 16:41 ` [PATCH 19/27] x86/fpu: Update xstate's PKRU value on write_pkru() Sebastian Andrzej Siewior
2019-04-08 18:14   ` Dave Hansen
2019-04-08 20:03     ` Sebastian Andrzej Siewior
2019-04-13 20:59   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 20/27] x86/fpu: Inline copy_user_to_fpregs_zeroing() Sebastian Andrzej Siewior
2019-04-13 21:00   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 21/27] x86/fpu: Let __fpu__restore_sig() restore the !32bit+fxsr frame from kernel memory Sebastian Andrzej Siewior
2019-04-13 21:00   ` [tip:x86/fpu] x86/fpu: Restore from kernel memory on the 64-bit path too tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 22/27] x86/fpu: Merge the two code paths in __fpu__restore_sig() Sebastian Andrzej Siewior
2019-04-13 21:01   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 23/27] x86/fpu: Defer FPU state load until return to userspace Sebastian Andrzej Siewior
2019-04-12 14:36   ` Borislav Petkov
2019-04-12 15:24     ` Sebastian Andrzej Siewior
2019-04-12 16:22       ` Borislav Petkov
2019-04-12 16:37         ` Sebastian Andrzej Siewior
2019-04-12 16:48           ` Borislav Petkov
2019-04-12 17:19             ` Sebastian Andrzej Siewior
2019-04-12 17:29               ` Borislav Petkov
2019-04-15  9:14                 ` Sebastian Andrzej Siewior
2019-04-13 21:02   ` [tip:x86/fpu] " tip-bot for Rik van Riel
2019-04-03 16:41 ` [PATCH 24/27] x86/fpu: Add a fastpath to __fpu__restore_sig() Sebastian Andrzej Siewior
2019-04-08 17:05   ` Thomas Gleixner
2019-04-08 20:02     ` Sebastian Andrzej Siewior
2019-04-09  7:27       ` Thomas Gleixner
2019-04-12 17:17   ` Borislav Petkov
2019-04-12 17:27     ` Sebastian Andrzej Siewior
2019-04-13 21:02   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 25/27] x86/fpu: Add a fastpath to copy_fpstate_to_sigframe() Sebastian Andrzej Siewior
2019-04-13 21:03   ` [tip:x86/fpu] " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 26/27] x86/fpu: Restore FPU register in copy_fpstate_to_sigframe() in order to use the fastpath Sebastian Andrzej Siewior
2019-04-13 21:04   ` [tip:x86/fpu] x86/fpu: Restore regs " tip-bot for Sebastian Andrzej Siewior
2019-04-03 16:41 ` [PATCH 27/27] x86/pkeys: add PKRU value to init_fpstate Sebastian Andrzej Siewior
2019-04-13 21:05   ` [tip:x86/fpu] x86/pkeys: Add " tip-bot for Sebastian Andrzej Siewior
2019-04-04 14:01 ` [PATCH v9 00/27] x86: load FPU registers on return to userland David Laight
2019-04-04 14:14   ` 'Sebastian Andrzej Siewior'
2019-04-04 14:26     ` Andy Lutomirski
2019-04-04 14:31       ` Sebastian Andrzej Siewior
2019-04-04 15:10       ` David Laight
2019-04-08 17:08 ` Thomas Gleixner
2019-04-12 18:30 ` Borislav Petkov
2019-04-15  8:58   ` Sebastian Andrzej Siewior
