linux-kernel.vger.kernel.org archive mirror
From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Tony Luck <tony.luck@intel.com>,
	Yu-cheng Yu <yu-cheng.yu@intel.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Borislav Petkov <bp@suse.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Kan Liang <kan.liang@linux.intel.com>
Subject: [patch V2 34/52] x86/fpu: Differentiate "copy" versus "move" of fpregs
Date: Mon, 14 Jun 2021 17:44:42 +0200
Message-ID: <20210614155357.167589571@linutronix.de>
In-Reply-To: 20210614154408.673478623@linutronix.de

From: Dave Hansen <dave.hansen@linux.intel.com>

There are three ways in the ISA to bulk save FPU state:
 1. XSAVE, every CPU newer than 2008 does this
 2. FXSAVE, from ~2000->2007
 3. FNSAVE, pre-2000

XSAVE and FXSAVE are nice.  They just copy FPU state to memory.  FNSAVE is
nasty; it destroys the FPU state when it writes it to memory.  It is more
of a "move".

Currently, save_fpregs_to_fpstate() returns a number to its caller to say
whether it used the nice, non-destructive XSAVE/FXSAVE or used the mean,
clobbering FNSAVE.  Some sites need special handling for the FNSAVE case to
restore any FNSAVE-clobbered state.  Others don't care, like when they are
about to load new state anyway.

The nasty part about the save_fpregs_to_fpstate() interface is that it's
hard to tell if callers expect the "move" or the "copy" behavior.

Create a new, explicit "move" interface for callers that can handle
clobbering register state.  Make "copy" only do copies and never clobber
register state.
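
The split between the two interfaces can be illustrated with a small
user-space model.  This is only a sketch: every name in it (model_fpu,
model_state, model_save_raw(), model_move(), model_copy()) is hypothetical
and does not exist in the kernel, and zeroing an array is a simplification
of what FNSAVE does to the real registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_fpu {
	int  regs[8];	/* stand-in for the live FPU registers */
	bool has_fxsr;	/* true: a non-destructive save is available */
};

struct model_state {
	int saved[8];	/* stand-in for the in-memory save area */
};

/* Save the registers; returns true if the registers were clobbered. */
static bool model_save_raw(struct model_fpu *fpu, struct model_state *st)
{
	memcpy(st->saved, fpu->regs, sizeof(st->saved));
	if (fpu->has_fxsr)
		return false;	/* XSAVE/FXSAVE-like: registers stay intact */

	/* FNSAVE-like: the register contents are lost after the save. */
	memset(fpu->regs, 0, sizeof(fpu->regs));
	return true;
}

/* "move": the caller accepts that the registers may be lost. */
static void model_move(struct model_fpu *fpu, struct model_state *st)
{
	(void)model_save_raw(fpu, st);
}

/* "copy": the registers are guaranteed to be intact afterwards. */
static void model_copy(struct model_fpu *fpu, struct model_state *st)
{
	if (model_save_raw(fpu, st))
		memcpy(fpu->regs, st->saved, sizeof(fpu->regs));
}
```

In this model a caller that is about to load new state anyway would pick
model_move(), while a caller that keeps running on the same registers would
pick model_copy() and no longer needs to inspect a return value.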

== switch_fpu_prepare() optimization ==

switch_fpu_prepare() has a nice optimization for the FNSAVE case.  It can
handle either clobbering or preserving register state.  For the
XSAVE/FXSAVE case, it records that the fpregs state is still loaded, just
in case a later "restore" operation can be elided.  For the FNSAVE case, it
marks the fpregs as not loaded on the CPU, since they were clobbered.

Instead of having switch_fpu_prepare() modify its behavior based on whether
registers were clobbered or not, simply switch its behavior based on
whether FNSAVE is in use.  This makes it much clearer what is going on
and what the common path is.
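
The resulting decision can be sketched as a stand-alone model.  The names
below (model_ctx, model_switch_prepare(), model_can_skip_restore()) are
hypothetical, and the sketch only tracks last_cpu; it leaves out
fpu_fpregs_owner_ctx and everything else the real code looks at.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-task bookkeeping; not kernel code. */
struct model_ctx {
	int last_cpu;	/* CPU whose registers still hold this state, or -1 */
};

/*
 * Decide from the save method in use, not from a return value:
 * a non-destructive save leaves the state loaded on 'cpu', a
 * destructive (FNSAVE-like) save does not.
 */
static void model_switch_prepare(struct model_ctx *old,
				 bool nondestructive_save, int cpu)
{
	if (nondestructive_save)
		old->last_cpu = cpu;
	else
		old->last_cpu = -1;
}

/* A later restore can only be elided if the state is still on this CPU. */
static bool model_can_skip_restore(const struct model_ctx *ctx, int cpu)
{
	return ctx->last_cpu == cpu;
}
```

With a destructive save the model never reports the restore as skippable,
which corresponds to marking the fpregs as not loaded on the CPU.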

It would be simpler to just remove this FNSAVE optimization: Always save
and restore in the FNSAVE case.  This may incur the cost of the restore
even in cases where the restored state is never used.  But, it would only
hurt painfully ancient (>20 years old) processors.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/fpu/internal.h |   14 ++++--
 arch/x86/kernel/fpu/core.c          |   83 ++++++++++++++++++++----------------
 2 files changed, 58 insertions(+), 39 deletions(-)

--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -378,7 +378,8 @@ static inline int xrstor_from_kernel_err
 	return err;
 }
 
-extern int save_fpregs_to_fpstate(struct fpu *fpu);
+extern void save_fpregs_to_fpstate(struct fpu *fpu);
+extern void copy_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __restore_fpregs_from_fpstate(union fpregs_state *fpstate, u64 mask)
 {
@@ -510,10 +511,15 @@ static inline void __fpregs_load_activat
 static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && !(current->flags & PF_KTHREAD)) {
-		if (!save_fpregs_to_fpstate(old_fpu))
-			old_fpu->last_cpu = -1;
-		else
+		/*
+		 * Avoid the expense of restoring fpregs with FNSAVE when it
+		 * might be unnecessary. XSAVE and FXSAVE preserve the FPU state.
+		 */
+		save_fpregs_to_fpstate(old_fpu);
+		if (likely(use_xsave() || use_fxsr()))
 			old_fpu->last_cpu = cpu;
+		else
+			old_fpu->last_cpu = -1;
 
 		/* But leave fpu_fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -83,16 +83,17 @@ bool irq_fpu_usable(void)
 EXPORT_SYMBOL(irq_fpu_usable);
 
 /*
- * These must be called with preempt disabled. Returns
- * 'true' if the FPU state is still intact and we can
- * keep registers active.
- *
- * The legacy FNSAVE instruction cleared all FPU state
- * unconditionally, so registers are essentially destroyed.
- * Modern FPU state can be kept in registers, if there are
- * no pending FP exceptions.
+ * Must be called with fpregs locked.
+ *
+ * Returns 'true' if the FPU state has been clobbered and the register
+ * contents are lost.
+ *
+ * The legacy FNSAVE instruction clobbers all FPU state unconditionally, so
+ * registers are essentially destroyed.
+ *
+ * XSAVE and FXSAVE preserve register contents.
  */
-int save_fpregs_to_fpstate(struct fpu *fpu)
+static bool __clobber_save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		xsave_to_kernel(&fpu->state.xsave);
@@ -103,23 +104,45 @@ int save_fpregs_to_fpstate(struct fpu *f
 		 */
 		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
 			fpu->avx512_timestamp = jiffies;
-		return 1;
+		return false;
 	}
 
 	if (likely(use_fxsr())) {
 		fxsave_to_kernel(&fpu->state.fxsave);
-		return 1;
+		return false;
 	}
 
-	/*
-	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
-	 * so we have to mark them inactive:
-	 */
+	/* Legacy FPU register saving, FNSAVE always clears FPU registers. */
 	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
+	return true;
+}
+
+/**
+ * save_fpregs_to_fpstate - Save fpregs in fpstate
+ * @fpu:	Pointer to FPU context
+ *
+ * Hardware register state might be clobbered when the
+ * function returns.
+ */
+void save_fpregs_to_fpstate(struct fpu *fpu)
+{
+	__clobber_save_fpregs_to_fpstate(fpu);
+}
+EXPORT_SYMBOL_GPL(save_fpregs_to_fpstate);
+
+/**
+ * copy_fpregs_to_fpstate - Copy fpregs to fpstate
+ * @fpu:	Pointer to FPU context
+ *
+ * Guarantees that the hardware register state is preserved.
+ */
+void copy_fpregs_to_fpstate(struct fpu *fpu)
+{
+	bool clobbered = __clobber_save_fpregs_to_fpstate(fpu);
 
-	return 0;
+	if (clobbered)
+		restore_fpregs_from_fpstate(&fpu->state);
 }
-EXPORT_SYMBOL(save_fpregs_to_fpstate);
 
 void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
@@ -133,10 +156,6 @@ void kernel_fpu_begin_mask(unsigned int
 	if (!(current->flags & PF_KTHREAD) &&
 	    !test_thread_flag(TIF_NEED_FPU_LOAD)) {
 		set_thread_flag(TIF_NEED_FPU_LOAD);
-		/*
-		 * Ignore return value -- we don't care if reg state
-		 * is clobbered.
-		 */
 		save_fpregs_to_fpstate(&current->thread.fpu);
 	}
 	__cpu_invalidate_fpregs_state();
@@ -160,7 +179,8 @@ void kernel_fpu_end(void)
 EXPORT_SYMBOL_GPL(kernel_fpu_end);
 
 /*
- * Save the FPU state (mark it for reload if necessary):
+ * Save the FPU register state. If the registers are active then they are
+ * preserved.
  *
  * This only ever gets called for the current task.
  */
@@ -171,11 +191,8 @@ void fpu__save(struct fpu *fpu)
 	fpregs_lock();
 	trace_x86_fpu_before_save(fpu);
 
-	if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
-		if (!save_fpregs_to_fpstate(fpu)) {
-			restore_fpregs_from_fpstate(&fpu->state);
-		}
-	}
+	if (!test_thread_flag(TIF_NEED_FPU_LOAD))
+		copy_fpregs_to_fpstate(fpu);
 
 	trace_x86_fpu_after_save(fpu);
 	fpregs_unlock();
@@ -245,18 +262,14 @@ int fpu__copy(struct task_struct *dst, s
 
 	/*
 	 * If the FPU registers are not current just memcpy() the state.
-	 * Otherwise save current FPU registers directly into the child's FPU
-	 * context, without any memory-to-memory copying.
-	 *
-	 * ( The function 'fails' in the FNSAVE case, which destroys
-	 *   register contents so we have to load them back. )
+	 * Otherwise copy current FPU registers directly into the child's
+	 * FPU context.
 	 */
 	fpregs_lock();
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		memcpy(&dst_fpu->state, &src_fpu->state, fpu_kernel_xstate_size);
-
-	else if (!save_fpregs_to_fpstate(dst_fpu))
-		restore_fpregs_from_fpstate(&dst_fpu->state);
+	else
+		copy_fpregs_to_fpstate(dst_fpu);
 
 	fpregs_unlock();
 

