* [PATCH 0/7] x86/fpu: Simplify the FPU state machine
@ 2017-01-26 11:26 Ingo Molnar
  2017-01-26 11:26 ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Ingo Molnar
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

Now that the lazy FPU switching code has been removed, this series capitalizes
on the simpler FPU context switching machinery and simplifies the x86 FPU
state machine code.

The new code relies on just two obvious fields, ::fpstate_active and ::fpregs_cached:

/*
 * Highest level per task FPU state data structure that
 * contains the FPU register state plus various FPU
 * state fields:
 */
struct fpu {
	/*
	 * @fpstate_active:
	 *
	 * This flag indicates whether this context is active: if the task
	 * is not running then we can restore from this context, if the task
	 * is running then we should save into this context.
	 */
	unsigned char			fpstate_active;

	/*
	 * @fpregs_cached:
	 *
	 * This flag tells us whether this context is loaded into a CPU
	 * right now.
	 *
	 * This is set to 0 if a task is migrated to another CPU.
	 */
	unsigned char			fpregs_cached;

	/*
	 * @state:
	 *
	 * In-memory copy of all FPU registers that we save/restore
	 * over context switches. If the task is using the FPU then
	 * the registers in the FPU are more recent than this state
	 * copy. If the task context-switches away then they get
	 * saved here and represent the FPU state.
	 */
	union fpregs_state		state;
	/*
	 * WARNING: 'state' is dynamically-sized.  Do not put
	 * anything after it here.
	 */
};

In particular ::last_cpu is replaced by a single ::fpregs_cached field
(which is invalidated on migration), and ::fpregs_active is gone.
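
For reference, with these two fields the register-cache validity check boils
down to checking both of them - a sketch, lightly paraphrased from the
fpregs_state_valid() change in patch #1:

static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
{
	/*
	 * The in-CPU register copy is only valid if this context still owns
	 * the CPU's registers (per-CPU pointer) and has not been invalidated
	 * by migration (per-task flag):
	 */
	return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && fpu->fpregs_cached;
}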

There's some code size reduction on the FPU side (the new scheduler callback
adds a few lines elsewhere - the full series diffstat is further below):

 7 files changed, 41 insertions(+), 87 deletions(-)

... and the code is a lot more obvious now as well, I think.

Lightly tested.

The code can also be fetched from:

   git fetch git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/fpu

Thanks,

	Ingo

Ingo Molnar (7):
  x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached
  x86/fpu: Simplify fpu->fpregs_active use
  x86/fpu: Make the fpu state change in fpu__clear() scheduler-atomic
  x86/fpu: Split the state handling in fpu__drop()
  x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active
  x86/fpu: Decouple fpregs_activate()/fpregs_deactivate() from fpu->fpregs_active
  x86/fpu: Remove struct fpu::fpregs_active

 arch/x86/include/asm/fpu/internal.h | 42 ++++++++++--------------------------------
 arch/x86/include/asm/fpu/types.h    | 37 +++++--------------------------------
 arch/x86/include/asm/switch_to.h    | 10 ++++++++++
 arch/x86/include/asm/trace/fpu.h    |  5 +----
 arch/x86/kernel/fpu/core.c          | 29 ++++++++++++++++-------------
 arch/x86/kernel/fpu/signal.c        |  9 +++++----
 arch/x86/mm/pkeys.c                 |  3 +--
 drivers/lguest/x86/core.c           |  2 +-
 kernel/sched/core.c                 |  2 ++
 kernel/sched/sched.h                |  8 ++++++++
 10 files changed, 59 insertions(+), 88 deletions(-)


* [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached
  2017-01-26 11:26 [PATCH 0/7] x86/fpu: Simplify the FPU state machine Ingo Molnar
@ 2017-01-26 11:26 ` Ingo Molnar
  2017-01-26 14:23   ` Rik van Riel
  2017-01-26 14:54   ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Rik van Riel
  2017-01-26 11:26 ` [PATCH 2/7] x86/fpu: Simplify fpu->fpregs_active use Ingo Molnar
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

fpu->last_cpu records the last CPU a given FPU context structure was used on.
This enables an important optimization: if a task schedules out to a kernel
thread and then gets scheduled back after only FPU-inactive kernel threads
executed, the FPU state in the registers is still intact and the FPU restore
can be skipped - speeding up the context switch.

The same logic can be implemented slightly simpler, by using a single boolean
flag: fpu->fpregs_cached tells us whether the context's FPU registers are
cached in the CPU.

The only difference is that this flag has to be invalidated when a task is
migrated away from its CPU - but that is a slow path compared to context
switches.
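
Concretely, the invalidation is a one-line arch callback on the migration
path (see the switch_to.h and kernel/sched hunks below):

	/* The task-migration arch callback clears the FPU registers cache: */
	static inline void arch_task_migrate(struct task_struct *p)
	{
		p->thread.fpu.fpregs_cached = 0;
	}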

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h | 15 ++++++++-------
 arch/x86/include/asm/fpu/types.h    | 24 ++++++++++--------------
 arch/x86/include/asm/switch_to.h    | 10 ++++++++++
 arch/x86/kernel/fpu/core.c          |  2 +-
 kernel/sched/core.c                 |  2 ++
 kernel/sched/sched.h                |  8 ++++++++
 6 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 255645f60ca2..2eaf93cf11cc 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -490,7 +490,7 @@ DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
 
 /*
  * The in-register FPU state for an FPU context on a CPU is assumed to be
- * valid if the fpu->last_cpu matches the CPU, and the fpu_fpregs_owner_ctx
+ * valid if fpu->fpregs_cached is still set, and if the fpu_fpregs_owner_ctx
  * matches the FPU.
  *
  * If the FPU register state is valid, the kernel can skip restoring the
@@ -512,12 +512,12 @@ static inline void __cpu_invalidate_fpregs_state(void)
 
 static inline void __fpu_invalidate_fpregs_state(struct fpu *fpu)
 {
-	fpu->last_cpu = -1;
+	fpu->fpregs_cached = 0;
 }
 
 static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
 {
-	return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && cpu == fpu->last_cpu;
+	return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && fpu->fpregs_cached;
 }
 
 /*
@@ -573,15 +573,16 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (old_fpu->fpregs_active) {
 		if (!copy_fpregs_to_fpstate(old_fpu))
-			old_fpu->last_cpu = -1;
+			old_fpu->fpregs_cached = 0;
 		else
-			old_fpu->last_cpu = cpu;
+			old_fpu->fpregs_cached = 1;
 
 		/* But leave fpu_fpregs_owner_ctx! */
 		old_fpu->fpregs_active = 0;
 		trace_x86_fpu_regs_deactivated(old_fpu);
-	} else
-		old_fpu->last_cpu = -1;
+	} else {
+		old_fpu->fpregs_cached = 0;
+	}
 }
 
 /*
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 3c80f5b9c09d..3090b0d7b232 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -276,20 +276,6 @@ union fpregs_state {
  */
 struct fpu {
 	/*
-	 * @last_cpu:
-	 *
-	 * Records the last CPU on which this context was loaded into
-	 * FPU registers. (In the lazy-restore case we might be
-	 * able to reuse FPU registers across multiple context switches
-	 * this way, if no intermediate task used the FPU.)
-	 *
-	 * A value of -1 is used to indicate that the FPU state in context
-	 * memory is newer than the FPU state in registers, and that the
-	 * FPU state should be reloaded next time the task is run.
-	 */
-	unsigned int			last_cpu;
-
-	/*
 	 * @fpstate_active:
 	 *
 	 * This flag indicates whether this context is active: if the task
@@ -322,6 +308,16 @@ struct fpu {
 	unsigned char			fpregs_active;
 
 	/*
+	 * @fpregs_cached:
+	 *
+	 * This flag tells us whether this context is loaded into a CPU
+	 * right now.
+	 *
+	 * This is set to 0 if a task is migrated to another CPU.
+	 */
+	unsigned char			fpregs_cached;
+
+	/*
 	 * @state:
 	 *
 	 * In-memory copy of all FPU registers that we save/restore
diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index fcc5cd387fd1..a7146dadb31d 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -72,4 +72,14 @@ do {									\
 	((last) = __switch_to_asm((prev), (next)));			\
 } while (0)
 
+
+/*
+ * The task-migration arch callback clears the FPU registers cache:
+ */
+static inline void arch_task_migrate(struct task_struct *p)
+{
+	p->thread.fpu.fpregs_cached = 0;
+}
+#define arch_task_migrate arch_task_migrate
+
 #endif /* _ASM_X86_SWITCH_TO_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index e1114f070c2d..287f1cb32b59 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -190,7 +190,7 @@ EXPORT_SYMBOL_GPL(fpstate_init);
 int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
 	dst_fpu->fpregs_active = 0;
-	dst_fpu->last_cpu = -1;
+	dst_fpu->fpregs_cached = 0;
 
 	if (!src_fpu->fpstate_active || !static_cpu_has(X86_FEATURE_FPU))
 		return 0;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c56fb57f2991..7eb2f3041fde 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1253,6 +1253,8 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
 			p->sched_class->migrate_task_rq(p);
 		p->se.nr_migrations++;
 		perf_event_task_migrate(p);
+
+		arch_task_migrate(p);
 	}
 
 	__set_task_cpu(p, new_cpu);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 7b34c7826ca5..ff8a894132e4 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1824,3 +1824,11 @@ static inline void cpufreq_update_this_cpu(struct rq *rq, unsigned int flags) {}
 #else /* arch_scale_freq_capacity */
 #define arch_scale_freq_invariant()	(false)
 #endif
+
+/*
+ * Default task-migration arch callback:
+ */
+#ifndef arch_task_migrate
+static inline void arch_task_migrate(struct task_struct *p) { }
+#endif
+
-- 
2.7.4


* [PATCH 2/7] x86/fpu: Simplify fpu->fpregs_active use
  2017-01-26 11:26 [PATCH 0/7] x86/fpu: Simplify the FPU state machine Ingo Molnar
  2017-01-26 11:26 ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Ingo Molnar
@ 2017-01-26 11:26 ` Ingo Molnar
  2017-01-26 16:30   ` Andy Lutomirski
  2017-01-26 11:26 ` [PATCH 3/7] x86/fpu: Make the fpu state change in fpu__clear() scheduler-atomic Ingo Molnar
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

The fpregs_active() inline function is pretty pointless - in almost
all the callsites it can be replaced with a direct fpu->fpregs_active
access.

Do so and eliminate the extra layer of obfuscation.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h | 17 +----------------
 arch/x86/kernel/fpu/core.c          |  2 +-
 arch/x86/kernel/fpu/signal.c        |  9 +++++----
 arch/x86/mm/pkeys.c                 |  2 +-
 drivers/lguest/x86/core.c           |  2 +-
 5 files changed, 9 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 2eaf93cf11cc..d6307e76837c 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -543,21 +543,6 @@ static inline void fpregs_activate(struct fpu *fpu)
 }
 
 /*
- * The question "does this thread have fpu access?"
- * is slightly racy, since preemption could come in
- * and revoke it immediately after the test.
- *
- * However, even in that very unlikely scenario,
- * we can just assume we have FPU access - typically
- * to save the FP state - we'll just take a #NM
- * fault and get the FPU access back.
- */
-static inline int fpregs_active(void)
-{
-	return current->thread.fpu.fpregs_active;
-}
-
-/*
  * FPU state switching for scheduling.
  *
  * This is a two-stage process:
@@ -618,7 +603,7 @@ static inline void user_fpu_begin(void)
 	struct fpu *fpu = &current->thread.fpu;
 
 	preempt_disable();
-	if (!fpregs_active())
+	if (!fpu->fpregs_active)
 		fpregs_activate(fpu);
 	preempt_enable();
 }
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 287f1cb32b59..8775343ca75b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -367,7 +367,7 @@ void fpu__current_fpstate_write_end(void)
 	 * registers may still be out of date.  Update them with
 	 * an XRSTOR if they are active.
 	 */
-	if (fpregs_active())
+	if (fpu->fpregs_active)
 		copy_kernel_to_fpregs(&fpu->state);
 
 	/*
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 2d682dac35d4..684025654d0c 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -155,7 +155,8 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf)
  */
 int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 {
-	struct xregs_state *xsave = &current->thread.fpu.state.xsave;
+	struct fpu *fpu = &current->thread.fpu;
+	struct xregs_state *xsave = &fpu->state.xsave;
 	struct task_struct *tsk = current;
 	int ia32_fxstate = (buf != buf_fx);
 
@@ -170,13 +171,13 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	if (fpregs_active() || using_compacted_format()) {
+	if (fpu->fpregs_active || using_compacted_format()) {
 		/* Save the live register state to the user directly. */
 		if (copy_fpregs_to_sigframe(buf_fx))
 			return -1;
 		/* Update the thread's fxstate to save the fsave header. */
 		if (ia32_fxstate)
-			copy_fxregs_to_kernel(&tsk->thread.fpu);
+			copy_fxregs_to_kernel(fpu);
 	} else {
 		/*
 		 * It is a *bug* if kernel uses compacted-format for xsave
@@ -189,7 +190,7 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			return -1;
 		}
 
-		fpstate_sanitize_xstate(&tsk->thread.fpu);
+		fpstate_sanitize_xstate(fpu);
 		if (__copy_to_user(buf_fx, xsave, fpu_user_xstate_size))
 			return -1;
 	}
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index 2dab69a706ec..e2c23472233e 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -45,7 +45,7 @@ int __execute_only_pkey(struct mm_struct *mm)
 	 */
 	preempt_disable();
 	if (!need_to_set_mm_pkey &&
-	    fpregs_active() &&
+	    current->thread.fpu.fpregs_active &&
 	    !__pkru_allows_read(read_pkru(), execute_only_pkey)) {
 		preempt_enable();
 		return execute_only_pkey;
diff --git a/drivers/lguest/x86/core.c b/drivers/lguest/x86/core.c
index d71f6323ac00..328974ae690d 100644
--- a/drivers/lguest/x86/core.c
+++ b/drivers/lguest/x86/core.c
@@ -289,7 +289,7 @@ void lguest_arch_run_guest(struct lg_cpu *cpu)
 	 * a different CPU. So all the critical stuff should be done
 	 * before this.
 	 */
-	else if (cpu->regs->trapnum == 7 && !fpregs_active())
+	else if (cpu->regs->trapnum == 7 && !current->thread.fpu.fpstate_active)
 		fpu__restore(&current->thread.fpu);
 }
 
-- 
2.7.4


* [PATCH 3/7] x86/fpu: Make the fpu state change in fpu__clear() scheduler-atomic
  2017-01-26 11:26 [PATCH 0/7] x86/fpu: Simplify the FPU state machine Ingo Molnar
  2017-01-26 11:26 ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Ingo Molnar
  2017-01-26 11:26 ` [PATCH 2/7] x86/fpu: Simplify fpu->fpregs_active use Ingo Molnar
@ 2017-01-26 11:26 ` Ingo Molnar
  2017-01-26 11:26 ` [PATCH 4/7] x86/fpu: Split the state handling in fpu__drop() Ingo Molnar
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

Do this only temporarily, to make it easier to change the FPU state machine.
In particular this change couples the fpu->fpregs_active and fpu->fpstate_active
states: they are only set/cleared together (as far as the scheduler sees them).

This will be removed by later patches.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8775343ca75b..7e1c77ac90e8 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -462,9 +462,11 @@ void fpu__clear(struct fpu *fpu)
 	 * Make sure fpstate is cleared and initialized.
 	 */
 	if (static_cpu_has(X86_FEATURE_FPU)) {
+		preempt_disable();
 		fpu__activate_curr(fpu);
 		user_fpu_begin();
 		copy_init_fpstate_to_fpregs();
+		preempt_enable();
 	}
 }
 
-- 
2.7.4


* [PATCH 4/7] x86/fpu: Split the state handling in fpu__drop()
  2017-01-26 11:26 [PATCH 0/7] x86/fpu: Simplify the FPU state machine Ingo Molnar
                   ` (2 preceding siblings ...)
  2017-01-26 11:26 ` [PATCH 3/7] x86/fpu: Make the fpu state change in fpu__clear() scheduler-atomic Ingo Molnar
@ 2017-01-26 11:26 ` Ingo Molnar
  2017-01-26 11:26 ` [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active Ingo Molnar
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

Prepare fpu__drop() for changing its fpu->fpregs_active use over to fpu->fpstate_active.

There are two distinct usecases for fpu__drop() in this context:
exit_thread() when called for 'current' in exit(), and when called
for another task in fork().

This patch does not change behavior, it only adds a couple of
debug checks and structures the code to make the ->fpregs_active
change more obviously correct.

All the complications will be removed later on.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7e1c77ac90e8..2973afdb39f1 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -414,12 +414,19 @@ void fpu__drop(struct fpu *fpu)
 {
 	preempt_disable();
 
-	if (fpu->fpregs_active) {
-		/* Ignore delayed exceptions from user space */
-		asm volatile("1: fwait\n"
-			     "2:\n"
-			     _ASM_EXTABLE(1b, 2b));
-		fpregs_deactivate(fpu);
+	if (fpu == &current->thread.fpu) {
+		WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
+
+		if (fpu->fpregs_active) {
+			/* Ignore delayed exceptions from user space */
+			asm volatile("1: fwait\n"
+				     "2:\n"
+				     _ASM_EXTABLE(1b, 2b));
+			if (fpu->fpregs_active)
+				fpregs_deactivate(fpu);
+		}
+	} else {
+		WARN_ON_FPU(fpu->fpregs_active);
 	}
 
 	fpu->fpstate_active = 0;
-- 
2.7.4


* [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active
  2017-01-26 11:26 [PATCH 0/7] x86/fpu: Simplify the FPU state machine Ingo Molnar
                   ` (3 preceding siblings ...)
  2017-01-26 11:26 ` [PATCH 4/7] x86/fpu: Split the state handling in fpu__drop() Ingo Molnar
@ 2017-01-26 11:26 ` Ingo Molnar
  2017-01-26 14:44   ` Rik van Riel
  2017-01-26 11:26 ` [PATCH 6/7] x86/fpu: Decouple fpregs_activate()/fpregs_deactivate() from fpu->fpregs_active Ingo Molnar
  2017-01-26 11:26 ` [PATCH 7/7] x86/fpu: Remove struct fpu::fpregs_active Ingo Molnar
  6 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

We want to simplify the FPU state machine by eliminating fpu->fpregs_active,
and we can do that because the two state flags (::fpregs_active and
::fpstate_active) are set essentially together.

The old lazy FPU switching code used to make a distinction - but there's
no lazy switching code anymore, we always switch in an 'eager' fashion.

Do this by first changing all substantial uses of fpu->fpregs_active
to fpu->fpstate_active and adding a few debug checks to double check
our assumption is correct.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h |  4 +++-
 arch/x86/kernel/fpu/core.c          | 16 ++++++++++------
 arch/x86/kernel/fpu/signal.c        |  4 +++-
 arch/x86/mm/pkeys.c                 |  3 +--
 4 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index d6307e76837c..3a1cd8344a6b 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -556,7 +556,9 @@ static inline void fpregs_activate(struct fpu *fpu)
 static inline void
 switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
-	if (old_fpu->fpregs_active) {
+	WARN_ON_FPU(old_fpu->fpregs_active != old_fpu->fpstate_active);
+
+	if (old_fpu->fpstate_active) {
 		if (!copy_fpregs_to_fpstate(old_fpu))
 			old_fpu->fpregs_cached = 0;
 		else
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2973afdb39f1..4fed80d08bd7 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -100,7 +100,7 @@ void __kernel_fpu_begin(void)
 
 	kernel_fpu_disable();
 
-	if (fpu->fpregs_active) {
+	if (fpu->fpstate_active) {
 		/*
 		 * Ignore return value -- we don't care if reg state
 		 * is clobbered.
@@ -116,7 +116,7 @@ void __kernel_fpu_end(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
-	if (fpu->fpregs_active)
+	if (fpu->fpstate_active)
 		copy_kernel_to_fpregs(&fpu->state);
 
 	kernel_fpu_enable();
@@ -147,8 +147,10 @@ void fpu__save(struct fpu *fpu)
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
 	preempt_disable();
+	WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
+
 	trace_x86_fpu_before_save(fpu);
-	if (fpu->fpregs_active) {
+	if (fpu->fpstate_active) {
 		if (!copy_fpregs_to_fpstate(fpu)) {
 			copy_kernel_to_fpregs(&fpu->state);
 		}
@@ -262,11 +264,12 @@ EXPORT_SYMBOL_GPL(fpu__activate_curr);
  */
 void fpu__activate_fpstate_read(struct fpu *fpu)
 {
+	WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
 	/*
 	 * If fpregs are active (in the current CPU), then
 	 * copy them to the fpstate:
 	 */
-	if (fpu->fpregs_active) {
+	if (fpu->fpstate_active) {
 		fpu__save(fpu);
 	} else {
 		if (!fpu->fpstate_active) {
@@ -362,12 +365,13 @@ void fpu__current_fpstate_write_end(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
+	WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
 	/*
 	 * 'fpu' now has an updated copy of the state, but the
 	 * registers may still be out of date.  Update them with
 	 * an XRSTOR if they are active.
 	 */
-	if (fpu->fpregs_active)
+	if (fpu->fpstate_active)
 		copy_kernel_to_fpregs(&fpu->state);
 
 	/*
@@ -417,7 +421,7 @@ void fpu__drop(struct fpu *fpu)
 	if (fpu == &current->thread.fpu) {
 		WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
 
-		if (fpu->fpregs_active) {
+		if (fpu->fpstate_active) {
 			/* Ignore delayed exceptions from user space */
 			asm volatile("1: fwait\n"
 				     "2:\n"
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 684025654d0c..a88083ba7f8b 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -171,7 +171,9 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	if (fpu->fpregs_active || using_compacted_format()) {
+	WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
+
+	if (fpu->fpstate_active || using_compacted_format()) {
 		/* Save the live register state to the user directly. */
 		if (copy_fpregs_to_sigframe(buf_fx))
 			return -1;
diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
index e2c23472233e..4d24269c071f 100644
--- a/arch/x86/mm/pkeys.c
+++ b/arch/x86/mm/pkeys.c
@@ -18,7 +18,6 @@
 
 #include <asm/cpufeature.h>             /* boot_cpu_has, ...            */
 #include <asm/mmu_context.h>            /* vma_pkey()                   */
-#include <asm/fpu/internal.h>           /* fpregs_active()              */
 
 int __execute_only_pkey(struct mm_struct *mm)
 {
@@ -45,7 +44,7 @@ int __execute_only_pkey(struct mm_struct *mm)
 	 */
 	preempt_disable();
 	if (!need_to_set_mm_pkey &&
-	    current->thread.fpu.fpregs_active &&
+	    current->thread.fpu.fpstate_active &&
 	    !__pkru_allows_read(read_pkru(), execute_only_pkey)) {
 		preempt_enable();
 		return execute_only_pkey;
-- 
2.7.4


* [PATCH 6/7] x86/fpu: Decouple fpregs_activate()/fpregs_deactivate() from fpu->fpregs_active
  2017-01-26 11:26 [PATCH 0/7] x86/fpu: Simplify the FPU state machine Ingo Molnar
                   ` (4 preceding siblings ...)
  2017-01-26 11:26 ` [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active Ingo Molnar
@ 2017-01-26 11:26 ` Ingo Molnar
  2017-01-26 11:26 ` [PATCH 7/7] x86/fpu: Remove struct fpu::fpregs_active Ingo Molnar
  6 siblings, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

fpregs_activate()/fpregs_deactivate() are currently called in a pattern like this:

	if (!fpu->fpregs_active)
		fpregs_activate(fpu);

	...

	if (fpu->fpregs_active)
		fpregs_deactivate(fpu);

But note that it's actually safe to call them without checking the flag first.

This further decouples the fpu->fpregs_active flag from actual FPU logic.
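
For example, with the check removed user_fpu_begin() simply becomes
(see the internal.h hunk below):

	static inline void user_fpu_begin(void)
	{
		struct fpu *fpu = &current->thread.fpu;

		preempt_disable();
		fpregs_activate(fpu);		/* safe to call unconditionally */
		preempt_enable();
	}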

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h | 7 +------
 arch/x86/kernel/fpu/core.c          | 3 +--
 2 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 3a1cd8344a6b..0f4590fbc471 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -526,8 +526,6 @@ static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
  */
 static inline void fpregs_deactivate(struct fpu *fpu)
 {
-	WARN_ON_FPU(!fpu->fpregs_active);
-
 	fpu->fpregs_active = 0;
 	this_cpu_write(fpu_fpregs_owner_ctx, NULL);
 	trace_x86_fpu_regs_deactivated(fpu);
@@ -535,8 +533,6 @@ static inline void fpregs_deactivate(struct fpu *fpu)
 
 static inline void fpregs_activate(struct fpu *fpu)
 {
-	WARN_ON_FPU(fpu->fpregs_active);
-
 	fpu->fpregs_active = 1;
 	this_cpu_write(fpu_fpregs_owner_ctx, fpu);
 	trace_x86_fpu_regs_activated(fpu);
@@ -605,8 +601,7 @@ static inline void user_fpu_begin(void)
 	struct fpu *fpu = &current->thread.fpu;
 
 	preempt_disable();
-	if (!fpu->fpregs_active)
-		fpregs_activate(fpu);
+	fpregs_activate(fpu);
 	preempt_enable();
 }
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 4fed80d08bd7..a025be6637cc 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -426,8 +426,7 @@ void fpu__drop(struct fpu *fpu)
 			asm volatile("1: fwait\n"
 				     "2:\n"
 				     _ASM_EXTABLE(1b, 2b));
-			if (fpu->fpregs_active)
-				fpregs_deactivate(fpu);
+			fpregs_deactivate(fpu);
 		}
 	} else {
 		WARN_ON_FPU(fpu->fpregs_active);
-- 
2.7.4


* [PATCH 7/7] x86/fpu: Remove struct fpu::fpregs_active
  2017-01-26 11:26 [PATCH 0/7] x86/fpu: Simplify the FPU state machine Ingo Molnar
                   ` (5 preceding siblings ...)
  2017-01-26 11:26 ` [PATCH 6/7] x86/fpu: Decouple fpregs_activate()/fpregs_deactivate() from fpu->fpregs_active Ingo Molnar
@ 2017-01-26 11:26 ` Ingo Molnar
  6 siblings, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 11:26 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

The previous changes paved the way for the removal of the
fpu::fpregs_active state flag - we now only have the
fpu::fpstate_active and fpu::fpregs_cached flags left.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h |  5 -----
 arch/x86/include/asm/fpu/types.h    | 23 -----------------------
 arch/x86/include/asm/trace/fpu.h    |  5 +----
 arch/x86/kernel/fpu/core.c          |  9 ---------
 arch/x86/kernel/fpu/signal.c        |  2 --
 5 files changed, 1 insertion(+), 43 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 0f4590fbc471..e62eee2e989e 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -526,14 +526,12 @@ static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
  */
 static inline void fpregs_deactivate(struct fpu *fpu)
 {
-	fpu->fpregs_active = 0;
 	this_cpu_write(fpu_fpregs_owner_ctx, NULL);
 	trace_x86_fpu_regs_deactivated(fpu);
 }
 
 static inline void fpregs_activate(struct fpu *fpu)
 {
-	fpu->fpregs_active = 1;
 	this_cpu_write(fpu_fpregs_owner_ctx, fpu);
 	trace_x86_fpu_regs_activated(fpu);
 }
@@ -552,8 +550,6 @@ static inline void fpregs_activate(struct fpu *fpu)
 static inline void
 switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
-	WARN_ON_FPU(old_fpu->fpregs_active != old_fpu->fpstate_active);
-
 	if (old_fpu->fpstate_active) {
 		if (!copy_fpregs_to_fpstate(old_fpu))
 			old_fpu->fpregs_cached = 0;
@@ -561,7 +557,6 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 			old_fpu->fpregs_cached = 1;
 
 		/* But leave fpu_fpregs_owner_ctx! */
-		old_fpu->fpregs_active = 0;
 		trace_x86_fpu_regs_deactivated(old_fpu);
 	} else {
 		old_fpu->fpregs_cached = 0;
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 3090b0d7b232..07452fbd7867 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -285,29 +285,6 @@ struct fpu {
 	unsigned char			fpstate_active;
 
 	/*
-	 * @fpregs_active:
-	 *
-	 * This flag determines whether a given context is actively
-	 * loaded into the FPU's registers and that those registers
-	 * represent the task's current FPU state.
-	 *
-	 * Note the interaction with fpstate_active:
-	 *
-	 *   # task does not use the FPU:
-	 *   fpstate_active == 0
-	 *
-	 *   # task uses the FPU and regs are active:
-	 *   fpstate_active == 1 && fpregs_active == 1
-	 *
-	 *   # the regs are inactive but still match fpstate:
-	 *   fpstate_active == 1 && fpregs_active == 0 && fpregs_owner == fpu
-	 *
-	 * The third state is what we use for the lazy restore optimization
-	 * on lazy-switching CPUs.
-	 */
-	unsigned char			fpregs_active;
-
-	/*
 	 * @fpregs_cached:
 	 *
 	 * This flag tells us whether this context is loaded into a CPU
diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index 342e59789fcd..da565aae9fd2 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -12,7 +12,6 @@ DECLARE_EVENT_CLASS(x86_fpu,
 
 	TP_STRUCT__entry(
 		__field(struct fpu *, fpu)
-		__field(bool, fpregs_active)
 		__field(bool, fpstate_active)
 		__field(u64, xfeatures)
 		__field(u64, xcomp_bv)
@@ -20,16 +19,14 @@ DECLARE_EVENT_CLASS(x86_fpu,
 
 	TP_fast_assign(
 		__entry->fpu		= fpu;
-		__entry->fpregs_active	= fpu->fpregs_active;
 		__entry->fpstate_active	= fpu->fpstate_active;
 		if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
 			__entry->xfeatures = fpu->state.xsave.header.xfeatures;
 			__entry->xcomp_bv  = fpu->state.xsave.header.xcomp_bv;
 		}
 	),
-	TP_printk("x86/fpu: %p fpregs_active: %d fpstate_active: %d xfeatures: %llx xcomp_bv: %llx",
+	TP_printk("x86/fpu: %p fpstate_active: %d xfeatures: %llx xcomp_bv: %llx",
 			__entry->fpu,
-			__entry->fpregs_active,
 			__entry->fpstate_active,
 			__entry->xfeatures,
 			__entry->xcomp_bv
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index a025be6637cc..217e37029585 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -147,8 +147,6 @@ void fpu__save(struct fpu *fpu)
 	WARN_ON_FPU(fpu != &current->thread.fpu);
 
 	preempt_disable();
-	WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
-
 	trace_x86_fpu_before_save(fpu);
 	if (fpu->fpstate_active) {
 		if (!copy_fpregs_to_fpstate(fpu)) {
@@ -191,7 +189,6 @@ EXPORT_SYMBOL_GPL(fpstate_init);
 
 int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
-	dst_fpu->fpregs_active = 0;
 	dst_fpu->fpregs_cached = 0;
 
 	if (!src_fpu->fpstate_active || !static_cpu_has(X86_FEATURE_FPU))
@@ -264,7 +261,6 @@ EXPORT_SYMBOL_GPL(fpu__activate_curr);
  */
 void fpu__activate_fpstate_read(struct fpu *fpu)
 {
-	WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
 	/*
 	 * If fpregs are active (in the current CPU), then
 	 * copy them to the fpstate:
@@ -365,7 +361,6 @@ void fpu__current_fpstate_write_end(void)
 {
 	struct fpu *fpu = &current->thread.fpu;
 
-	WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
 	/*
 	 * 'fpu' now has an updated copy of the state, but the
 	 * registers may still be out of date.  Update them with
@@ -419,8 +414,6 @@ void fpu__drop(struct fpu *fpu)
 	preempt_disable();
 
 	if (fpu == &current->thread.fpu) {
-		WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
-
 		if (fpu->fpstate_active) {
 			/* Ignore delayed exceptions from user space */
 			asm volatile("1: fwait\n"
@@ -428,8 +421,6 @@ void fpu__drop(struct fpu *fpu)
 				     _ASM_EXTABLE(1b, 2b));
 			fpregs_deactivate(fpu);
 		}
-	} else {
-		WARN_ON_FPU(fpu->fpregs_active);
 	}
 
 	fpu->fpstate_active = 0;
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a88083ba7f8b..629106e51a29 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -171,8 +171,6 @@ int copy_fpstate_to_sigframe(void __user *buf, void __user *buf_fx, int size)
 			sizeof(struct user_i387_ia32_struct), NULL,
 			(struct _fpstate_32 __user *) buf) ? -1 : 1;
 
-	WARN_ON_FPU(fpu->fpstate_active != fpu->fpregs_active);
-
 	if (fpu->fpstate_active || using_compacted_format()) {
 		/* Save the live register state to the user directly. */
 		if (copy_fpregs_to_sigframe(buf_fx))
-- 
2.7.4


* Re: [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached
  2017-01-26 11:26 ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Ingo Molnar
@ 2017-01-26 14:23   ` Rik van Riel
  2017-01-26 14:53     ` Ingo Molnar
  2017-01-26 14:54   ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Rik van Riel
  1 sibling, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2017-01-26 14:23 UTC (permalink / raw)
  To: Ingo Molnar, linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu

On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:
> 
> @@ -322,6 +308,16 @@ struct fpu {
>  	unsigned char			fpregs_active;
>  
>  	/*
> +	 * @fpregs_cached:
> +	 *
> +	 * This flag tells us whether this context is loaded into a
> CPU
> +	 * right now.

Not quite. You are still checking against fpu_fpregs_owner_ctx.

How about something like

      * This flag tells us whether this context was loaded into
      * its current CPU; fpu_fpregs_owner_ctx will tell us whether
      * this context is actually in the registers.

> +	 *
> +	 * This is set to 0 if a task is migrated to another CPU.
> +	 */
> +	unsigned char			fpregs_cached;
> +
> +	/*
>  	 * @state:
>  	 *
>  	 * In-memory copy of all FPU registers that we save/restore


* Re: [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active
  2017-01-26 11:26 ` [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active Ingo Molnar
@ 2017-01-26 14:44   ` Rik van Riel
  2017-01-26 15:16     ` Ingo Molnar
  0 siblings, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2017-01-26 14:44 UTC (permalink / raw)
  To: Ingo Molnar, linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu

On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:
> We want to simplify the FPU state machine by eliminating fpu-
> >fpregs_active,
> and we can do that because the two state flags (::fpregs_active and
> ::fpstate_active) are set essentially together.
> 
> The old lazy FPU switching code used to make a distinction - but
> there's
> no lazy switching code anymore, we always switch in an 'eager'
> fashion.

I've been working for a while now to fix that for
KVM VCPU threads.

Currently when we switch to a VCPU thread, we first
load that thread's userspace FPU context, and then
soon after we save that, and load the guest side FPU
context.

When a VCPU thread goes idle, we also go through
two FPU context transitions.

In order to skip the unnecessary FPU context switches
for VCPU threads, I have been relying on separate
fpstate_active and fpregs_active states.

Do you have any ideas on how I could implement that
kind of change without separate fpstate_active and
fpregs_active states?


* Re: [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached
  2017-01-26 14:23   ` Rik van Riel
@ 2017-01-26 14:53     ` Ingo Molnar
  2017-01-26 15:05       ` [PATCH] x86/fpu: Unify the naming of the FPU register cache validity flags Ingo Molnar
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 14:53 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu


* Rik van Riel <riel@redhat.com> wrote:

> On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:
> > 
> > @@ -322,6 +308,16 @@ struct fpu {
> >  	unsigned char			fpregs_active;
> >  
> >  	/*
> > +	 * @fpregs_cached:
> > +	 *
> > +	 * This flag tells us whether this context is loaded into a
> > CPU
> > +	 * right now.
> 
> Not quite. You are still checking against fpu_fpregs_owner_ctx.

> How about something like
> 
>       * This flag tells us whether this context was loaded into
>       * its current CPU; fpu_fpregs_owner_ctx will tell us whether
>       * this context is actually in the registers.

That's still not quite accurate: if ->fpregs_cached is 0 and fpu_fpregs_owner_ctx 
is still pointing to the FPU structure then the context is not actually in the 
registers anymore - it's a stale copy of some past version.

These values simply tell us whether an in-memory FPU context's latest version is 
in CPU registers or not: both have to be valid for the in-CPU registers to be 
valid and current. The fpu_fpregs_owner_ctx pointer is a per-CPU data structure 
that tells us this fact, the ->fpregs_cached flag tells us the same - but it is 
placed into the task/fpu structure.

Clearing any of those values invalidates the cache and the point of keeping them 
split is implementation efficiency: for some invalidations it's easier to use the 
per-cpu structure, for some others (such as ptrace access) it's easier to access 
the per-task flag. The FPU switch-in code has easy access to both values so 
there's no extra cost from having the cache validity flag split into two parts.
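
In code terms (with the names as of this series), each half of the cache
validity state has its own cheap invalidation helper, so whichever is more
convenient can be used at a given call site:

	/* Clear the per-task half of the register-cache validity state: */
	static inline void __fpu_invalidate_fpregs_state(struct fpu *fpu)
	{
		fpu->fpregs_cached = 0;
	}

	/* Clear the per-CPU half: */
	static inline void __cpu_invalidate_fpregs_state(void)
	{
		__this_cpu_write(fpu_fpregs_owner_ctx, NULL);
	}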

A consequence of this is that a correct implementation could in theory eliminate 
any of the two flags:

 - We could use only fpu_fpregs_owner_ctx and remove ->fpregs_cached, in this case
   the ptrace codepaths would have to invalidate the fpu_fpregs_owner_ctx pointer 
   which requires some care as it's not just a local CPU modification, i.e. a 
   single cmpxchg() would be required to invalidate the register state.

 - Or we could use only ->fpregs_cached and eliminate fpu_fpregs_owner_ctx: this 
   would be awkward from the kernel_fpu_begin()/end() API codepaths, which has no 
   easy access to the task that has its FPU context cached in the CPU registers. 
   (Which might not be the current task.)

So I think the best implementation is to have both flags, and to use the one that 
is the most efficient to access to drive the invalidations from.

What we could do is to unify the naming to explain all this a bit better - right 
now there's very little indication that ->fpregs_cached is closely related to 
fpu_fpregs_owner_ctx.

For example we could rename them to:

	->fpregs_cached         =>     ->fpregs_owner        [bool]
	fpu_fpregs_owner_ctx    =>       fpregs_owner_ctx    [ptr]

?

Clearing ->fpregs_owner or setting fpregs_owner_ctx to NULL invalidates the cache 
and it's clear from the naming that the two values are closely related.

Would this work for you?

Thanks,

	Ingo


* Re: [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached
  2017-01-26 11:26 ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Ingo Molnar
  2017-01-26 14:23   ` Rik van Riel
@ 2017-01-26 14:54   ` Rik van Riel
  2017-01-26 15:09     ` Ingo Molnar
  2017-01-26 16:51     ` Andy Lutomirski
  1 sibling, 2 replies; 22+ messages in thread
From: Rik van Riel @ 2017-01-26 14:54 UTC (permalink / raw)
  To: Ingo Molnar, linux-kernel
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu

On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:

> index c56fb57f2991..7eb2f3041fde 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1253,6 +1253,8 @@ void set_task_cpu(struct task_struct *p,
> unsigned int new_cpu)
>  			p->sched_class->migrate_task_rq(p);
>  		p->se.nr_migrations++;
>  		perf_event_task_migrate(p);
> +
> +		arch_task_migrate(p);
>  	}
> 

Does it really count as a "simplification" if you add a
scheduler callback?

This code does not seem any easier to understand than
the old code...


* [PATCH] x86/fpu: Unify the naming of the FPU register cache validity flags
  2017-01-26 14:53     ` Ingo Molnar
@ 2017-01-26 15:05       ` Ingo Molnar
  2017-01-26 15:31         ` Peter Zijlstra
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 15:05 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu


* Ingo Molnar <mingo@kernel.org> wrote:

> What we could do is to unify the naming to explain all this a bit better - right 
> now there's very little indication that ->fpregs_cached is closely related to 
> fpu_fpregs_owner_ctx.
> 
> For example we could rename them to:
> 
> 	->fpregs_cached         =>     ->fpregs_owner        [bool]
> 	fpu_fpregs_owner_ctx    =>       fpregs_owner_ctx    [ptr]
> 
> ?
> 
> Clearing ->fpregs_owner or setting fpregs_owner_ctx to NULL invalidates the 
> cache and it's clear from the naming that the two values are closely related.

Something like the patch below - only minimally tested.

Thanks,

	Ingo

================>
>From d5b99e1e25f86d4880bf85588eb4a4769610dd47 Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@kernel.org>
Date: Thu, 26 Jan 2017 16:01:00 +0100
Subject: [PATCH] x86/fpu: Unify the naming of the FPU register cache validity flags

Rik pointed out that the fpu_fpregs_owner_ctx and the ->fpregs_cached
flags are still described in a confusing way.

Clarify this some more by renaming them to:

        ->fpregs_cached         =>     ->fpregs_owner        [bool]
        fpu_fpregs_owner_ctx    =>       fpregs_owner_ctx    [ptr]

... which better expresses that they are a cache validity flag
split into two parts, where the cache can be invalidated if any
of the flags is cleared.

Also describe this relationship more accurately in the fpu/types.h header.

No change in functionality.

Reported-by: Rik van Riel <riel@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h | 22 +++++++++++-----------
 arch/x86/include/asm/fpu/types.h    | 12 ++++++++++--
 arch/x86/include/asm/switch_to.h    |  2 +-
 arch/x86/kernel/fpu/core.c          |  4 ++--
 arch/x86/kernel/smpboot.c           |  2 +-
 5 files changed, 25 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index e62eee2e989e..bbee00aac864 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -486,11 +486,11 @@ extern int copy_fpstate_to_sigframe(void __user *buf, void __user *fp, int size)
  * FPU context switch related helper methods:
  */
 
-DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
+DECLARE_PER_CPU(struct fpu *, fpregs_owner_ctx);
 
 /*
  * The in-register FPU state for an FPU context on a CPU is assumed to be
- * valid if fpu->fpregs_cached is still set, and if the fpu_fpregs_owner_ctx
+ * valid if fpu->fpregs_owner is still set, and if the fpregs_owner_ctx
  * matches the FPU.
  *
  * If the FPU register state is valid, the kernel can skip restoring the
@@ -507,17 +507,17 @@ DECLARE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
  */
 static inline void __cpu_invalidate_fpregs_state(void)
 {
-	__this_cpu_write(fpu_fpregs_owner_ctx, NULL);
+	__this_cpu_write(fpregs_owner_ctx, NULL);
 }
 
 static inline void __fpu_invalidate_fpregs_state(struct fpu *fpu)
 {
-	fpu->fpregs_cached = 0;
+	fpu->fpregs_owner = 0;
 }
 
 static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
 {
-	return fpu == this_cpu_read_stable(fpu_fpregs_owner_ctx) && fpu->fpregs_cached;
+	return fpu == this_cpu_read_stable(fpregs_owner_ctx) && fpu->fpregs_owner;
 }
 
 /*
@@ -526,13 +526,13 @@ static inline int fpregs_state_valid(struct fpu *fpu, unsigned int cpu)
  */
 static inline void fpregs_deactivate(struct fpu *fpu)
 {
-	this_cpu_write(fpu_fpregs_owner_ctx, NULL);
+	this_cpu_write(fpregs_owner_ctx, NULL);
 	trace_x86_fpu_regs_deactivated(fpu);
 }
 
 static inline void fpregs_activate(struct fpu *fpu)
 {
-	this_cpu_write(fpu_fpregs_owner_ctx, fpu);
+	this_cpu_write(fpregs_owner_ctx, fpu);
 	trace_x86_fpu_regs_activated(fpu);
 }
 
@@ -552,14 +552,14 @@ switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (old_fpu->fpstate_active) {
 		if (!copy_fpregs_to_fpstate(old_fpu))
-			old_fpu->fpregs_cached = 0;
+			old_fpu->fpregs_owner = 0;
 		else
-			old_fpu->fpregs_cached = 1;
+			old_fpu->fpregs_owner = 1;
 
-		/* But leave fpu_fpregs_owner_ctx! */
+		/* But leave fpregs_owner_ctx! */
 		trace_x86_fpu_regs_deactivated(old_fpu);
 	} else {
-		old_fpu->fpregs_cached = 0;
+		old_fpu->fpregs_owner = 0;
 	}
 }
 
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 07452fbd7867..d15cbfe0e8c4 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -285,14 +285,22 @@ struct fpu {
 	unsigned char			fpstate_active;
 
 	/*
-	 * @fpregs_cached:
+	 * @fpregs_owner:
 	 *
 	 * This flag tells us whether this context is loaded into a CPU
 	 * right now.
 	 *
 	 * This is set to 0 if a task is migrated to another CPU.
+	 *
+	 * NOTE: the fpregs_owner_ctx percpu pointer also has to point to
+	 *       this FPU context for the register cache to be valid. If any
+	 *       of these two flags is cleared then the cache is invalid.
+	 *       Some internals can access the context-flag more easily,
+	 *       others have easier access to the percpu variable. The
+	 *       FPU context-switching code has access to both so there's
+	 *       very little cost of having the cache indexed in two ways:
 	 */
-	unsigned char			fpregs_cached;
+	unsigned char			fpregs_owner;
 
 	/*
 	 * @state:
diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index a7146dadb31d..7a4915dd0547 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -78,7 +78,7 @@ do {									\
  */
 static inline void arch_task_migrate(struct task_struct *p)
 {
-	p->thread.fpu.fpregs_cached = 0;
+	p->thread.fpu.fpregs_owner = 0;
 }
 #define arch_task_migrate arch_task_migrate
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 217e37029585..1b3bf98072fe 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -39,7 +39,7 @@ static DEFINE_PER_CPU(bool, in_kernel_fpu);
 /*
  * Track which context is using the FPU on the CPU:
  */
-DEFINE_PER_CPU(struct fpu *, fpu_fpregs_owner_ctx);
+DEFINE_PER_CPU(struct fpu *, fpregs_owner_ctx);
 
 static void kernel_fpu_disable(void)
 {
@@ -189,7 +189,7 @@ EXPORT_SYMBOL_GPL(fpstate_init);
 
 int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
-	dst_fpu->fpregs_cached = 0;
+	dst_fpu->fpregs_owner = 0;
 
 	if (!src_fpu->fpstate_active || !static_cpu_has(X86_FEATURE_FPU))
 		return 0;
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 46732dc3b73c..8bf24aa01878 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1118,7 +1118,7 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 		return err;
 
 	/* the FPU context is blank, nobody can own it */
-	per_cpu(fpu_fpregs_owner_ctx, cpu) = NULL;
+	per_cpu(fpregs_owner_ctx, cpu) = NULL;
 
 	common_cpu_up(cpu, tidle);
 


* Re: [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached
  2017-01-26 14:54   ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Rik van Riel
@ 2017-01-26 15:09     ` Ingo Molnar
  2017-01-26 16:51     ` Andy Lutomirski
  1 sibling, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 15:09 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu


* Rik van Riel <riel@redhat.com> wrote:

> On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:
> 
> > index c56fb57f2991..7eb2f3041fde 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1253,6 +1253,8 @@ void set_task_cpu(struct task_struct *p,
> > unsigned int new_cpu)
> >  			p->sched_class->migrate_task_rq(p);
> >  		p->se.nr_migrations++;
> >  		perf_event_task_migrate(p);
> > +
> > +		arch_task_migrate(p);
> >  	}
> > 
> 
> Does it really count as a "simplification" if you add a
> scheduler callback?
> 
> This code does not seem any easier to understand than
> the old code...

See the extra commit I added on top:

  7deff4369276 x86/fpu: Unify the naming of the FPU register cache validity flags

which makes it clearer, we now have:
	
	->fpregs_owner             [bool]
          fpregs_owner_ctx         [ptr]

These are set to 1 and to the context pointer respectively when a task with no
FPU state is scheduled in and the state of the previous task is preserved (cached)
in the FPU registers - and that FPU register state cache can be invalidated
afterwards by clearing either of the two values.

That should make their overall meaning clearer, in that they represent a single
'cache' whose validity flag is split into two copies, either of which can be used
to invalidate the cache.

Thanks,

	Ingo


* Re: [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active
  2017-01-26 14:44   ` Rik van Riel
@ 2017-01-26 15:16     ` Ingo Molnar
  2017-01-26 15:45       ` Rik van Riel
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 15:16 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu


* Rik van Riel <riel@redhat.com> wrote:

> On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:
> > We want to simplify the FPU state machine by eliminating fpu-
> > >fpregs_active,
> > and we can do that because the two state flags (::fpregs_active and
> > ::fpstate_active) are set essentially together.
> > 
> > The old lazy FPU switching code used to make a distinction - but
> > there's
> > no lazy switching code anymore, we always switch in an 'eager'
> > fashion.
> 
> I've been working for a while now to fix that for
> KVM VCPU threads.
> 
> Currently when we switch to a VCPU thread, we first
> load that thread's userspace FPU context, and then
> soon after we save that, and load the guest side FPU
> context.
> 
> When a VCPU thread goes idle, we also go through
> two FPU context transitions.
> 
> In order to skip the unnecessary FPU context switches
> for VCPU threads, I have been relying on separate
> fpstate_active and fpregs_active states.
> 
> Do you have any ideas on how I could implement that
> kind of change without separate fpstate_active and
> fpregs_active states?

So the vCPU threads have host side FPU (user-space) state - whatever FPU state 
Qemu has?

One solution to that overhead, without complicating the FPU state machine in any 
way, would be to add a facility to drop/reacquire that FPU state.

That should automatically result in zero FPU state switching AFAICS: kernel 
threads don't do FPU state switching either.

The vCPU threads sometimes do return to user-space, when they get some deep 
exception that needs to be handled by Qemu, right? This aspect shouldn't be a big 
problem either, because the regular calling convention is to call (synchronous) 
system calls without holding FPU state, right?

I.e. the vCPU /dev/kvm ioctl() could drop/re-map the FPU state with very little 
overhead (no full save/restore required in that code path either) when it 
enters/exits vCPU mode.
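
Very roughly - pseudo-code, the helper names are made up just to sketch the 
idea, they are not existing APIs:

	static long kvm_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
	{
		long r;

		/* Entering vCPU mode: stop acting as an FPU-using user task: */
		fpu__drop_user_state(&current->thread.fpu);

		/* The guest FPU state owns the registers while the guest runs: */
		r = vcpu_run_loop(vcpu);

		/* Leaving vCPU mode: re-map the user FPU state before returning: */
		fpu__reacquire_user_state(&current->thread.fpu);

		return r;
	}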

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/fpu: Unify the naming of the FPU register cache validity flags
  2017-01-26 15:05       ` [PATCH] x86/fpu: Unify the naming of the FPU register cache validity flags Ingo Molnar
@ 2017-01-26 15:31         ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2017-01-26 15:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rik van Riel, linux-kernel, Andrew Morton, Andy Lutomirski,
	Borislav Petkov, Dave Hansen, Fenghua Yu, H . Peter Anvin,
	Linus Torvalds, Oleg Nesterov, Thomas Gleixner, Yu-cheng Yu

On Thu, Jan 26, 2017 at 04:05:25PM +0100, Ingo Molnar wrote:
> diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
> index a7146dadb31d..7a4915dd0547 100644
> --- a/arch/x86/include/asm/switch_to.h
> +++ b/arch/x86/include/asm/switch_to.h
> @@ -78,7 +78,7 @@ do {									\
>   */
>  static inline void arch_task_migrate(struct task_struct *p)
>  {
> -	p->thread.fpu.fpregs_cached = 0;
> +	p->thread.fpu.fpregs_owner = 0;
>  }
>  #define arch_task_migrate arch_task_migrate

I still really dislike having this callback...

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active
  2017-01-26 15:16     ` Ingo Molnar
@ 2017-01-26 15:45       ` Rik van Riel
  2017-01-26 15:53         ` Ingo Molnar
  0 siblings, 1 reply; 22+ messages in thread
From: Rik van Riel @ 2017-01-26 15:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu

On Thu, 2017-01-26 at 16:16 +0100, Ingo Molnar wrote:
> * Rik van Riel <riel@redhat.com> wrote:
> 
> > On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:
> > > We want to simplify the FPU state machine by eliminating fpu-
> > > > fpregs_active,
> > > 
> > > and we can do that because the two state flags (::fpregs_active
> > > and
> > > ::fpstate_active) are set essentially together.
> > > 
> > > The old lazy FPU switching code used to make a distinction - but
> > > there's
> > > no lazy switching code anymore, we always switch in an 'eager'
> > > fashion.
> > 
> > I've been working for a while now to fix that for
> > KVM VCPU threads.
> > 
> > Currently when we switch to a VCPU thread, we first
> > load that thread's userspace FPU context, and then
> > soon after we save that, and load the guest side FPU
> > context.
> > 
> > When a VCPU thread goes idle, we also go through
> > two FPU context transitions.
> > 
> > In order to skip the unnecessary FPU context switches
> > for VCPU threads, I have been relying on separate
> > fpstate_active and fpregs_active states.
> > 
> > Do you have any ideas on how I could implement that
> > kind of change without separate fpstate_active and
> > fpregs_active states?
> 
> So the vCPU threads have host side FPU (user-space) state - whatever
> FPU state 
> Qemu has?

Indeed.

> I.e. the vCPU /dev/kvm ioctl() could drop/re-map the FPU state with
> very little 
> overhead (i.e. no full save/restore required in that code path
> either), when it 
> enters/exits vCPU mode.

Remapping might be best.  If we remap, we do not need to call
kernel_fpu_begin/end around actually going into the guest, and
we can hang onto the guest FPU context while doing stuff inside
the host kernel, even while going to sleep in the host kernel.

Let me go totally reimplement this whole project in a different
way...

At least I found some good FPU bugs and cleanups along the way.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active
  2017-01-26 15:45       ` Rik van Riel
@ 2017-01-26 15:53         ` Ingo Molnar
  2017-01-26 17:00           ` Andy Lutomirski
  0 siblings, 1 reply; 22+ messages in thread
From: Ingo Molnar @ 2017-01-26 15:53 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu


* Rik van Riel <riel@redhat.com> wrote:

> Let me go totally reimplement this whole project in a different way...

Note that I can still be convinced about complicating the FPU state machine as 
well if that ends up being the best approach for KVM - but it appears to me (from 
a very superficial look) that turning vCPU threads into no-FPU kthreads or 
representing the guest FPU state directly with the host FPU context would be even 
more beneficial, from the simplicity and KVM performance POV?

> At least I found some good FPU bugs and cleanups along the way.

Absolutely, and your efforts are much appreciated! This ptrace state handling 
madness that bit you on SkyLake was something I missed entirely.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/7] x86/fpu: Simplify fpu->fpregs_active use
  2017-01-26 11:26 ` [PATCH 2/7] x86/fpu: Simplify fpu->fpregs_active use Ingo Molnar
@ 2017-01-26 16:30   ` Andy Lutomirski
  0 siblings, 0 replies; 22+ messages in thread
From: Andy Lutomirski @ 2017-01-26 16:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: linux-kernel, Andrew Morton, Borislav Petkov, Dave Hansen,
	Fenghua Yu, H . Peter Anvin, Linus Torvalds, Oleg Nesterov,
	Peter Zijlstra, Rik van Riel, Thomas Gleixner, Yu-cheng Yu

On Thu, Jan 26, 2017 at 3:26 AM, Ingo Molnar <mingo@kernel.org> wrote:
> The fpregs_active() inline function is pretty pointless - in almost
> all the callsites it can be replaced with a direct fpu->fpregs_active
> access.
>
> Do so and eliminate the extra layer of obfuscation.

Reviewed-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached
  2017-01-26 14:54   ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Rik van Riel
  2017-01-26 15:09     ` Ingo Molnar
@ 2017-01-26 16:51     ` Andy Lutomirski
  1 sibling, 0 replies; 22+ messages in thread
From: Andy Lutomirski @ 2017-01-26 16:51 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Ingo Molnar, linux-kernel, Andrew Morton, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu

On Thu, Jan 26, 2017 at 6:54 AM, Rik van Riel <riel@redhat.com> wrote:
> On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:
>
>> index c56fb57f2991..7eb2f3041fde 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1253,6 +1253,8 @@ void set_task_cpu(struct task_struct *p,
>> unsigned int new_cpu)
>>                       p->sched_class->migrate_task_rq(p);
>>               p->se.nr_migrations++;
>>               perf_event_task_migrate(p);
>> +
>> +             arch_task_migrate(p);
>>       }
>>
>
> Does it really count as a "simplification" if you add a
> scheduler callback?
>
> This code does not seem any easier to understand than
> the old code...

I think I lean toward liking Ingo's version better.  The old code most
likely saved an instruction, but the new code gets the point across
quite nicely.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active
  2017-01-26 15:53         ` Ingo Molnar
@ 2017-01-26 17:00           ` Andy Lutomirski
  2017-01-26 18:04             ` Rik van Riel
  0 siblings, 1 reply; 22+ messages in thread
From: Andy Lutomirski @ 2017-01-26 17:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rik van Riel, linux-kernel, Andrew Morton, Borislav Petkov,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu

On Thu, Jan 26, 2017 at 7:53 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Rik van Riel <riel@redhat.com> wrote:
>
>> Let me go totally reimplement this whole project in a different way...
>
> Note that I can still be convinced about complicating the FPU state machine as
> well if that ends up being the best approach for KVM - but it appears to me (from
> a very superficial look) that turning vCPU threads into no-FPU kthreads or
> representing the guest FPU state directly with the host FPU context would be even
> more beneficial, from the simplicity and KVM performance POV?

I may be misunderstanding you, but I don't see how this would work
without getting either messy or slow.

But I think that your series may still be a good base for Rik's work.
With your series applied, there are three possible FPU states: regs
active (regs are in the CPU), regs inactive (in memory), and regs
cached (in memory *and* regs).  Rik's series doesn't really complicate
the state machine -- there are still just these three
states.  The difference is that it's possible for the regs to be
inactive or cached even for the current task so long as we're not in
user mode.  The point being that the user vCPU thread can enter the
kernel, get its FPU state inactivated, enter the guest, and reenter
the kernel without reactivating its regs.
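
Spelled out (an illustrative enum, not something that exists in the patches):

	enum fpregs_state_kind {
		FPREGS_ACTIVE,		/* regs live in the CPU			*/
		FPREGS_INACTIVE,	/* regs live in memory (fpu->state)	*/
		FPREGS_CACHED,		/* regs in memory *and* still in the CPU */
	};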

Rik, if you think about it that way, does your work map cleanly onto
Ingo's patches?

Ingo, as far as I know, the only serious conceptual complication is
that this change has the potential to interact poorly with PKRU, but
that should be manageable.

--Andy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active
  2017-01-26 17:00           ` Andy Lutomirski
@ 2017-01-26 18:04             ` Rik van Riel
  0 siblings, 0 replies; 22+ messages in thread
From: Rik van Riel @ 2017-01-26 18:04 UTC (permalink / raw)
  To: Andy Lutomirski, Ingo Molnar
  Cc: linux-kernel, Andrew Morton, Borislav Petkov, pbonzini,
	Dave Hansen, Fenghua Yu, H . Peter Anvin, Linus Torvalds,
	Oleg Nesterov, Peter Zijlstra, Thomas Gleixner, Yu-cheng Yu

On Thu, 2017-01-26 at 09:00 -0800, Andy Lutomirski wrote:
> On Thu, Jan 26, 2017 at 7:53 AM, Ingo Molnar <mingo@kernel.org>
> wrote:
> > 
> > * Rik van Riel <riel@redhat.com> wrote:
> > 
> > > Let me go totally reimplement this whole project in a different
> > > way...
> > 
> > Note that I can still be convinced about complicating the FPU state
> > machine as
> > well if that ends up being the best approach for KVM - but it
> > appears to me (from
> > a very superficial look) that turning vCPU threads into no-FPU
> > kthreads or
> > representing the guest FPU state directly with the host FPU context
> > would be even
> > more beneficial, from the simplicity and KVM performance POV?
> 
> I may be misunderstanding you, but I don't see how this would work
> without getting either messy or slow.
> 
> But I think that your series may still be a good base for Rik's work.
> With your series applied, there are three possible FPU states: regs
> active (regs are in the CPU), regs inactive (in memory), and regs
> cached (in memory *and* regs).  What Rik's series does doesn't really
> complicate the state machine -- there are still just these three
> states.  The difference is that it's possible for the regs to be
> inactive or cached even for the current task so long as we're not in
> user mode.  The point being that the user vCPU thread can enter the
> kernel, get its FPU state inactivated, enter the guest, and reenter
> the kernel without reactivating its regs.
> 
> Rik, if you think about it that way, does your work map cleanly onto
> Ingo's patches?

It does, but the discussion with Ingo also led me to reconsider
an approach I looked at before.

A task could have multiple FPU structures associated with it.
In kvm_vcpu_ioctl(KVM_RUN) we could save the userspace context,
and load the guest FPU context.

Once we are about ready to return to userspace, we can save the
guest FPU context, and load the userspace FPU context.

The only complication is that signal handling and ptrace need
to access the _userspace_ FPU context, even if it is not the
currently used one for the task.

That means we cannot just swap out the contents of
current->thread.fpu, but we need to keep a pointer to the
currently used FPU in current->thread, and have the signal
and ptrace code always work on the userspace FPU data,
which means the in-register data if it is loaded, or the
memory data if it isn't.
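
Roughly what I have in mind - the field names are made up and I'm hand-waving 
over the details:

	struct thread_struct {
		...
		struct fpu	*fpu_in_use;	/* &thread.fpu, or the guest FPU */
		struct fpu	fpu;		/* user-space FPU context, as today */
	};

	/* kvm_vcpu_ioctl(KVM_RUN): switch the task over to the guest FPU context */
	current->thread.fpu_in_use = &vcpu->arch.guest_fpu;

	/* Signal handling and ptrace keep operating on the user-space context: */
	struct fpu *fpu = &task->thread.fpu;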

On the KVM side, we should be able to drop kernel_fpu_begin
and kernel_fpu_end from entering/leaving the guest. All we
need to swap out in that spot will be the PKRU keys.

The "is the FPU still loaded?" stuff at context switch time
would ensure that guest FPU state loading can be skipped if
all that was run between guest exit and re-entry is kernel
threads.

I suspect this could be slightly lower complexity than the
approach I had been working on, for essentially the same
performance benefit.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-01-26 18:04 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-26 11:26 [PATCH 0/7] x86/fpu: Simplify the FPU state machine Ingo Molnar
2017-01-26 11:26 ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Ingo Molnar
2017-01-26 14:23   ` Rik van Riel
2017-01-26 14:53     ` Ingo Molnar
2017-01-26 15:05       ` [PATCH] x86/fpu: Unify the naming of the FPU register cache validity flags Ingo Molnar
2017-01-26 15:31         ` Peter Zijlstra
2017-01-26 14:54   ` [PATCH 1/7] x86/fpu: Simplify the fpu->last_cpu logic and rename it to fpu->fpregs_cached Rik van Riel
2017-01-26 15:09     ` Ingo Molnar
2017-01-26 16:51     ` Andy Lutomirski
2017-01-26 11:26 ` [PATCH 2/7] x86/fpu: Simplify fpu->fpregs_active use Ingo Molnar
2017-01-26 16:30   ` Andy Lutomirski
2017-01-26 11:26 ` [PATCH 3/7] x86/fpu: Make the fpu state change in fpu__clear() scheduler-atomic Ingo Molnar
2017-01-26 11:26 ` [PATCH 4/7] x86/fpu: Split the state handling in fpu__drop() Ingo Molnar
2017-01-26 11:26 ` [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active Ingo Molnar
2017-01-26 14:44   ` Rik van Riel
2017-01-26 15:16     ` Ingo Molnar
2017-01-26 15:45       ` Rik van Riel
2017-01-26 15:53         ` Ingo Molnar
2017-01-26 17:00           ` Andy Lutomirski
2017-01-26 18:04             ` Rik van Riel
2017-01-26 11:26 ` [PATCH 6/7] x86/fpu: Decouple fpregs_activate()/fpregs_deactivate() from fpu->fpregs_active Ingo Molnar
2017-01-26 11:26 ` [PATCH 7/7] x86/fpu: Remove struct fpu::fpregs_active Ingo Molnar
