LKML Archive on lore.kernel.org
* [PATCH v2 0/5] x86/fpu: eagerfpu fixes, speedups, and default enablement
@ 2016-01-24 22:38 Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 1/5] x86/fpu: Fix math emulation in eager fpu mode Andy Lutomirski
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Andy Lutomirski @ 2016-01-24 22:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Borislav Petkov, Fenghua Yu, Oleg Nesterov, Peter Zijlstra,
	Sai Praneeth Prakhya, yu-cheng yu, Dave Hansen, Rik van Riel,
	Linus Torvalds, Andy Lutomirski

Hi all-

Here's v2 to celebrate the end of the merge window :)

Patches 1, 2, and 3 are fixes.

Patch 4 is probably a small speedup.  It also only matters in lazy
FPU mode, which means that, most likely, no one cares.  Apply or
don't -- I don't care much.

Patch 5 is, in some sense, a radical change.  Currently we select
eager or lazy mode depending on CPU type.  I think that lazy mode
sucks and that we should deprecate and remove it.

With patches 1-3 applied, I think that eagerfpu works on all
systems.  Patch 5 will use it on all systems subject to a chicken
flag -- eagerfpu=off will still disable it.
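
As an illustration of the chicken-flag behavior described above, here is a
minimal user-space sketch. It assumes a whitespace-separated command line,
and every name in it (has_bool_option, parse_eagerfpu, enum eagerfpu_mode) is
hypothetical rather than a kernel API. The kernel itself uses
cmdline_find_option_bool() and also clears the eager-only features when the
flag is given, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum eagerfpu_mode { EAGERFPU_ENABLE, EAGERFPU_DISABLE };

/*
 * Return true if `opt` (assumed non-empty) appears in `cmdline` as a
 * whole space-delimited word.
 */
static bool has_bool_option(const char *cmdline, const char *opt)
{
	size_t len = strlen(opt);
	const char *p = cmdline;

	while ((p = strstr(p, opt)) != NULL) {
		bool starts_word = (p == cmdline) || (p[-1] == ' ');
		bool ends_word = (p[len] == '\0') || (p[len] == ' ');

		if (starts_word && ends_word)
			return true;
		p += len;
	}
	return false;
}

/* Eager is the default; only an explicit "eagerfpu=off" changes it. */
static enum eagerfpu_mode parse_eagerfpu(const char *cmdline)
{
	if (has_bool_option(cmdline, "eagerfpu=off"))
		return EAGERFPU_DISABLE;
	return EAGERFPU_ENABLE;
}
```

In this model "eagerfpu=on" needs no handling of its own, because it only
restates the default.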

I propose that we apply patch 5, let it soak in -next until the 4.6
merge window opens, possibly let it actually land in 4.6, and then
remove lazy mode entirely for 4.7.  This will open up enormous
cleanup possibilities, and it will make the fpu code vastly more
comprehensible.

Changes from v1:
 - Get rid of cpu_has_fpu (Boris)

Andy Lutomirski (5):
  x86/fpu: Fix math emulation in eager fpu mode
  x86/fpu: Fix FNSAVE usage in eagerfpu mode
  x86/fpu: Fold fpu_copy into fpu__copy
  x86/fpu: Speed up lazy FPU restores slightly
  x86/fpu: Default eagerfpu=on on all CPUs

 arch/x86/include/asm/fpu/internal.h |  3 ++-
 arch/x86/kernel/fpu/core.c          | 52 +++++++++++++++++++------------------
 arch/x86/kernel/fpu/init.c          | 13 ++++------
 arch/x86/kernel/traps.c             |  3 +--
 4 files changed, 35 insertions(+), 36 deletions(-)

-- 
2.5.0


* [PATCH v2 1/5] x86/fpu: Fix math emulation in eager fpu mode
  2016-01-24 22:38 [PATCH v2 0/5] x86/fpu: eagerfpu fixes, speedups, and default enablement Andy Lutomirski
@ 2016-01-24 22:38 ` Andy Lutomirski
  2016-02-09 16:10   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode Andy Lutomirski
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2016-01-24 22:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Borislav Petkov, Fenghua Yu, Oleg Nesterov, Peter Zijlstra,
	Sai Praneeth Prakhya, yu-cheng yu, Dave Hansen, Rik van Riel,
	Linus Torvalds, Andy Lutomirski

Systems without an FPU are generally old and therefore use lazy FPU
switching.  Unsurprisingly, math emulation in eager FPU mode is a
bit buggy.  Fix it.

There were two bugs involving kernel code trying to use the FPU
registers in eager mode even if they didn't exist and one BUG_ON
that was incorrect.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h | 3 ++-
 arch/x86/kernel/fpu/core.c          | 2 +-
 arch/x86/kernel/traps.c             | 1 -
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 0fd440df63f1..a1f78a9fbf41 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -589,7 +589,8 @@ switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
 	 * If the task has used the math, pre-load the FPU on xsave processors
 	 * or if the past 5 consecutive context-switches used math.
 	 */
-	fpu.preload = new_fpu->fpstate_active &&
+	fpu.preload = static_cpu_has(X86_FEATURE_FPU) &&
+		      new_fpu->fpstate_active &&
 		      (use_eager_fpu() || new_fpu->counter > 5);
 
 	if (old_fpu->fpregs_active) {
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d25097c3fc1d..08e1e11a05ca 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -423,7 +423,7 @@ void fpu__clear(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
 
-	if (!use_eager_fpu()) {
+	if (!use_eager_fpu() || !static_cpu_has(X86_FEATURE_FPU)) {
 		/* FPU state will be reallocated lazily at the first use. */
 		fpu__drop(fpu);
 	} else {
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index ade185a46b1d..87f80febf477 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -750,7 +750,6 @@ dotraplinkage void
 do_device_not_available(struct pt_regs *regs, long error_code)
 {
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
-	BUG_ON(use_eager_fpu());
 
 #ifdef CONFIG_MATH_EMULATION
 	if (read_cr0() & X86_CR0_EM) {
-- 
2.5.0


* [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode
  2016-01-24 22:38 [PATCH v2 0/5] x86/fpu: eagerfpu fixes, speedups, and default enablement Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 1/5] x86/fpu: Fix math emulation in eager fpu mode Andy Lutomirski
@ 2016-01-24 22:38 ` Andy Lutomirski
  2016-01-25 15:40   ` Dave Hansen
  2016-02-09 16:10   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 3/5] x86/fpu: Fold fpu_copy into fpu__copy Andy Lutomirski
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 14+ messages in thread
From: Andy Lutomirski @ 2016-01-24 22:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Borislav Petkov, Fenghua Yu, Oleg Nesterov, Peter Zijlstra,
	Sai Praneeth Prakhya, yu-cheng yu, Dave Hansen, Rik van Riel,
	Linus Torvalds, Andy Lutomirski

In eager fpu mode, having deactivated fpu without immediately
reloading some other context is illegal.  Therefore, to recover from
FNSAVE, we can't just deactivate the state -- we need to reload it
if we're not actively context switching.

We had this wrong in fpu__save and fpu__copy.  Fix both.
__kernel_fpu_begin was fine -- add a comment.

This fixes a warning triggerable with nofxsr eagerfpu=on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 08e1e11a05ca..7a9244df33e2 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -114,6 +114,10 @@ void __kernel_fpu_begin(void)
 	kernel_fpu_disable();
 
 	if (fpu->fpregs_active) {
+		/*
+		 * Ignore return value -- we don't care if reg state
+		 * is clobbered.
+		 */
 		copy_fpregs_to_fpstate(fpu);
 	} else {
 		this_cpu_write(fpu_fpregs_owner_ctx, NULL);
@@ -189,8 +193,12 @@ void fpu__save(struct fpu *fpu)
 
 	preempt_disable();
 	if (fpu->fpregs_active) {
-		if (!copy_fpregs_to_fpstate(fpu))
-			fpregs_deactivate(fpu);
+		if (!copy_fpregs_to_fpstate(fpu)) {
+			if (use_eager_fpu())
+				copy_kernel_to_fpregs(&fpu->state);
+			else
+				fpregs_deactivate(fpu);
+		}
 	}
 	preempt_enable();
 }
@@ -259,7 +267,11 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	preempt_disable();
 	if (!copy_fpregs_to_fpstate(dst_fpu)) {
 		memcpy(&src_fpu->state, &dst_fpu->state, xstate_size);
-		fpregs_deactivate(src_fpu);
+
+		if (use_eager_fpu())
+			copy_kernel_to_fpregs(&src_fpu->state);
+		else
+			fpregs_deactivate(src_fpu);
 	}
 	preempt_enable();
 }
-- 
2.5.0


* [PATCH v2 3/5] x86/fpu: Fold fpu_copy into fpu__copy
  2016-01-24 22:38 [PATCH v2 0/5] x86/fpu: eagerfpu fixes, speedups, and default enablement Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 1/5] x86/fpu: Fix math emulation in eager fpu mode Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode Andy Lutomirski
@ 2016-01-24 22:38 ` Andy Lutomirski
  2016-02-09 16:10   ` [tip:x86/fpu] x86/fpu: Fold fpu_copy() into fpu__copy() tip-bot for Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 4/5] x86/fpu: Speed up lazy FPU restores slightly Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 5/5] x86/fpu: Default eagerfpu=on on all CPUs Andy Lutomirski
  4 siblings, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2016-01-24 22:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Borislav Petkov, Fenghua Yu, Oleg Nesterov, Peter Zijlstra,
	Sai Praneeth Prakhya, yu-cheng yu, Dave Hansen, Rik van Riel,
	Linus Torvalds, Andy Lutomirski

Splitting it into two functions needlessly obfuscated the code.
While we're at it, improve the comment slightly.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7a9244df33e2..299b58bb975b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -231,14 +231,15 @@ void fpstate_init(union fpregs_state *state)
 }
 EXPORT_SYMBOL_GPL(fpstate_init);
 
-/*
- * Copy the current task's FPU state to a new task's FPU context.
- *
- * In both the 'eager' and the 'lazy' case we save hardware registers
- * directly to the destination buffer.
- */
-static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
+int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
+	dst_fpu->counter = 0;
+	dst_fpu->fpregs_active = 0;
+	dst_fpu->last_cpu = -1;
+
+	if (!src_fpu->fpstate_active || !cpu_has_fpu)
+		return 0;
+
 	WARN_ON_FPU(src_fpu != &current->thread.fpu);
 
 	/*
@@ -251,10 +252,9 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	/*
 	 * Save current FPU registers directly into the child
 	 * FPU context, without any memory-to-memory copying.
-	 *
-	 * If the FPU context got destroyed in the process (FNSAVE
-	 * done on old CPUs) then copy it back into the source
-	 * context and mark the current task for lazy restore.
+	 * In lazy mode, if the FPU context isn't loaded into
+	 * fpregs, CR0.TS will be set and do_device_not_available
+	 * will load the FPU context.
 	 *
 	 * We have to do all this with preemption disabled,
 	 * mostly because of the FNSAVE case, because in that
@@ -274,16 +274,6 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 			fpregs_deactivate(src_fpu);
 	}
 	preempt_enable();
-}
-
-int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
-{
-	dst_fpu->counter = 0;
-	dst_fpu->fpregs_active = 0;
-	dst_fpu->last_cpu = -1;
-
-	if (src_fpu->fpstate_active && cpu_has_fpu)
-		fpu_copy(dst_fpu, src_fpu);
 
 	return 0;
 }
-- 
2.5.0


* [PATCH v2 4/5] x86/fpu: Speed up lazy FPU restores slightly
  2016-01-24 22:38 [PATCH v2 0/5] x86/fpu: eagerfpu fixes, speedups, and default enablement Andy Lutomirski
                   ` (2 preceding siblings ...)
  2016-01-24 22:38 ` [PATCH v2 3/5] x86/fpu: Fold fpu_copy into fpu__copy Andy Lutomirski
@ 2016-01-24 22:38 ` Andy Lutomirski
  2016-02-09 16:11   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
  2016-01-24 22:38 ` [PATCH v2 5/5] x86/fpu: Default eagerfpu=on on all CPUs Andy Lutomirski
  4 siblings, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2016-01-24 22:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Borislav Petkov, Fenghua Yu, Oleg Nesterov, Peter Zijlstra,
	Sai Praneeth Prakhya, yu-cheng yu, Dave Hansen, Rik van Riel,
	Linus Torvalds, Andy Lutomirski

If we have an FPU, there's no need to check CR0 for FPU emulation.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 87f80febf477..36a9c017540e 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -752,7 +752,7 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
 
 #ifdef CONFIG_MATH_EMULATION
-	if (read_cr0() & X86_CR0_EM) {
+	if (!boot_cpu_has(X86_FEATURE_FPU) && (read_cr0() & X86_CR0_EM)) {
 		struct math_emu_info info = { };
 
 		conditional_sti(regs);
-- 
2.5.0


* [PATCH v2 5/5] x86/fpu: Default eagerfpu=on on all CPUs
  2016-01-24 22:38 [PATCH v2 0/5] x86/fpu: eagerfpu fixes, speedups, and default enablement Andy Lutomirski
                   ` (3 preceding siblings ...)
  2016-01-24 22:38 ` [PATCH v2 4/5] x86/fpu: Speed up lazy FPU restores slightly Andy Lutomirski
@ 2016-01-24 22:38 ` Andy Lutomirski
  2016-02-09 16:11   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
  4 siblings, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2016-01-24 22:38 UTC (permalink / raw)
  To: x86, linux-kernel
  Cc: Borislav Petkov, Fenghua Yu, Oleg Nesterov, Peter Zijlstra,
	Sai Praneeth Prakhya, yu-cheng yu, Dave Hansen, Rik van Riel,
	Linus Torvalds, Andy Lutomirski

We have eager and lazy fpu modes, introduced in 304bceda6a18 ("x86,
fpu: use non-lazy fpu restore for processors supporting xsave").

The result is rather messy.  There are two code paths in almost all
of the FPU code, and only one of them (the eager case) is tested
frequently, since most kernel developers have new enough hardware
that we use eagerfpu.

It seems that, on any remotely recent hardware, eagerfpu is a win:
glibc uses SSE2, so laziness is probably overoptimistic, and, in any
case, manipulating TS is far slower than saving and restoring the
full state.  (Stores to CR0.TS are serializing and are poorly
optimized.)

To try to shake out any latent issues on old hardware, this changes
the default to eager on all CPUs.  If no performance or functionality
problems show up, a subsequent patch could remove lazy mode entirely.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/kernel/fpu/init.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index d53ab3d3b8e8..e12cc0ad368e 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -262,7 +262,10 @@ static void __init fpu__init_system_xstate_size_legacy(void)
  * not only saved the restores along the way, but we also have the
  * FPU ready to be used for the original task.
  *
- * 'eager' switching is used on modern CPUs, there we switch the FPU
+ * 'lazy' is deprecated because it's almost never a performance win
+ * and it's much more complicated than 'eager'.
+ *
+ * 'eager' switching is by default on all CPUs, there we switch the FPU
  * state during every context switch, regardless of whether the task
  * has used FPU instructions in that time slice or not. This is done
  * because modern FPU context saving instructions are able to optimize
@@ -273,7 +276,7 @@ static void __init fpu__init_system_xstate_size_legacy(void)
  *   to use 'eager' restores, if we detect that a task is using the FPU
  *   frequently. See the fpu->counter logic in fpu/internal.h for that. ]
  */
-static enum { AUTO, ENABLE, DISABLE } eagerfpu = AUTO;
+static enum { ENABLE, DISABLE } eagerfpu = ENABLE;
 
 /*
  * Find supported xfeatures based on cpu features and command-line input.
@@ -350,15 +353,9 @@ static void __init fpu__init_system_ctx_switch(void)
  */
 static void __init fpu__init_parse_early_param(void)
 {
-	/*
-	 * No need to check "eagerfpu=auto" again, since it is the
-	 * initial default.
-	 */
 	if (cmdline_find_option_bool(boot_command_line, "eagerfpu=off")) {
 		eagerfpu = DISABLE;
 		fpu__clear_eager_fpu_features();
-	} else if (cmdline_find_option_bool(boot_command_line, "eagerfpu=on")) {
-		eagerfpu = ENABLE;
 	}
 
 	if (cmdline_find_option_bool(boot_command_line, "no387"))
-- 
2.5.0


* Re: [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode
  2016-01-24 22:38 ` [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode Andy Lutomirski
@ 2016-01-25 15:40   ` Dave Hansen
  2016-01-25 17:25     ` Andy Lutomirski
  2016-02-09 16:10   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
  1 sibling, 1 reply; 14+ messages in thread
From: Dave Hansen @ 2016-01-25 15:40 UTC (permalink / raw)
  To: Andy Lutomirski, x86, linux-kernel
  Cc: Borislav Petkov, Fenghua Yu, Oleg Nesterov, Peter Zijlstra,
	Sai Praneeth Prakhya, yu-cheng yu, Rik van Riel, Linus Torvalds

On 01/24/2016 02:38 PM, Andy Lutomirski wrote:
>  	if (fpu->fpregs_active) {
> +		/*
> +		 * Ignore return value -- we don't care if reg state
> +		 * is clobbered.
> +		 */
>  		copy_fpregs_to_fpstate(fpu);
>  	} else {
>  		this_cpu_write(fpu_fpregs_owner_ctx, NULL);
> @@ -189,8 +193,12 @@ void fpu__save(struct fpu *fpu)
>  
>  	preempt_disable();
>  	if (fpu->fpregs_active) {
> -		if (!copy_fpregs_to_fpstate(fpu))
> -			fpregs_deactivate(fpu);
> +		if (!copy_fpregs_to_fpstate(fpu)) {
> +			if (use_eager_fpu())
> +				copy_kernel_to_fpregs(&fpu->state);
> +			else
> +				fpregs_deactivate(fpu);
> +		}
>  	}
>  	preempt_enable();

I wonder if we should just make the

> +			if (use_eager_fpu())
> +				copy_kernel_to_fpregs(&fpu->state);
> +			else
> +				fpregs_deactivate(fpu);

behavior the default _inside_ copy_fpregs_to_fpstate(fpu).  We evidently
got it wrong in 2/3 of the call sites that needed it.  It ends up being
an optimization for FNSAVE (because it allows us to avoid an FRSTOR),
but we only take advantage of that in cases of kernel_fpu_begin/end().

FXSAVE has been around since at _least_ 1999, and I'd expect it to get
used in place of FNSAVE everywhere that it is available.

If we don't want to do that, maybe we should add a "clobber" argument to
copy_fpregs_to_fpstate() for when it's allowed to clobber the register
state.

I just hate putting this logic at all the call sites.
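
A rough user-space sketch of the "clobber" argument idea, to make the
suggestion concrete. Everything here is a hypothetical model: struct
model_fpu, model_save_raw() and model_save() are invented names, plain
integers stand in for register state, and preemption is not modeled at
all. It is not how the kernel implements copy_fpregs_to_fpstate().

```c
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel's data structures. */
struct model_fpu {
	int regs;            /* stands in for the live hardware registers */
	int state;           /* stands in for the in-memory save area */
	bool fpregs_active;  /* registers currently hold this task's state */
};

static bool model_eager;       /* stands in for use_eager_fpu() */
static bool model_has_fxsave;  /* false models an FNSAVE-only CPU */

/*
 * Save registers to memory. Returns true if the registers are still
 * intact afterwards, false if the save clobbered them (FNSAVE model).
 */
static bool model_save_raw(struct model_fpu *fpu)
{
	fpu->state = fpu->regs;
	if (model_has_fxsave)
		return true;
	fpu->regs = 0;  /* FNSAVE reinitializes the FPU after storing */
	return false;
}

/*
 * Save with a "may_clobber" argument: callers that need the registers
 * to stay usable get the recovery here instead of at each call site.
 */
static void model_save(struct model_fpu *fpu, bool may_clobber)
{
	if (model_save_raw(fpu) || may_clobber)
		return;

	if (model_eager)
		fpu->regs = fpu->state;      /* reload, like copy_kernel_to_fpregs() */
	else
		fpu->fpregs_active = false;  /* defer, like fpregs_deactivate() */
}
```

In this model a __kernel_fpu_begin()-style caller would pass
may_clobber = true and skip the reload, while fpu__save()- and
fpu__copy()-style callers would pass false.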


* Re: [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode
  2016-01-25 15:40   ` Dave Hansen
@ 2016-01-25 17:25     ` Andy Lutomirski
  2016-01-25 17:26       ` Dave Hansen
  0 siblings, 1 reply; 14+ messages in thread
From: Andy Lutomirski @ 2016-01-25 17:25 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Yu-cheng Yu, Fenghua Yu, Borislav Petkov, linux-kernel,
	Oleg Nesterov, Sai Praneeth Prakhya, X86 ML, Linus Torvalds,
	Rik van Riel, Peter Zijlstra

On Jan 25, 2016 7:41 AM, "Dave Hansen" <dave.hansen@linux.intel.com> wrote:
>
> On 01/24/2016 02:38 PM, Andy Lutomirski wrote:
> >       if (fpu->fpregs_active) {
> > +             /*
> > +              * Ignore return value -- we don't care if reg state
> > +              * is clobbered.
> > +              */
> >               copy_fpregs_to_fpstate(fpu);
> >       } else {
> >               this_cpu_write(fpu_fpregs_owner_ctx, NULL);
> > @@ -189,8 +193,12 @@ void fpu__save(struct fpu *fpu)
> >
> >       preempt_disable();
> >       if (fpu->fpregs_active) {
> > -             if (!copy_fpregs_to_fpstate(fpu))
> > -                     fpregs_deactivate(fpu);
> > +             if (!copy_fpregs_to_fpstate(fpu)) {
> > +                     if (use_eager_fpu())
> > +                             copy_kernel_to_fpregs(&fpu->state);
> > +                     else
> > +                             fpregs_deactivate(fpu);
> > +             }
> >       }
> >       preempt_enable();
>
> I wonder if we should just make the
>
> > +                     if (use_eager_fpu())
> > +                             copy_kernel_to_fpregs(&fpu->state);
> > +                     else
> > +                             fpregs_deactivate(fpu);
>
> behavior the default _inside_ copy_fpregs_to_fpstate(fpu).  We evidently
> got it wrong in 2/3 of the call sites that needed it.  It ends up being
> an optimization for FNSAVE (because it allows us to avoid an FRSTOR),
> but we only take advantage of that in cases of kernel_fpu_begin/end().
>
> FXSAVE has been around since at _least_ 1999, and I'd expect it to get
> used in place of FNSAVE everywhere that it is available.
>
> If we don't want to do that, maybe we should add a "clobber" argument to
> copy_fpregs_to_fpstate() for when it's allowed to clobber the register
> state.
>
> I just hate putting this logic at all the call sites.

Me too.  I was thinking about having a clobber and a non-clobber variant.
The tricky part is that we have to think about preemption, too.  In
theory, copying fpregs to somewhere other than the normal spot can be
okay with preemption on except in the FNSAVE case, but all the callers
probably need preemption off anyway.

Even if we do the cleanup, I think I'd rather fix the bug in place
first so the diff is clearer and then clean it up on top of that.

Does that seem reasonable?


--Andy


* Re: [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode
  2016-01-25 17:25     ` Andy Lutomirski
@ 2016-01-25 17:26       ` Dave Hansen
  0 siblings, 0 replies; 14+ messages in thread
From: Dave Hansen @ 2016-01-25 17:26 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Yu-cheng Yu, Fenghua Yu, Borislav Petkov, linux-kernel,
	Oleg Nesterov, Sai Praneeth Prakhya, X86 ML, Linus Torvalds,
	Rik van Riel, Peter Zijlstra

On 01/25/2016 09:25 AM, Andy Lutomirski wrote:
> Even if we do the cleanup, I think I'd rather fix the bug in place
> first so the diff is clearer and then clean it up on top of that.
> 
> Does that seem reasonable?

Yup, sounds fine to me.


* [tip:x86/fpu] x86/fpu: Fix math emulation in eager fpu mode
  2016-01-24 22:38 ` [PATCH v2 1/5] x86/fpu: Fix math emulation in eager fpu mode Andy Lutomirski
@ 2016-02-09 16:10   ` " tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-09 16:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, peterz, sai.praneeth.prakhya, fenghua.yu, bp, yu-cheng.yu,
	luto, riel, linux-kernel, luto, dave.hansen, hpa,
	quentin.casasnovas, torvalds, oleg, tglx

Commit-ID:  4ecd16ec7059390b430af34bd8bc3ca2b5dcef9a
Gitweb:     http://git.kernel.org/tip/4ecd16ec7059390b430af34bd8bc3ca2b5dcef9a
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Sun, 24 Jan 2016 14:38:06 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 9 Feb 2016 15:42:55 +0100

x86/fpu: Fix math emulation in eager fpu mode

Systems without an FPU are generally old and therefore use lazy FPU
switching. Unsurprisingly, math emulation in eager FPU mode is a
bit buggy. Fix it.

There were two bugs involving kernel code trying to use the FPU
registers in eager mode even if they didn't exist and one BUG_ON()
that was incorrect.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/b4b8d112436bd6fab866e1b4011131507e8d7fbe.1453675014.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/fpu/internal.h | 3 ++-
 arch/x86/kernel/fpu/core.c          | 2 +-
 arch/x86/kernel/traps.c             | 1 -
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 0fd440d..a1f78a9 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -589,7 +589,8 @@ switch_fpu_prepare(struct fpu *old_fpu, struct fpu *new_fpu, int cpu)
 	 * If the task has used the math, pre-load the FPU on xsave processors
 	 * or if the past 5 consecutive context-switches used math.
 	 */
-	fpu.preload = new_fpu->fpstate_active &&
+	fpu.preload = static_cpu_has(X86_FEATURE_FPU) &&
+		      new_fpu->fpstate_active &&
 		      (use_eager_fpu() || new_fpu->counter > 5);
 
 	if (old_fpu->fpregs_active) {
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index d25097c..08e1e11 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -423,7 +423,7 @@ void fpu__clear(struct fpu *fpu)
 {
 	WARN_ON_FPU(fpu != &current->thread.fpu); /* Almost certainly an anomaly */
 
-	if (!use_eager_fpu()) {
+	if (!use_eager_fpu() || !static_cpu_has(X86_FEATURE_FPU)) {
 		/* FPU state will be reallocated lazily at the first use. */
 		fpu__drop(fpu);
 	} else {
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index ade185a..87f80fe 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -750,7 +750,6 @@ dotraplinkage void
 do_device_not_available(struct pt_regs *regs, long error_code)
 {
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
-	BUG_ON(use_eager_fpu());
 
 #ifdef CONFIG_MATH_EMULATION
 	if (read_cr0() & X86_CR0_EM) {


* [tip:x86/fpu] x86/fpu: Fix FNSAVE usage in eagerfpu mode
  2016-01-24 22:38 ` [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode Andy Lutomirski
  2016-01-25 15:40   ` Dave Hansen
@ 2016-02-09 16:10   ` " tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 14+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-09 16:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: sai.praneeth.prakhya, bp, riel, fenghua.yu, dave.hansen,
	yu-cheng.yu, linux-kernel, peterz, luto, quentin.casasnovas,
	tglx, oleg, mingo, hpa, torvalds, luto

Commit-ID:  5ed73f40735c68d8a656b46d09b1885d3b8740ae
Gitweb:     http://git.kernel.org/tip/5ed73f40735c68d8a656b46d09b1885d3b8740ae
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Sun, 24 Jan 2016 14:38:07 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 9 Feb 2016 15:42:55 +0100

x86/fpu: Fix FNSAVE usage in eagerfpu mode

In eager fpu mode, having deactivated FPU without immediately
reloading some other context is illegal.  Therefore, to recover from
FNSAVE, we can't just deactivate the state -- we need to reload it
if we're not actively context switching.

We had this wrong in fpu__save() and fpu__copy().  Fix both.
__kernel_fpu_begin() was fine -- add a comment.

This fixes a warning triggerable with nofxsr eagerfpu=on.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/60662444e13c76f06e23c15c5dcdba31b4ac3d67.1453675014.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 08e1e11..7a9244d 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -114,6 +114,10 @@ void __kernel_fpu_begin(void)
 	kernel_fpu_disable();
 
 	if (fpu->fpregs_active) {
+		/*
+		 * Ignore return value -- we don't care if reg state
+		 * is clobbered.
+		 */
 		copy_fpregs_to_fpstate(fpu);
 	} else {
 		this_cpu_write(fpu_fpregs_owner_ctx, NULL);
@@ -189,8 +193,12 @@ void fpu__save(struct fpu *fpu)
 
 	preempt_disable();
 	if (fpu->fpregs_active) {
-		if (!copy_fpregs_to_fpstate(fpu))
-			fpregs_deactivate(fpu);
+		if (!copy_fpregs_to_fpstate(fpu)) {
+			if (use_eager_fpu())
+				copy_kernel_to_fpregs(&fpu->state);
+			else
+				fpregs_deactivate(fpu);
+		}
 	}
 	preempt_enable();
 }
@@ -259,7 +267,11 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	preempt_disable();
 	if (!copy_fpregs_to_fpstate(dst_fpu)) {
 		memcpy(&src_fpu->state, &dst_fpu->state, xstate_size);
-		fpregs_deactivate(src_fpu);
+
+		if (use_eager_fpu())
+			copy_kernel_to_fpregs(&src_fpu->state);
+		else
+			fpregs_deactivate(src_fpu);
 	}
 	preempt_enable();
 }


* [tip:x86/fpu] x86/fpu: Fold fpu_copy() into fpu__copy()
  2016-01-24 22:38 ` [PATCH v2 3/5] x86/fpu: Fold fpu_copy into fpu__copy Andy Lutomirski
@ 2016-02-09 16:10   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-09 16:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, luto, bp, fenghua.yu, oleg, tglx, luto,
	sai.praneeth.prakhya, riel, torvalds, dave.hansen, mingo,
	yu-cheng.yu, peterz, quentin.casasnovas

Commit-ID:  a20d7297045f7fdcd676c15243192eb0e95a4306
Gitweb:     http://git.kernel.org/tip/a20d7297045f7fdcd676c15243192eb0e95a4306
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Sun, 24 Jan 2016 14:38:08 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 9 Feb 2016 15:42:55 +0100

x86/fpu: Fold fpu_copy() into fpu__copy()

Splitting it into two functions needlessly obfuscated the code.
While we're at it, improve the comment slightly.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/3eb5a63a9c5c84077b2677a7dfe684eef96fe59e.1453675014.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/core.c | 32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7a9244d..299b58b 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -231,14 +231,15 @@ void fpstate_init(union fpregs_state *state)
 }
 EXPORT_SYMBOL_GPL(fpstate_init);
 
-/*
- * Copy the current task's FPU state to a new task's FPU context.
- *
- * In both the 'eager' and the 'lazy' case we save hardware registers
- * directly to the destination buffer.
- */
-static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
+int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 {
+	dst_fpu->counter = 0;
+	dst_fpu->fpregs_active = 0;
+	dst_fpu->last_cpu = -1;
+
+	if (!src_fpu->fpstate_active || !cpu_has_fpu)
+		return 0;
+
 	WARN_ON_FPU(src_fpu != &current->thread.fpu);
 
 	/*
@@ -251,10 +252,9 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 	/*
 	 * Save current FPU registers directly into the child
 	 * FPU context, without any memory-to-memory copying.
-	 *
-	 * If the FPU context got destroyed in the process (FNSAVE
-	 * done on old CPUs) then copy it back into the source
-	 * context and mark the current task for lazy restore.
+	 * In lazy mode, if the FPU context isn't loaded into
+	 * fpregs, CR0.TS will be set and do_device_not_available
+	 * will load the FPU context.
 	 *
 	 * We have to do all this with preemption disabled,
 	 * mostly because of the FNSAVE case, because in that
@@ -274,16 +274,6 @@ static void fpu_copy(struct fpu *dst_fpu, struct fpu *src_fpu)
 			fpregs_deactivate(src_fpu);
 	}
 	preempt_enable();
-}
-
-int fpu__copy(struct fpu *dst_fpu, struct fpu *src_fpu)
-{
-	dst_fpu->counter = 0;
-	dst_fpu->fpregs_active = 0;
-	dst_fpu->last_cpu = -1;
-
-	if (src_fpu->fpstate_active && cpu_has_fpu)
-		fpu_copy(dst_fpu, src_fpu);
 
 	return 0;
 }
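
The fold above replaces the old "call the helper if both conditions
hold" shape with an early return on the negated condition. A minimal
user-space sketch of that equivalence follows. The names do_copy(),
copy_before() and copy_after() are hypothetical stand-ins and are not
part of the patch; the two booleans model src_fpu->fpstate_active and
cpu_has_fpu.

```c
#include <stdbool.h>

/* Hypothetical counter: records how often the "copy" body runs. */
static int copies;

static void do_copy(void)
{
	copies++;
}

/* Shape before the fold: the caller guards a separate helper. */
static int copy_before(bool fpstate_active, bool has_fpu)
{
	if (fpstate_active && has_fpu)
		do_copy();

	return 0;
}

/* Shape after the fold: bail out early on the negated condition. */
static int copy_after(bool fpstate_active, bool has_fpu)
{
	if (!fpstate_active || !has_fpu)
		return 0;

	do_copy();

	return 0;
}
```

By De Morgan's law the two guards are equivalent, so both variants
run the copy body for exactly the same inputs.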


* [tip:x86/fpu] x86/fpu: Speed up lazy FPU restores slightly
  2016-01-24 22:38 ` [PATCH v2 4/5] x86/fpu: Speed up lazy FPU restores slightly Andy Lutomirski
@ 2016-02-09 16:11   ` " tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-09 16:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, fenghua.yu, torvalds, quentin.casasnovas, mingo, dave.hansen,
	luto, sai.praneeth.prakhya, linux-kernel, oleg, luto,
	yu-cheng.yu, peterz, riel, tglx, hpa

Commit-ID:  c6ab109f7e0eae3bae3bb10f8ddb0df67735c150
Gitweb:     http://git.kernel.org/tip/c6ab109f7e0eae3bae3bb10f8ddb0df67735c150
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Sun, 24 Jan 2016 14:38:09 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 9 Feb 2016 15:42:56 +0100

x86/fpu: Speed up lazy FPU restores slightly

If we have an FPU, there's no need to check CR0 for FPU emulation.
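
To illustrate the reasoning: because && short-circuits, the CR0 read
is skipped whenever the CPU reports an FPU. Below is a minimal
user-space sketch of that behaviour. model_read_cr0(), fake_cr0,
cr0_reads and needs_math_emulation() are hypothetical stand-ins, not
kernel functions; only the position of CR0.EM (bit 2) is taken from
the architecture.

```c
#include <stdbool.h>

/* CR0.EM is bit 2 of CR0. */
#define CR0_EM (1UL << 2)

/* Hypothetical stand-ins for the real control register and its reader. */
static unsigned long fake_cr0;
static int cr0_reads;

static unsigned long model_read_cr0(void)
{
	cr0_reads++;	/* count reads so the skipped access is visible */
	return fake_cr0;
}

/*
 * Same shape as the patched condition: the CR0 read on the right-hand
 * side is only evaluated when there is no FPU.
 */
static bool needs_math_emulation(bool has_fpu)
{
	return !has_fpu && (model_read_cr0() & CR0_EM);
}
```

With has_fpu set, needs_math_emulation() returns false and cr0_reads
stays unchanged; without an FPU, the result depends on the EM bit as
before.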

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/980004297e233c27066d54e71382c44cdd36ef7c.1453675014.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/traps.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 87f80fe..36a9c01 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -752,7 +752,7 @@ do_device_not_available(struct pt_regs *regs, long error_code)
 	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
 
 #ifdef CONFIG_MATH_EMULATION
-	if (read_cr0() & X86_CR0_EM) {
+	if (!boot_cpu_has(X86_FEATURE_FPU) && (read_cr0() & X86_CR0_EM)) {
 		struct math_emu_info info = { };
 
 		conditional_sti(regs);


* [tip:x86/fpu] x86/fpu: Default eagerfpu=on on all CPUs
  2016-01-24 22:38 ` [PATCH v2 5/5] x86/fpu: Default eagerfpu=on on all CPUs Andy Lutomirski
@ 2016-02-09 16:11   ` " tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-09 16:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: sai.praneeth.prakhya, quentin.casasnovas, mingo, torvalds, oleg,
	luto, hpa, fenghua.yu, tglx, linux-kernel, bp, dave.hansen, luto,
	riel, peterz, yu-cheng.yu

Commit-ID:  58122bf1d856a4ea9581d62a07c557d997d46a19
Gitweb:     http://git.kernel.org/tip/58122bf1d856a4ea9581d62a07c557d997d46a19
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Sun, 24 Jan 2016 14:38:10 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 9 Feb 2016 15:42:56 +0100

x86/fpu: Default eagerfpu=on on all CPUs

We have eager and lazy FPU modes, introduced in:

  304bceda6a18 ("x86, fpu: use non-lazy fpu restore for processors supporting xsave")

The result is rather messy.  There are two code paths in almost all
of the FPU code, and only one of them (the eager case) is tested
frequently, since most kernel developers have new enough hardware
that we use eagerfpu.

It seems that, on any remotely recent hardware, eagerfpu is a win:
glibc uses SSE2, so laziness is probably overoptimistic, and, in any
case, manipulating TS is far slower than saving and restoring the
full state.  (Stores to CR0.TS are serializing and are poorly
optimized.)

To try to shake out any latent issues on old hardware, this changes
the default to eager on all CPUs.  If no performance or functionality
problems show up, a subsequent patch could remove lazy mode entirely.
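
The resulting command-line behaviour can be modelled in user space as
follows. This is a simplified sketch, not the kernel code:
has_bool_option() is a hypothetical helper that matches an option only
as a whole space-delimited word, and parse_eagerfpu() is a
hypothetical wrapper. Only the ENABLE/DISABLE values and the
"eagerfpu=off" string come from the patch.

```c
#include <stdbool.h>
#include <string.h>

enum eagerfpu_mode { ENABLE, DISABLE };

/* True if opt appears in cmdline as a whole space-delimited word. */
static bool has_bool_option(const char *cmdline, const char *opt)
{
	size_t len = strlen(opt);
	const char *p = cmdline;

	while (*p) {
		const char *start;

		while (*p == ' ')		/* skip separators */
			p++;
		start = p;
		while (*p && *p != ' ')		/* scan one word */
			p++;
		if ((size_t)(p - start) == len && strncmp(start, opt, len) == 0)
			return true;
	}

	return false;
}

/* Eager is the default; only an explicit "eagerfpu=off" disables it. */
static enum eagerfpu_mode parse_eagerfpu(const char *cmdline)
{
	return has_bool_option(cmdline, "eagerfpu=off") ? DISABLE : ENABLE;
}
```

In this model "eagerfpu=on" and "eagerfpu=auto" need no handling of
their own, since anything other than "eagerfpu=off" leaves the default
in place.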

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: yu-cheng yu <yu-cheng.yu@intel.com>
Link: http://lkml.kernel.org/r/ac290de61bf08d9cfc2664a4f5080257ffc1075a.1453675014.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/fpu/init.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 6d9f0a7..471fe27 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -260,7 +260,10 @@ static void __init fpu__init_system_xstate_size_legacy(void)
  * not only saved the restores along the way, but we also have the
  * FPU ready to be used for the original task.
  *
- * 'eager' switching is used on modern CPUs, there we switch the FPU
+ * 'lazy' is deprecated because it's almost never a performance win
+ * and it's much more complicated than 'eager'.
+ *
+ * 'eager' switching is by default on all CPUs, there we switch the FPU
  * state during every context switch, regardless of whether the task
  * has used FPU instructions in that time slice or not. This is done
  * because modern FPU context saving instructions are able to optimize
@@ -271,7 +274,7 @@ static void __init fpu__init_system_xstate_size_legacy(void)
  *   to use 'eager' restores, if we detect that a task is using the FPU
  *   frequently. See the fpu->counter logic in fpu/internal.h for that. ]
  */
-static enum { AUTO, ENABLE, DISABLE } eagerfpu = AUTO;
+static enum { ENABLE, DISABLE } eagerfpu = ENABLE;
 
 /*
  * Find supported xfeatures based on cpu features and command-line input.
@@ -348,15 +351,9 @@ static void __init fpu__init_system_ctx_switch(void)
  */
 static void __init fpu__init_parse_early_param(void)
 {
-	/*
-	 * No need to check "eagerfpu=auto" again, since it is the
-	 * initial default.
-	 */
 	if (cmdline_find_option_bool(boot_command_line, "eagerfpu=off")) {
 		eagerfpu = DISABLE;
 		fpu__clear_eager_fpu_features();
-	} else if (cmdline_find_option_bool(boot_command_line, "eagerfpu=on")) {
-		eagerfpu = ENABLE;
 	}
 
 	if (cmdline_find_option_bool(boot_command_line, "no387"))


Thread overview: 14+ messages
2016-01-24 22:38 [PATCH v2 0/5] x86/fpu: eagerfpu fixes, speedups, and default enablement Andy Lutomirski
2016-01-24 22:38 ` [PATCH v2 1/5] x86/fpu: Fix math emulation in eager fpu mode Andy Lutomirski
2016-02-09 16:10   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
2016-01-24 22:38 ` [PATCH v2 2/5] x86/fpu: Fix FNSAVE usage in eagerfpu mode Andy Lutomirski
2016-01-25 15:40   ` Dave Hansen
2016-01-25 17:25     ` Andy Lutomirski
2016-01-25 17:26       ` Dave Hansen
2016-02-09 16:10   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
2016-01-24 22:38 ` [PATCH v2 3/5] x86/fpu: Fold fpu_copy into fpu__copy Andy Lutomirski
2016-02-09 16:10   ` [tip:x86/fpu] x86/fpu: Fold fpu_copy() into fpu__copy() tip-bot for Andy Lutomirski
2016-01-24 22:38 ` [PATCH v2 4/5] x86/fpu: Speed up lazy FPU restores slightly Andy Lutomirski
2016-02-09 16:11   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
2016-01-24 22:38 ` [PATCH v2 5/5] x86/fpu: Default eagerfpu=on on all CPUs Andy Lutomirski
2016-02-09 16:11   ` [tip:x86/fpu] " tip-bot for Andy Lutomirski
