From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755320AbcAXWjG (ORCPT ); Sun, 24 Jan 2016 17:39:06 -0500 Received: from mail.kernel.org ([198.145.29.136]:41843 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755438AbcAXWiW (ORCPT ); Sun, 24 Jan 2016 17:38:22 -0500 From: Andy Lutomirski To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Borislav Petkov , Fenghua Yu , Oleg Nesterov , Peter Zijlstra , Sai Praneeth Prakhya , yu-cheng yu , Dave Hansen , Rik van Riel , Linus Torvalds , Andy Lutomirski Subject: [PATCH v2 5/5] x86/fpu: Default eagerfpu=on on all CPUs Date: Sun, 24 Jan 2016 14:38:10 -0800 Message-Id: X-Mailer: git-send-email 2.5.0 In-Reply-To: References: In-Reply-To: References: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org We have eager and lazy fpu modes, introduced in 304bceda6a18 ("x86, fpu: use non-lazy fpu restore for processors supporting xsave"). The result is rather messy. There are two code paths in almost all of the FPU code, and only one of them (the eager case) is tested frequently, since most kernel developers have new enough hardware that we use eagerfpu. It seems that, on any remotely recent hardware, eagerfpu is a win: glibc uses SSE2, so laziness is probably overoptimistic, and, in any case, manipulating TS is far slower that saving and restoring the full state. (Stores to CR0.TS are serializing and are poorly optimized.) To try to shake out any latent issues on old hardware, this changes the default to eager on all CPUs. If no performance or functionality problems show up, a subsequent patch could remove lazy mode entirely. Signed-off-by: Andy Lutomirski --- arch/x86/kernel/fpu/init.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c index d53ab3d3b8e8..e12cc0ad368e 100644 --- a/arch/x86/kernel/fpu/init.c +++ b/arch/x86/kernel/fpu/init.c @@ -262,7 +262,10 @@ static void __init fpu__init_system_xstate_size_legacy(void) * not only saved the restores along the way, but we also have the * FPU ready to be used for the original task. * - * 'eager' switching is used on modern CPUs, there we switch the FPU + * 'lazy' is deprecated because it's almost never a performance win + * and it's much more complicated than 'eager'. + * + * 'eager' switching is by default on all CPUs, there we switch the FPU * state during every context switch, regardless of whether the task * has used FPU instructions in that time slice or not. This is done * because modern FPU context saving instructions are able to optimize @@ -273,7 +276,7 @@ static void __init fpu__init_system_xstate_size_legacy(void) * to use 'eager' restores, if we detect that a task is using the FPU * frequently. See the fpu->counter logic in fpu/internal.h for that. ] */ -static enum { AUTO, ENABLE, DISABLE } eagerfpu = AUTO; +static enum { ENABLE, DISABLE } eagerfpu = ENABLE; /* * Find supported xfeatures based on cpu features and command-line input. @@ -350,15 +353,9 @@ static void __init fpu__init_system_ctx_switch(void) */ static void __init fpu__init_parse_early_param(void) { - /* - * No need to check "eagerfpu=auto" again, since it is the - * initial default. - */ if (cmdline_find_option_bool(boot_command_line, "eagerfpu=off")) { eagerfpu = DISABLE; fpu__clear_eager_fpu_features(); - } else if (cmdline_find_option_bool(boot_command_line, "eagerfpu=on")) { - eagerfpu = ENABLE; } if (cmdline_find_option_bool(boot_command_line, "no387")) -- 2.5.0