From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933606Ab2K3Twx (ORCPT );
	Fri, 30 Nov 2012 14:52:53 -0500
Received: from mail-ea0-f202.google.com ([209.85.215.202]:34866 "EHLO
	mail-ea0-f202.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S932197Ab2K3Twv (ORCPT );
	Fri, 30 Nov 2012 14:52:51 -0500
From: Vincent Palatin
To: Ingo Molnar , "H. Peter Anvin" , linux-kernel@vger.kernel.org,
	Linus Torvalds
Cc: Thomas Gleixner , x86@kernel.org, Peter Zijlstra , Jarkko Sakkinen ,
	Vincent Palatin , Duncan Laurie , Olof Johansson
Subject: [PATCH v2] x86, fpu: avoid FPU lazy restore after suspend
Date: Fri, 30 Nov 2012 11:52:43 -0800
Message-Id: <1354305164-10601-1-git-send-email-vpalatin@chromium.org>
X-Mailer: git-send-email 1.7.7.3
In-Reply-To:
References:
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

When a CPU enters S3 state, the FPU state is lost.
After resuming from S3, if we try to lazy restore the FPU for a process
running on the same CPU, this will result in a corrupted FPU context.

Ensure that "fpu_owner_task" is properly invalidated when (re-)initializing
a CPU, so nobody will try to lazy restore a state which doesn't exist in
the hardware.

Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
by doing thousands of suspend/resume cycles with 4 processes doing FPU
operations running. Without the patch, a process is killed by a SIGFPE
after a few hundred cycles.

The issue seems to exist since 3.4 (after the FPU lazy restore was actually
implemented). To apply the change to 3.4, "this_cpu_write" needs to be
replaced by percpu_write.

Cc: Duncan Laurie
Cc: Olof Johansson
Cc: [v3.4+] # for 3.4 need to replace this_cpu_write by percpu_write
Signed-off-by: Vincent Palatin
---
 arch/x86/include/asm/fpu-internal.h |   15 +++++++++------
 arch/x86/kernel/smpboot.c           |    5 +++++
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 831dbb9..41ab26e 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -399,14 +399,17 @@ static inline void drop_init_fpu(struct task_struct *tsk)
 typedef struct { int preload; } fpu_switch_t;
 
 /*
- * FIXME! We could do a totally lazy restore, but we need to
- * add a per-cpu "this was the task that last touched the FPU
- * on this CPU" variable, and the task needs to have a "I last
- * touched the FPU on this CPU" and check them.
+ * Must be run with preemption disabled: this clears the fpu_owner_task,
+ * on this CPU.
  *
- * We don't do that yet, so "fpu_lazy_restore()" always returns
- * false, but some day..
+ * This will disable any lazy FPU state restore of the current FPU state,
+ * but if the current thread owns the FPU, it will still be saved by.
  */
+static inline void __cpu_disable_lazy_restore(unsigned int cpu)
+{
+	per_cpu(fpu_owner_task, cpu) = NULL;
+}
+
 static inline int fpu_lazy_restore(struct task_struct *new, unsigned int cpu)
 {
 	return new == this_cpu_read_stable(fpu_owner_task) &&
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c80a33b..f3e2ec8 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -68,6 +68,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -818,6 +820,9 @@ int __cpuinit native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 
 	per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
 
+	/* the FPU context is blank, nobody can own it */
+	__cpu_disable_lazy_restore(cpu);
+
 	err = do_boot_cpu(apicid, cpu, tidle);
 	if (err) {
 		pr_debug("do_boot_cpu failed %d\n", err);
-- 
1.7.7.3
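
For readers who want to see the failure mode outside the kernel, here is a
rough user-space sketch of the lazy restore check and of what clearing the
per-CPU owner changes. It is not kernel code: every sim_* name, the structs,
and the single-integer "register file" are hypothetical stand-ins, and the
last_cpu comparison is an assumption about the part of fpu_lazy_restore()
that the diff context cuts off. Only the "new == owner" test and the
"owner = NULL" invalidation mirror what the patch shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_NR_CPUS 4

/* Hypothetical stand-in for a task: only what the sketch needs. */
struct sim_task {
	int saved_fpu;	/* FPU state saved in memory */
	int last_cpu;	/* CPU whose registers last held this state, or -1 */
};

/* Hypothetical stand-in for per-CPU data. */
struct sim_cpu {
	int fpu_regs;				/* "hardware" FPU registers */
	struct sim_task *fpu_owner_task;	/* task whose state is live in fpu_regs */
};

static struct sim_cpu sim_cpus[SIM_NR_CPUS];

/* Loosely modeled on fpu_lazy_restore(): may the reload be skipped? */
static bool sim_fpu_lazy_restore(const struct sim_task *new, unsigned int cpu)
{
	return new == sim_cpus[cpu].fpu_owner_task &&
		(int)cpu == new->last_cpu;
}

/* Schedule a task in; returns true if the reload from memory was skipped. */
static bool sim_switch_in(struct sim_task *t, unsigned int cpu)
{
	assert(cpu < SIM_NR_CPUS);
	bool lazy = sim_fpu_lazy_restore(t, cpu);

	if (!lazy)
		sim_cpus[cpu].fpu_regs = t->saved_fpu;
	sim_cpus[cpu].fpu_owner_task = t;
	t->last_cpu = (int)cpu;
	return lazy;
}

/* Schedule a task out: save its state, but leave ownership in place. */
static void sim_switch_out(struct sim_task *t, unsigned int cpu)
{
	assert(cpu < SIM_NR_CPUS);
	t->saved_fpu = sim_cpus[cpu].fpu_regs;
}

/*
 * Model an S3 cycle: the register contents are lost. With
 * invalidate_owner set, also clear the owner, as the patch does
 * through __cpu_disable_lazy_restore().
 */
static void sim_suspend_resume(unsigned int cpu, bool invalidate_owner)
{
	assert(cpu < SIM_NR_CPUS);
	sim_cpus[cpu].fpu_regs = 0;
	if (invalidate_owner)
		sim_cpus[cpu].fpu_owner_task = NULL;
}

/* Run one task across a suspend and return the register value it sees. */
static int sim_run_across_suspend(bool invalidate_owner)
{
	struct sim_task t = { .saved_fpu = 42, .last_cpu = -1 };

	sim_cpus[0] = (struct sim_cpu){ 0 };
	sim_switch_in(&t, 0);
	sim_switch_out(&t, 0);
	sim_suspend_resume(0, invalidate_owner);
	sim_switch_in(&t, 0);
	return sim_cpus[0].fpu_regs;
}
```

In this model, sim_run_across_suspend(false) leaves the task looking at the
wiped register value, because the stale owner pointer makes the lazy check
succeed. sim_run_across_suspend(true) forces a reload from the saved copy,
which is the behavior the patch aims for.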