* Re: [PATCH] x86, fpu: avoid FPU lazy restore after suspend
2012-11-30 18:52 ` [PATCH] x86, fpu: avoid FPU lazy restore after suspend Vincent Palatin
@ 2012-11-30 18:57 ` H. Peter Anvin
2012-11-30 19:25 ` Linus Torvalds
2012-11-30 19:26 ` tip-bot for Vincent Palatin
2 siblings, 0 replies; 13+ messages in thread
From: H. Peter Anvin @ 2012-11-30 18:57 UTC (permalink / raw)
To: Vincent Palatin
Cc: Ingo Molnar, linux-kernel, Linus Torvalds, Thomas Gleixner, x86,
Peter Zijlstra, Jarkko Sakkinen, Duncan Laurie, Olof Johansson
On 11/30/2012 10:52 AM, Vincent Palatin wrote:
> When a cpu enters S3 state, the FPU state is lost.
> After resuming for S3, if we try to lazy restore the FPU for a process running
> on the same CPU, this will result in a corrupted FPU context.
>
> We can just invalidate the "fpu_owner_task", so nobody will try to
> lazy restore a state which no longer exists in the hardware.
>
> Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
> by doing thousands of suspend/resume cycles with 4 processes doing FPU
> operations running. Without the patch, a process is killed after a
> few hundreds cycles by a SIGFPE.
>
> The issue seems to exist since 3.4 (after the FPU lazy restore was actually implemented),
> to apply the change to 3.4, "this_cpu_write" needs to be replaced by percpu_write.
>
> Cc: Duncan Laurie <dlaurie@chromium.org>
> Cc: Olof Johansson <olofj@chromium.org>
> Cc: <stable@kernel.org> [v3.4+] # for 3.4 need to replace this_cpu_write by percpu_write
> Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Ouch! Thank you for catching this!
-hpa
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86, fpu: avoid FPU lazy restore after suspend
2012-11-30 18:52 ` [PATCH] x86, fpu: avoid FPU lazy restore after suspend Vincent Palatin
2012-11-30 18:57 ` H. Peter Anvin
@ 2012-11-30 19:25 ` Linus Torvalds
2012-11-30 19:38 ` H. Peter Anvin
2012-11-30 19:26 ` tip-bot for Vincent Palatin
2 siblings, 1 reply; 13+ messages in thread
From: Linus Torvalds @ 2012-11-30 19:25 UTC (permalink / raw)
To: Vincent Palatin
Cc: Ingo Molnar, H. Peter Anvin, Linux Kernel Mailing List,
Thomas Gleixner, the arch/x86 maintainers, Peter Zijlstra,
Jarkko Sakkinen, Duncan Laurie, Olof Johansson
On Fri, Nov 30, 2012 at 10:52 AM, Vincent Palatin <vpalatin@chromium.org> wrote:
> When a cpu enters S3 state, the FPU state is lost.
> After resuming for S3, if we try to lazy restore the FPU for a process running
> on the same CPU, this will result in a corrupted FPU context.
Good catch, and I think the patch is technically correct, but:
> We can just invalidate the "fpu_owner_task", so nobody will try to
> lazy restore a state which no longer exists in the hardware.
I think this works well, but is a tiny bit subtle.
And what I _don't_ necessarily think is the right thing to do is to
only comment on it in the commit message, because it means that later
generations will not necessarily ever notice. And it does result in a
new combination that I don't think we've had before: if the current
task is the owner, we now have "tsk->thread.fpu.has_fpu = 1" but with
"fpu_owner_task" being NULL.
Which gets us the semantics we want (we will save the current CPU info
when switching away, but we will not restore when switching back), but
my gut feel is that we really want to comment on that exact thing. And
possibly even make a helper function for this in <sm/fpu-internal.h>,
something like
/*
* Must be run with preemption disabled: this clears the fpu_owner_task,
* on this CPU.
*
* This will disable any lazy FPU state restore of the current FPU state,
* but if the current thread owns the FPU, it will still be saved by.
*/
static inline void __cpu_disable_lazy_restore(void)
{
this_cpu_write(fpu_owner_task, NULL);
}
and in fact I think the right place to do this *might* be in
"native_cpu_die()" instead, at which point it would actually be
something like
per_cpu(fpu_owner_task, cpu) = NULL;
*after* the CPU is dead, so that nothing ever can actually see the
state where a process is still running on the CPU and might possibly
use the FPU.
I dunno. I think doing it after really killing the CPU (ie in the
native_cpu_die() function) might be easier to think about, but I don't
really hate your patch either (it does make me go "ok, we need to
guarantee no scheduling or FP use after" - which is probably true, but
it's still some non-local thing). Either way, a comment about it and
abstracting whatever the invalidation sequence is in fpu-internal.h
sounds like a good idea.
Hmm?
Linus
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86, fpu: avoid FPU lazy restore after suspend
2012-11-30 19:25 ` Linus Torvalds
@ 2012-11-30 19:38 ` H. Peter Anvin
2012-11-30 19:41 ` Linus Torvalds
0 siblings, 1 reply; 13+ messages in thread
From: H. Peter Anvin @ 2012-11-30 19:38 UTC (permalink / raw)
To: Linus Torvalds
Cc: Vincent Palatin, Ingo Molnar, Linux Kernel Mailing List,
Thomas Gleixner, the arch/x86 maintainers, Peter Zijlstra,
Jarkko Sakkinen, Duncan Laurie, Olof Johansson
On 11/30/2012 11:25 AM, Linus Torvalds wrote:
>
> and in fact I think the right place to do this *might* be in
> "native_cpu_die()" instead, at which point it would actually be
> something like
>
> per_cpu(fpu_owner_task, cpu) = NULL;
>
> *after* the CPU is dead, so that nothing ever can actually see the
> state where a process is still running on the CPU and might possibly
> use the FPU.
>
> I dunno. I think doing it after really killing the CPU (ie in the
> native_cpu_die() function) might be easier to think about, but I don't
> really hate your patch either (it does make me go "ok, we need to
> guarantee no scheduling or FP use after" - which is probably true, but
> it's still some non-local thing). Either way, a comment about it and
> abstracting whatever the invalidation sequence is in fpu-internal.h
> sounds like a good idea.
>
Hmm... from my point of view it would almost seem saner to do this on
the way *up*... as part of CPU (re-)initialization. After all, the
"nothing is currently running on this CPU" is part of the initial state
of the CPU, regardless of if we have ever been online before or not.
-hpa
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86, fpu: avoid FPU lazy restore after suspend
2012-11-30 19:38 ` H. Peter Anvin
@ 2012-11-30 19:41 ` Linus Torvalds
2012-11-30 19:51 ` H. Peter Anvin
2012-11-30 19:52 ` [PATCH v2] " Vincent Palatin
0 siblings, 2 replies; 13+ messages in thread
From: Linus Torvalds @ 2012-11-30 19:41 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Vincent Palatin, Ingo Molnar, Linux Kernel Mailing List,
Thomas Gleixner, the arch/x86 maintainers, Peter Zijlstra,
Jarkko Sakkinen, Duncan Laurie, Olof Johansson
On Fri, Nov 30, 2012 at 11:38 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 11/30/2012 11:25 AM, Linus Torvalds wrote:
>>
>> and in fact I think the right place to do this *might* be in
>> "native_cpu_die()" instead, at which point it would actually be
>> something like
>>
>> per_cpu(fpu_owner_task, cpu) = NULL;
>>
>> *after* the CPU is dead, so that nothing ever can actually see the
>> state where a process is still running on the CPU and might possibly
>> use the FPU.
>>
>> I dunno. I think doing it after really killing the CPU (ie in the
>> native_cpu_die() function) might be easier to think about, but I don't
>> really hate your patch either (it does make me go "ok, we need to
>> guarantee no scheduling or FP use after" - which is probably true, but
>> it's still some non-local thing). Either way, a comment about it and
>> abstracting whatever the invalidation sequence is in fpu-internal.h
>> sounds like a good idea.
>>
>
> Hmm... from my point of view it would almost seem saner to do this on
> the way *up*... as part of CPU (re-)initialization. After all, the
> "nothing is currently running on this CPU" is part of the initial state
> of the CPU, regardless of if we have ever been online before or not.
That works for me too, and has the similar advantage of being easier
to think about than "the cpu is actually still being used, and we're
playing tricks with the FPU state cache" approach.
Regardless, much credit to Vincent (and presumably others on the
chromium team) for finding and debugging this.
Linus
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] x86, fpu: avoid FPU lazy restore after suspend
2012-11-30 19:41 ` Linus Torvalds
@ 2012-11-30 19:51 ` H. Peter Anvin
[not found] ` <CAP_ceTxmMhQeDi=x9HmYke85hKMg3_YhbXSnfDC12rOcocQJpA@mail.gmail.com>
2012-11-30 19:52 ` [PATCH v2] " Vincent Palatin
1 sibling, 1 reply; 13+ messages in thread
From: H. Peter Anvin @ 2012-11-30 19:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Vincent Palatin, Ingo Molnar, Linux Kernel Mailing List,
Thomas Gleixner, the arch/x86 maintainers, Peter Zijlstra,
Jarkko Sakkinen, Duncan Laurie, Olof Johansson
On 11/30/2012 11:41 AM, Linus Torvalds wrote:
>>
>> Hmm... from my point of view it would almost seem saner to do this on
>> the way *up*... as part of CPU (re-)initialization. After all, the
>> "nothing is currently running on this CPU" is part of the initial state
>> of the CPU, regardless of if we have ever been online before or not.
>
> That works for me too, and has the similar advantage of being easier
> to think about than "the cpu is actually still being used, and we're
> playing tricks with the FPU state cache" approach.
>
> Regardless, much credit to Vincent (and presumably others on the
> chromium team) for finding and debugging this.
>
OK, how do you want to handle this, given that 3.7 is presumably
imminent? We can apply Vincent's patch now, and then restructure it
later, or we can go for the gold now, but replicating the testing
Vincent has done will probably take well into next week.
-hpa
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2] x86, fpu: avoid FPU lazy restore after suspend
2012-11-30 19:41 ` Linus Torvalds
2012-11-30 19:51 ` H. Peter Anvin
@ 2012-11-30 19:52 ` Vincent Palatin
2012-11-30 20:15 ` [PATCH v3] " Vincent Palatin
1 sibling, 1 reply; 13+ messages in thread
From: Vincent Palatin @ 2012-11-30 19:52 UTC (permalink / raw)
To: Ingo Molnar, H. Peter Anvin, linux-kernel, Linus Torvalds
Cc: Thomas Gleixner, x86, Peter Zijlstra, Jarkko Sakkinen,
Vincent Palatin, Duncan Laurie, Olof Johansson
When a cpu enters S3 state, the FPU state is lost.
After resuming for S3, if we try to lazy restore the FPU for a process running
on the same CPU, this will result in a corrupted FPU context.
Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU,
so nobody will try to lazy restore a state which doesn't exist in the hardware.
Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
by doing thousands of suspend/resume cycles with 4 processes doing FPU
operations running. Without the patch, a process is killed after a
few hundreds cycles by a SIGFPE.
The issue seems to exist since 3.4 (after the FPU lazy restore was actually implemented),
to apply the change to 3.4, "this_cpu_write" needs to be replaced by percpu_write.
Cc: Duncan Laurie <dlaurie@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Cc: <stable@kernel.org> [v3.4+] # for 3.4 need to replace this_cpu_write by percpu_write
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
---
arch/x86/include/asm/fpu-internal.h | 15 +++++++++------
arch/x86/kernel/smpboot.c | 5 +++++
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 831dbb9..41ab26e 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -399,14 +399,17 @@ static inline void drop_init_fpu(struct task_struct *tsk)
typedef struct { int preload; } fpu_switch_t;
/*
- * FIXME! We could do a totally lazy restore, but we need to
- * add a per-cpu "this was the task that last touched the FPU
- * on this CPU" variable, and the task needs to have a "I last
- * touched the FPU on this CPU" and check them.
+ * Must be run with preemption disabled: this clears the fpu_owner_task,
+ * on this CPU.
*
- * We don't do that yet, so "fpu_lazy_restore()" always returns
- * false, but some day..
+ * This will disable any lazy FPU state restore of the current FPU state,
+ * but if the current thread owns the FPU, it will still be saved by.
*/
+static inline void __cpu_disable_lazy_restore(unsigned int cpu)
+{
+ per_cpu(fpu_owner_task, cpu) = NULL;
+}
+
static inline int fpu_lazy_restore(struct task_struct *new, unsigned int cpu)
{
return new == this_cpu_read_stable(fpu_owner_task) &&
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c80a33b..f3e2ec8 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -68,6 +68,8 @@
#include <asm/mwait.h>
#include <asm/apic.h>
#include <asm/io_apic.h>
+#include <asm/i387.h>
+#include <asm/fpu-internal.h>
#include <asm/setup.h>
#include <asm/uv/uv.h>
#include <linux/mc146818rtc.h>
@@ -818,6 +820,9 @@ int __cpuinit native_cpu_up(unsigned int cpu, struct task_struct *tidle)
per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
+ /* the FPU context is blank, nobody can own it */
+ __cpu_disable_lazy_restore(cpu);
+
err = do_boot_cpu(apicid, cpu, tidle);
if (err) {
pr_debug("do_boot_cpu failed %d\n", err);
--
1.7.7.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3] x86, fpu: avoid FPU lazy restore after suspend
2012-11-30 19:52 ` [PATCH v2] " Vincent Palatin
@ 2012-11-30 20:15 ` Vincent Palatin
2012-11-30 22:10 ` [tip:x86/urgent] x86, fpu: Avoid " tip-bot for Vincent Palatin
0 siblings, 1 reply; 13+ messages in thread
From: Vincent Palatin @ 2012-11-30 20:15 UTC (permalink / raw)
To: Ingo Molnar, H. Peter Anvin, linux-kernel, Linus Torvalds
Cc: Thomas Gleixner, x86, Peter Zijlstra, Jarkko Sakkinen,
Vincent Palatin, Duncan Laurie, Olof Johansson
When a cpu enters S3 state, the FPU state is lost.
After resuming for S3, if we try to lazy restore the FPU for a process running
on the same CPU, this will result in a corrupted FPU context.
Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU,
so nobody will try to lazy restore a state which doesn't exist in the hardware.
Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
by doing thousands of suspend/resume cycles with 4 processes doing FPU
operations running. Without the patch, a process is killed after a
few hundreds cycles by a SIGFPE.
Cc: Duncan Laurie <dlaurie@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Cc: <stable@kernel.org> [v3.4+] # for 3.4 need to replace this_cpu_write by percpu_write
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
---
Hi,
The patch updated according the HPA and Linus comments.
I'm still re-running the testing on v3.
Change in v3:
- remove misleading comment about 3.4 in the description.
Change in v2:
- add an helper function and comment in fpu-internal.h as described by Linus
- do the cleaning in the native_cpu_up function as suggested by HPA
Vincent
arch/x86/include/asm/fpu-internal.h | 15 +++++++++------
arch/x86/kernel/smpboot.c | 5 +++++
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 831dbb9..41ab26e 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -399,14 +399,17 @@ static inline void drop_init_fpu(struct task_struct *tsk)
typedef struct { int preload; } fpu_switch_t;
/*
- * FIXME! We could do a totally lazy restore, but we need to
- * add a per-cpu "this was the task that last touched the FPU
- * on this CPU" variable, and the task needs to have a "I last
- * touched the FPU on this CPU" and check them.
+ * Must be run with preemption disabled: this clears the fpu_owner_task,
+ * on this CPU.
*
- * We don't do that yet, so "fpu_lazy_restore()" always returns
- * false, but some day..
+ * This will disable any lazy FPU state restore of the current FPU state,
+ * but if the current thread owns the FPU, it will still be saved by.
*/
+static inline void __cpu_disable_lazy_restore(unsigned int cpu)
+{
+ per_cpu(fpu_owner_task, cpu) = NULL;
+}
+
static inline int fpu_lazy_restore(struct task_struct *new, unsigned int cpu)
{
return new == this_cpu_read_stable(fpu_owner_task) &&
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c80a33b..f3e2ec8 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -68,6 +68,8 @@
#include <asm/mwait.h>
#include <asm/apic.h>
#include <asm/io_apic.h>
+#include <asm/i387.h>
+#include <asm/fpu-internal.h>
#include <asm/setup.h>
#include <asm/uv/uv.h>
#include <linux/mc146818rtc.h>
@@ -818,6 +820,9 @@ int __cpuinit native_cpu_up(unsigned int cpu, struct task_struct *tidle)
per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
+ /* the FPU context is blank, nobody can own it */
+ __cpu_disable_lazy_restore(cpu);
+
err = do_boot_cpu(apicid, cpu, tidle);
if (err) {
pr_debug("do_boot_cpu failed %d\n", err);
--
1.7.7.3
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip:x86/urgent] x86, fpu: Avoid FPU lazy restore after suspend
2012-11-30 20:15 ` [PATCH v3] " Vincent Palatin
@ 2012-11-30 22:10 ` tip-bot for Vincent Palatin
0 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Vincent Palatin @ 2012-11-30 22:10 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, olofj, dlaurie, vpalatin, tglx, hpa
Commit-ID: 644c154186386bb1fa6446bc5e037b9ed098db46
Gitweb: http://git.kernel.org/tip/644c154186386bb1fa6446bc5e037b9ed098db46
Author: Vincent Palatin <vpalatin@chromium.org>
AuthorDate: Fri, 30 Nov 2012 12:15:32 -0800
Committer: H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 30 Nov 2012 13:48:05 -0800
x86, fpu: Avoid FPU lazy restore after suspend
When a cpu enters S3 state, the FPU state is lost.
After resuming for S3, if we try to lazy restore the FPU for a process running
on the same CPU, this will result in a corrupted FPU context.
Ensure that "fpu_owner_task" is properly invalided when (re-)initializing a CPU,
so nobody will try to lazy restore a state which doesn't exist in the hardware.
Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
by doing thousands of suspend/resume cycles with 4 processes doing FPU
operations running. Without the patch, a process is killed after a
few hundreds cycles by a SIGFPE.
Cc: Duncan Laurie <dlaurie@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Cc: <stable@kernel.org> v3.4+ # for 3.4 need to replace this_cpu_write by percpu_write
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Link: http://lkml.kernel.org/r/1354306532-1014-1-git-send-email-vpalatin@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
arch/x86/include/asm/fpu-internal.h | 15 +++++++++------
arch/x86/kernel/smpboot.c | 5 +++++
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 831dbb9..41ab26e 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -399,14 +399,17 @@ static inline void drop_init_fpu(struct task_struct *tsk)
typedef struct { int preload; } fpu_switch_t;
/*
- * FIXME! We could do a totally lazy restore, but we need to
- * add a per-cpu "this was the task that last touched the FPU
- * on this CPU" variable, and the task needs to have a "I last
- * touched the FPU on this CPU" and check them.
+ * Must be run with preemption disabled: this clears the fpu_owner_task,
+ * on this CPU.
*
- * We don't do that yet, so "fpu_lazy_restore()" always returns
- * false, but some day..
+ * This will disable any lazy FPU state restore of the current FPU state,
+ * but if the current thread owns the FPU, it will still be saved by.
*/
+static inline void __cpu_disable_lazy_restore(unsigned int cpu)
+{
+ per_cpu(fpu_owner_task, cpu) = NULL;
+}
+
static inline int fpu_lazy_restore(struct task_struct *new, unsigned int cpu)
{
return new == this_cpu_read_stable(fpu_owner_task) &&
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c80a33b..f3e2ec8 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -68,6 +68,8 @@
#include <asm/mwait.h>
#include <asm/apic.h>
#include <asm/io_apic.h>
+#include <asm/i387.h>
+#include <asm/fpu-internal.h>
#include <asm/setup.h>
#include <asm/uv/uv.h>
#include <linux/mc146818rtc.h>
@@ -818,6 +820,9 @@ int __cpuinit native_cpu_up(unsigned int cpu, struct task_struct *tidle)
per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
+ /* the FPU context is blank, nobody can own it */
+ __cpu_disable_lazy_restore(cpu);
+
err = do_boot_cpu(apicid, cpu, tidle);
if (err) {
pr_debug("do_boot_cpu failed %d\n", err);
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [tip:x86/urgent] x86, fpu: Avoid FPU lazy restore after suspend
2012-11-30 18:52 ` [PATCH] x86, fpu: avoid FPU lazy restore after suspend Vincent Palatin
2012-11-30 18:57 ` H. Peter Anvin
2012-11-30 19:25 ` Linus Torvalds
@ 2012-11-30 19:26 ` tip-bot for Vincent Palatin
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Vincent Palatin @ 2012-11-30 19:26 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, olofj, dlaurie, vpalatin, tglx, hpa
Commit-ID: c9370d1039848a813a63c3fd44b6e4833a78246a
Gitweb: http://git.kernel.org/tip/c9370d1039848a813a63c3fd44b6e4833a78246a
Author: Vincent Palatin <vpalatin@chromium.org>
AuthorDate: Fri, 30 Nov 2012 10:52:03 -0800
Committer: H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 30 Nov 2012 10:58:14 -0800
x86, fpu: Avoid FPU lazy restore after suspend
When a cpu enters S3 state, the FPU state is lost.
After resuming for S3, if we try to lazy restore the FPU for a process running
on the same CPU, this will result in a corrupted FPU context.
We can just invalidate the "fpu_owner_task", so nobody will try to
lazy restore a state which no longer exists in the hardware.
Tested with a 64-bit kernel on a 4-core Ivybridge CPU with eagerfpu=off,
by doing thousands of suspend/resume cycles with 4 processes doing FPU
operations running. Without the patch, a process is killed after a
few hundreds cycles by a SIGFPE.
The issue seems to exist since 3.4 (after the FPU lazy restore was actually implemented),
to apply the change to 3.4, "this_cpu_write" needs to be replaced by percpu_write.
Cc: Duncan Laurie <dlaurie@chromium.org>
Cc: Olof Johansson <olofj@chromium.org>
Cc: <stable@kernel.org> [v3.4+] # for 3.4 need to replace this_cpu_write by percpu_write
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Link: http://lkml.kernel.org/r/1354301523-5252-2-git-send-email-vpalatin@chromium.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
arch/x86/kernel/smpboot.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index c80a33b..7610c58 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -68,6 +68,8 @@
#include <asm/mwait.h>
#include <asm/apic.h>
#include <asm/io_apic.h>
+#include <asm/i387.h>
+#include <asm/fpu-internal.h>
#include <asm/setup.h>
#include <asm/uv/uv.h>
#include <linux/mc146818rtc.h>
@@ -1230,6 +1232,9 @@ int native_cpu_disable(void)
clear_local_APIC();
cpu_disable_common();
+
+ /* the FPU context will be lost, nobody owns it */
+ this_cpu_write(fpu_owner_task, NULL);
return 0;
}
^ permalink raw reply related [flat|nested] 13+ messages in thread