* WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
@ 2016-02-11 19:27 Borislav Petkov
2016-02-11 23:47 ` Andy Lutomirski
0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2016-02-11 19:27 UTC (permalink / raw)
To: Andy Lutomirski; +Cc: x86-ml, lkml
Hey Andy,
can you make any sense of this:
[ 90.573923] ------------[ cut here ]------------
[ 90.574977] WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
[ 90.576108] Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586 snd_hda_codec sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core i2c_i801 cfg80211 lpc_ich mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd battery soundcore led_class ac thermal
[ 90.580570] CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1
[ 90.581380] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
[ 90.582325] 00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb
[ 90.583179] c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0
[ 90.584142] f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4
[ 90.585002] Call Trace:
[ 90.585854] [<c12cce56>] dump_stack+0x5f/0x89
[ 90.586703] [<c10567fb>] warn_slowpath_common+0x8b/0xc0
[ 90.587559] [<c1025ab0>] ? fpu__restore+0x90/0x130
[ 90.588520] [<c1025ab0>] ? fpu__restore+0x90/0x130
[ 90.589353] [<c10568d2>] warn_slowpath_null+0x22/0x30
[ 90.590175] [<c1025ab0>] fpu__restore+0x90/0x130
[ 90.590993] [<c1027098>] __fpu__restore_sig+0x268/0x4c0
[ 90.591816] [<c102751f>] fpu__restore_sig+0x2f/0x50
[ 90.592636] [<c101a6c9>] restore_sigcontext+0xe9/0x110
[ 90.593449] [<c101af3c>] sys_sigreturn+0x9c/0xb0
[ 90.594263] [<c1001bd9>] do_syscall_32_irqs_on+0x59/0x340
[ 90.595079] [<c169979d>] entry_INT80_32+0x31/0x31
[ 90.595922] ---[ end trace be617bef2982f473 ]---
This is rc3 + latest tip/master and it happened when I did "make
mrproper" in the kernel repo.
>From a quick stare, it looks to me we're running do_syscall_32_irqs_on()
with IRQs on, sys_sigreturn() does current_pt_regs() but
__fpu__restore_sig() derefs current again and could be that that second
"current" is another current which already has ->fpregs_active set ?
FPU + signal handling code in a single backtrace. My favourite!
:-\
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-11 19:27 WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() Borislav Petkov @ 2016-02-11 23:47 ` Andy Lutomirski 2016-02-12 1:16 ` Andy Lutomirski 2016-02-12 11:17 ` Borislav Petkov 0 siblings, 2 replies; 15+ messages in thread From: Andy Lutomirski @ 2016-02-11 23:47 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86-ml, lkml On Thu, Feb 11, 2016 at 11:27 AM, Borislav Petkov <bp@alien8.de> wrote: > Hey Andy, > > can you make any sense of this: > > [ 90.573923] ------------[ cut here ]------------ > [ 90.574977] WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() > [ 90.576108] Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586 snd_hda_codec sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core i2c_i801 cfg80211 lpc_ich mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd battery soundcore led_class ac thermal > [ 90.580570] CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1 > [ 90.581380] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012 > [ 90.582325] 00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb > [ 90.583179] c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0 > [ 90.584142] f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4 > [ 90.585002] Call Trace: > [ 90.585854] [<c12cce56>] dump_stack+0x5f/0x89 > [ 90.586703] [<c10567fb>] warn_slowpath_common+0x8b/0xc0 > [ 90.587559] [<c1025ab0>] ? fpu__restore+0x90/0x130 > [ 90.588520] [<c1025ab0>] ? fpu__restore+0x90/0x130 > [ 90.589353] [<c10568d2>] warn_slowpath_null+0x22/0x30 > [ 90.590175] [<c1025ab0>] fpu__restore+0x90/0x130 > [ 90.590993] [<c1027098>] __fpu__restore_sig+0x268/0x4c0 > [ 90.591816] [<c102751f>] fpu__restore_sig+0x2f/0x50 > [ 90.592636] [<c101a6c9>] restore_sigcontext+0xe9/0x110 > [ 90.593449] [<c101af3c>] sys_sigreturn+0x9c/0xb0 > [ 90.594263] [<c1001bd9>] do_syscall_32_irqs_on+0x59/0x340 > [ 90.595079] [<c169979d>] entry_INT80_32+0x31/0x31 > [ 90.595922] ---[ end trace be617bef2982f473 ]--- > > This is rc3 + latest tip/master and it happened when I did "make > mrproper" in the kernel repo. > > From a quick stare, it looks to me we're running do_syscall_32_irqs_on() > with IRQs on, sys_sigreturn() does current_pt_regs() but > __fpu__restore_sig() derefs current again and could be that that second > "current" is another current which already has ->fpregs_active set ? > > FPU + signal handling code in a single backtrace. My favourite! > > :-\ Ugh. Can you send all the fpu info that the kernel prints really early when it boots? --Andy ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-11 23:47 ` Andy Lutomirski @ 2016-02-12 1:16 ` Andy Lutomirski 2016-02-12 17:00 ` Borislav Petkov 2016-02-15 19:05 ` Borislav Petkov 2016-02-12 11:17 ` Borislav Petkov 1 sibling, 2 replies; 15+ messages in thread From: Andy Lutomirski @ 2016-02-12 1:16 UTC (permalink / raw) To: Borislav Petkov; +Cc: x86-ml, lkml On Thu, Feb 11, 2016 at 3:47 PM, Andy Lutomirski <luto@amacapital.net> wrote: > On Thu, Feb 11, 2016 at 11:27 AM, Borislav Petkov <bp@alien8.de> wrote: >> Hey Andy, >> >> can you make any sense of this: >> >> [ 90.573923] ------------[ cut here ]------------ >> [ 90.574977] WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() >> [ 90.576108] Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586 snd_hda_codec sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core i2c_i801 cfg80211 lpc_ich mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd battery soundcore led_class ac thermal >> [ 90.580570] CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1 >> [ 90.581380] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012 >> [ 90.582325] 00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb >> [ 90.583179] c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0 >> [ 90.584142] f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4 >> [ 90.585002] Call Trace: >> [ 90.585854] [<c12cce56>] dump_stack+0x5f/0x89 >> [ 90.586703] [<c10567fb>] warn_slowpath_common+0x8b/0xc0 >> [ 90.587559] [<c1025ab0>] ? fpu__restore+0x90/0x130 >> [ 90.588520] [<c1025ab0>] ? fpu__restore+0x90/0x130 >> [ 90.589353] [<c10568d2>] warn_slowpath_null+0x22/0x30 >> [ 90.590175] [<c1025ab0>] fpu__restore+0x90/0x130 >> [ 90.590993] [<c1027098>] __fpu__restore_sig+0x268/0x4c0 >> [ 90.591816] [<c102751f>] fpu__restore_sig+0x2f/0x50 >> [ 90.592636] [<c101a6c9>] restore_sigcontext+0xe9/0x110 >> [ 90.593449] [<c101af3c>] sys_sigreturn+0x9c/0xb0 >> [ 90.594263] [<c1001bd9>] do_syscall_32_irqs_on+0x59/0x340 >> [ 90.595079] [<c169979d>] entry_INT80_32+0x31/0x31 >> [ 90.595922] ---[ end trace be617bef2982f473 ]--- >> >> This is rc3 + latest tip/master and it happened when I did "make >> mrproper" in the kernel repo. >> >> From a quick stare, it looks to me we're running do_syscall_32_irqs_on() >> with IRQs on, sys_sigreturn() does current_pt_regs() but >> __fpu__restore_sig() derefs current again and could be that that second >> "current" is another current which already has ->fpregs_active set ? >> >> FPU + signal handling code in a single backtrace. My favourite! >> >> :-\ > > Ugh. > > Can you send all the fpu info that the kernel prints really early when it boots? > Are you running 32-bit userspace by any chance? I'm guessing you're hitting this in __fpu_restore_sig: fpu__drop(fpu); if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) || __copy_from_user(&env, buf, sizeof(env))) { fpstate_init(&fpu->state); err = -1; } else { sanitize_restored_xstate(tsk, &env, xfeatures, fx_only); } fpu->fpstate_active = 1; <-- preempted right here if (use_eager_fpu()) { preempt_disable(); fpu__restore(fpu); preempt_enable(); } I don't see why this code deserves to work. If I'm right, it can be fixed by pulling the preempt_disable out of the if (use_eager_fpu()) to right above the fpstate_active = 1 line. Don't bother trying to optimize the !use_eager_fpu() case. Once someone gets around to eagerly *allocating* the FPU context and dropping CR0.TS usage entirely, then even that won't be enough unless we do my suggesting of deferring FPU restore to prepare_exit_to_usermode. (Doing that will make all of this much, much more sane.) --Andy ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-12 1:16 ` Andy Lutomirski @ 2016-02-12 17:00 ` Borislav Petkov 2016-02-15 19:14 ` Borislav Petkov 2016-02-15 19:05 ` Borislav Petkov 1 sibling, 1 reply; 15+ messages in thread From: Borislav Petkov @ 2016-02-12 17:00 UTC (permalink / raw) To: Andy Lutomirski; +Cc: x86-ml, lkml On Thu, Feb 11, 2016 at 05:16:00PM -0800, Andy Lutomirski wrote: > fpu__drop(fpu); > if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) || > __copy_from_user(&env, buf, sizeof(env))) { > fpstate_init(&fpu->state); > err = -1; > } else { > sanitize_restored_xstate(tsk, &env, xfeatures, fx_only); > } > > fpu->fpstate_active = 1; > > <-- preempted right here >From IRC: <amluto> you could try sticking a schedule() right after that fpu->fpstate_active = 1; line > if (use_eager_fpu()) { > preempt_disable(); > fpu__restore(fpu); > preempt_enable(); Something for me to try when I get a chance. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-12 17:00 ` Borislav Petkov @ 2016-02-15 19:14 ` Borislav Petkov 2016-02-16 2:25 ` Andy Lutomirski 0 siblings, 1 reply; 15+ messages in thread From: Borislav Petkov @ 2016-02-15 19:14 UTC (permalink / raw) To: Andy Lutomirski; +Cc: x86-ml, lkml On Fri, Feb 12, 2016 at 06:00:10PM +0100, Borislav Petkov wrote: > Something for me to try when I get a chance. Ok, so I wanted to know what happens in detail. Here's some ftracing (debug patch at the end). Now pay attention to this udevadm thing [ 3.816977] rcu_pree-7 0d..2 4058241us : __switch_to: prev: rcu_preempt <-> next: udevadm [ 3.816977] rcu_pree-7 0d..2 4058241us : __switch_to: set ->fpregs_active [ 3.816977] udevadm-982 0.... 4058258us : __fpu__restore_sig: fpregs_active 0, f443d7c0 We're in __fpu__restore_sig() about to call schedule() [ 3.816977] udevadm-982 0d..2 4058260us : __switch_to: prev: udevadm <-> next: usb_id [ 3.816977] udevadm-982 0d..2 4058260us : __switch_to: set ->fpregs_active [ 3.816977] usb_id-987 0d..2 4059684us : __switch_to: prev: usb_id <-> next: udevd [ 3.816977] usb_id-987 0d..2 4059685us : __switch_to: set ->fpregs_active [ 3.816977] udevd-843 0d..2 4059697us : __switch_to: prev: udevd <-> next: udevd [ 3.816977] udevd-843 0d..2 4059697us : __switch_to: set ->fpregs_active [ 3.816977] alsa-uti-989 0d..2 4060452us : __switch_to: prev: alsa-utils <-> next: udevd [ 3.816977] alsa-uti-989 0d..2 4060452us : __switch_to: set ->fpregs_active [ 3.816977] udevd-840 0d..2 4060521us : __switch_to: prev: udevd <-> next: udevd [ 3.816977] udevd-840 0d..2 4060522us : __switch_to: set ->fpregs_active [ 3.816977] udevd-829 0d..2 4060557us : __switch_to: prev: udevd <-> next: udevd [ 3.816977] udevd-829 0d..2 4060558us : __switch_to: set ->fpregs_active [ 3.816977] udevd-840 0d..2 4060862us : __switch_to: prev: udevd <-> next: blkid [ 3.816977] udevd-840 0d..2 4060862us : __switch_to: set ->fpregs_active [ 3.816977] blkid-985 0d..2 4061148us : __switch_to: prev: blkid <-> next: udevadm [ 3.816977] blkid-985 0d..2 4061148us : __switch_to: set ->fpregs_active Now we're switching back to udevadm which is @next_p of __switch_to(). There we do: fpu_switch = switch_fpu_prepare(prev_fpu, next_fpu, cpu); which does: /* Don't change CR0.TS if we just switch! */ if (fpu.preload) { new_fpu->counter++; __fpregs_activate(new_fpu); prefetch(&new_fpu->state); __fpregs_activate() sets ->fpregs_active of @new_fpu, i.e. udevadm's one. [ 3.816977] udevadm-982 0.... 4061149us : __fpu__restore_sig: after schedule: fpregs_active: 1 f443d7c0 __fpu__restore_sig() -> fpu__restore() sets ->fpregs_active again. [ 3.816977] udevadm-982 0.N.1 4185386us : fpu__restore: WARN: fpu: f443d7c0 Boom! [ 3.816977] udevadm-982 0.N.1 4185392us : <stack trace> [ 3.816977] => fpu__restore [ 3.816977] => __fpu__restore_sig [ 3.816977] => fpu__restore_sig [ 3.816977] => restore_sigcontext [ 3.816977] => sys_sigreturn [ 3.816977] => do_syscall_32_irqs_on [ 3.816977] => restore_all [ 3.816977] --------------------------------- [ 3.816977] Kernel Offset: disabled [ 3.816977] ---[ end Kernel panic - not syncing: Outta here... So yeah, we probably should enlarge the preemption-off region to contain ->fpstate_active. Here's what you basically suggested but with a *looot* of explanatory text. Which might be really wrong or completely unparseable or both. So holler what should be changed. Thanks! --- From: Borislav Petkov <bp@suse.de> Date: Mon, 15 Feb 2016 19:50:33 +0100 Subject: [RFC PATCH] x86/FPU: Fix double FPU regs activation On the entry_INT80_32->do_syscall_32_irqs_on path on 32-bit we run with interrupts enabled. And it can happen that we get preempted right after setting ->fpstate_active in a task's FPU. After we get preempted, we switch between tasks merrily and eventually are about to switch to that task above whose ->fpstate_active we set. We enter __switch_to() and do switch_fpu_prepare(). Our task gets ->fpregs_active set, we find ourselves back on the call stack below and especially in __fpu__restore_sig() which sets ->fpregs_active again. Leading to that whoops below. So let's enlarge the preemption-off region so that we set ->fpstate_active with preemption disabled and thus not trigger fpu.preload: switch_fpu_prepare ... fpu.preload = static_cpu_has(X86_FEATURE_FPU) && new_fpu->fpstate_active && ^^^^^^^^^^^^^^^^^^^^^^ prematurely. WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support emp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586 sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core 211 lpc_ich mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd battery soundcore led_class ac thermal CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1 Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012 00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0 f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4 Call Trace: dump_stack warn_slowpath_common ? fpu__restore ? fpu__restore warn_slowpath_null fpu__restore __fpu__restore_sig fpu__restore_sig restore_sigcontext sys_sigreturn do_syscall_32_irqs_on entry_INT80_32 Suggested-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Borislav Petkov <bp@suse.de> --- arch/x86/kernel/fpu/signal.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c index 31c6a60505e6..408e5a1c6fdd 100644 --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -316,12 +316,11 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) sanitize_restored_xstate(tsk, &env, xfeatures, fx_only); } + preempt_disable(); fpu->fpstate_active = 1; - if (use_eager_fpu()) { - preempt_disable(); + if (use_eager_fpu()) fpu__restore(fpu); - preempt_enable(); - } + preempt_enable(); return err; } else { -- 2.3.5 Tracing patch: --- diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h index a2124343edf5..2cbc3bf34928 100644 --- a/arch/x86/include/asm/fpu/internal.h +++ b/arch/x86/include/asm/fpu/internal.h @@ -527,9 +527,16 @@ static inline void __fpregs_deactivate(struct fpu *fpu) /* Must be paired with a 'clts' (fpregs_activate_hw()) before! */ static inline void __fpregs_activate(struct fpu *fpu) { - WARN_ON_FPU(fpu->fpregs_active); + if (WARN_ON_FPU(fpu->fpregs_active)) { + trace_printk("WARN: fpu: %p\n", fpu); + trace_dump_stack(0); + tracing_off(); + panic("Outta here...\n"); + } fpu->fpregs_active = 1; + trace_printk("set ->fpregs_active\n"); + this_cpu_write(fpu_fpregs_owner_ctx, fpu); } diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c index 31c6a60505e6..bb40f02cdfdd 100644 --- a/arch/x86/kernel/fpu/signal.c +++ b/arch/x86/kernel/fpu/signal.c @@ -317,6 +317,14 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size) } fpu->fpstate_active = 1; + + trace_printk("fpregs_active %d, %p\n", fpu->fpregs_active, fpu); + + schedule(); + + trace_printk("after schedule: fpregs_active: %d %p\n", + fpu->fpregs_active, fpu); + if (use_eager_fpu()) { preempt_disable(); fpu__restore(fpu); diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c index 9f950917528b..ce768c728f38 100644 --- a/arch/x86/kernel/process_32.c +++ b/arch/x86/kernel/process_32.c @@ -249,6 +249,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p) struct tss_struct *tss = &per_cpu(cpu_tss, cpu); fpu_switch_t fpu_switch; + trace_printk("prev: %s <-> next: %s\n", prev_p->comm, next_p->comm); + /* never put a printk in __switch_to... printk() calls wake_up*() indirectly */ fpu_switch = switch_fpu_prepare(prev_fpu, next_fpu, cpu); -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-15 19:14 ` Borislav Petkov @ 2016-02-16 2:25 ` Andy Lutomirski 2016-02-17 8:16 ` Ingo Molnar 0 siblings, 1 reply; 15+ messages in thread From: Andy Lutomirski @ 2016-02-16 2:25 UTC (permalink / raw) To: Borislav Petkov; +Cc: linux-kernel, X86 ML On Feb 15, 2016 12:14 PM, "Borislav Petkov" <bp@alien8.de> wrote: > > --- > From: Borislav Petkov <bp@suse.de> > Date: Mon, 15 Feb 2016 19:50:33 +0100 > Subject: [RFC PATCH] x86/FPU: Fix double FPU regs activation > > On the entry_INT80_32->do_syscall_32_irqs_on path on 32-bit we run with > interrupts enabled. I would change this a little bit. sys_sigreturn calls fpu__restore_sig with interrupts enabled. When restoring a 32-bit signal frame, it can happen that... > And it can happen that we get preempted right after > setting ->fpstate_active in a task's FPU. > > After we get preempted, we switch between tasks merrily and eventually > are about to switch to that task above whose ->fpstate_active we > set. We enter __switch_to() and do switch_fpu_prepare(). Our task gets > ->fpregs_active set, we find ourselves back on the call stack below and > especially in __fpu__restore_sig() which sets ->fpregs_active again. > > Leading to that whoops below. --Andy ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-16 2:25 ` Andy Lutomirski @ 2016-02-17 8:16 ` Ingo Molnar 2016-02-17 9:29 ` Borislav Petkov 0 siblings, 1 reply; 15+ messages in thread From: Ingo Molnar @ 2016-02-17 8:16 UTC (permalink / raw) To: Andy Lutomirski; +Cc: Borislav Petkov, linux-kernel, X86 ML * Andy Lutomirski <luto@amacapital.net> wrote: > On Feb 15, 2016 12:14 PM, "Borislav Petkov" <bp@alien8.de> wrote: > > > > --- > > From: Borislav Petkov <bp@suse.de> > > Date: Mon, 15 Feb 2016 19:50:33 +0100 > > Subject: [RFC PATCH] x86/FPU: Fix double FPU regs activation > > > > On the entry_INT80_32->do_syscall_32_irqs_on path on 32-bit we run with > > interrupts enabled. > > I would change this a little bit. > > sys_sigreturn calls fpu__restore_sig with interrupts enabled. When > restoring a 32-bit signal frame, it can happen that... > > > And it can happen that we get preempted right after > > setting ->fpstate_active in a task's FPU. > > > > After we get preempted, we switch between tasks merrily and eventually > > are about to switch to that task above whose ->fpstate_active we > > set. We enter __switch_to() and do switch_fpu_prepare(). Our task gets > > ->fpregs_active set, we find ourselves back on the call stack below and > > especially in __fpu__restore_sig() which sets ->fpregs_active again. > > > > Leading to that whoops below. So I'm wondering why this started triggering only now. Is this a pre-existing bug that somehow got triggered via: 58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs ? If yes then we need a plausible theory of how that never triggered on modern Intel CPUs that had eagerfpu enabled for years. Or perhaps was it caused by one of the other changes in tip:x86/fpu: c6ab109f7e0e x86/fpu: Speed up lazy FPU restores slightly a20d7297045f x86/fpu: Fold fpu_copy() into fpu__copy() 5ed73f40735c x86/fpu: Fix FNSAVE usage in eagerfpu mode 4ecd16ec7059 x86/fpu: Fix math emulation in eager fpu mode ? Which would make this a recently introduced regression. Thanks, Ingo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-17 8:16 ` Ingo Molnar @ 2016-02-17 9:29 ` Borislav Petkov 2016-02-17 9:35 ` Ingo Molnar 2016-02-17 17:52 ` Andy Lutomirski 0 siblings, 2 replies; 15+ messages in thread From: Borislav Petkov @ 2016-02-17 9:29 UTC (permalink / raw) To: Ingo Molnar; +Cc: Andy Lutomirski, linux-kernel, X86 ML On Wed, Feb 17, 2016 at 09:16:46AM +0100, Ingo Molnar wrote: > So I'm wondering why this started triggering only now. Is this a pre-existing bug > that somehow got triggered via: > > 58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs > > ? Well, that's an interesting question. See, the thing is, I triggered this only *once* by accident and I haven't seen it ever since. The "reliable" "reproducer" I used to debug this was Andy's suggestion to stick a schedule() in __fpu__restore_sig(). So the answer to that question is not easy. BUT(!), regardless, the bug still needs to be fixed because my tracing here https://lkml.kernel.org/r/20160215191422.GB32716@pd.tnic showed that getting preempted after setting fpu->fpstate_active = 1; leads to the WARN. Because - and please doublecheck me on that - when we're in __switch_to() and the task which already has ->fpstate_active set and it is the next task to which we're going to switch to, when it enters switch_fpu_prepare(), it does: fpu.preload = static_cpu_has(X86_FEATURE_FPU) && new_fpu->fpstate_active && ^^^^^^^^^^^^^^^^^^^^^^^ so that fpu.preload is set now. A bit later in that same function: /* Don't change CR0.TS if we just switch! */ if (fpu.preload) { new_fpu->counter++; __fpregs_activate(new_fpu); ^^^^^^^^^^^^^^^^^ ->fpregs_active gets set here and when the task returns to __fpu__restore_sig(), fpu__restore() sets it again, leading to the WARN. Mind you, this happens on 32-bit only because there we sigreturn with irqs enabled. Look at the call trace. > If yes then we need a plausible theory of how that never triggered on > modern Intel CPUs that had eagerfpu enabled for years. AFAICT, it triggers - and the window is very small at that - only on 32-bit. If at all. > Or perhaps was it caused by one of the other changes in tip:x86/fpu: > > c6ab109f7e0e x86/fpu: Speed up lazy FPU restores slightly > a20d7297045f x86/fpu: Fold fpu_copy() into fpu__copy() > 5ed73f40735c x86/fpu: Fix FNSAVE usage in eagerfpu mode > 4ecd16ec7059 x86/fpu: Fix math emulation in eager fpu mode > > ? I can certainly try to test all those but I don't have a reliable reproducer. The only thing I could do is check out each of those commits and stick a schedule() in __fpu__restore_sig() and see what happens. But if my analysis above is right, none of those would matter because of the mechanism of how the warn happens... -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-17 9:29 ` Borislav Petkov @ 2016-02-17 9:35 ` Ingo Molnar 2016-02-17 10:31 ` Borislav Petkov 2016-02-17 17:52 ` Andy Lutomirski 1 sibling, 1 reply; 15+ messages in thread From: Ingo Molnar @ 2016-02-17 9:35 UTC (permalink / raw) To: Borislav Petkov; +Cc: Andy Lutomirski, linux-kernel, X86 ML * Borislav Petkov <bp@alien8.de> wrote: > On Wed, Feb 17, 2016 at 09:16:46AM +0100, Ingo Molnar wrote: > > So I'm wondering why this started triggering only now. Is this a pre-existing bug > > that somehow got triggered via: > > > > 58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs > > > > ? > > Well, that's an interesting question. See, the thing is, I triggered > this only *once* by accident and I haven't seen it ever since. > > The "reliable" "reproducer" I used to debug this was Andy's suggestion > to stick a schedule() in __fpu__restore_sig(). > > So the answer to that question is not easy. > > BUT(!), regardless, the bug still needs to be fixed because my tracing > here The fix is absolutely needed, I just would like deeper analysis about how it wasn't seen before. > > If yes then we need a plausible theory of how that never triggered on modern > > Intel CPUs that had eagerfpu enabled for years. > > AFAICT, it triggers - and the window is very small at that - only on > 32-bit. If at all. So it probably triggers on vanilla v4.4 (or v4.5-rc4) as well, with no recent FPU bits applied? > I can certainly try to test all those but I don't have a reliable reproducer. > The only thing I could do is check out each of those commits and stick a > schedule() in __fpu__restore_sig() and see what happens. > > But if my analysis above is right, none of those would matter because of the > mechanism of how the warn happens... So if you stick a schedule() into vanilla and it triggers then I think we can declare it an existing bug. (and then the fix also needs Cc: stable) Thanks, Ingo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-17 9:35 ` Ingo Molnar @ 2016-02-17 10:31 ` Borislav Petkov 2016-02-17 11:06 ` Ingo Molnar 2016-02-17 11:41 ` Borislav Petkov 0 siblings, 2 replies; 15+ messages in thread From: Borislav Petkov @ 2016-02-17 10:31 UTC (permalink / raw) To: Ingo Molnar; +Cc: Andy Lutomirski, linux-kernel, X86 ML On Wed, Feb 17, 2016 at 10:35:12AM +0100, Ingo Molnar wrote: > So it probably triggers on vanilla v4.4 (or v4.5-rc4) as well, with no > recent FPU bits applied? Nope, 4.4 doesn't trigger. Lemme try some of those other commits... -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-17 10:31 ` Borislav Petkov @ 2016-02-17 11:06 ` Ingo Molnar 2016-02-17 11:41 ` Borislav Petkov 1 sibling, 0 replies; 15+ messages in thread From: Ingo Molnar @ 2016-02-17 11:06 UTC (permalink / raw) To: Borislav Petkov; +Cc: Andy Lutomirski, linux-kernel, X86 ML * Borislav Petkov <bp@alien8.de> wrote: > On Wed, Feb 17, 2016 at 10:35:12AM +0100, Ingo Molnar wrote: > > So it probably triggers on vanilla v4.4 (or v4.5-rc4) as well, with no > > recent FPU bits applied? > > Nope, 4.4 doesn't trigger. Lemme try some of those other commits... My guess is: 58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs but could be one of the others as well ... Thanks, Ingo ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-17 10:31 ` Borislav Petkov 2016-02-17 11:06 ` Ingo Molnar @ 2016-02-17 11:41 ` Borislav Petkov 1 sibling, 0 replies; 15+ messages in thread From: Borislav Petkov @ 2016-02-17 11:41 UTC (permalink / raw) To: Ingo Molnar; +Cc: Andy Lutomirski, linux-kernel, X86 ML On Wed, Feb 17, 2016 at 11:31:41AM +0100, Borislav Petkov wrote: > On Wed, Feb 17, 2016 at 10:35:12AM +0100, Ingo Molnar wrote: > > So it probably triggers on vanilla v4.4 (or v4.5-rc4) as well, with no > > recent FPU bits applied? > > Nope, 4.4 doesn't trigger. Lemme try some of those other commits... Results: tip/x86/fpu: 58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs - BAD c6ab109f7e0e x86/fpu: Speed up lazy FPU restores slightly - BAD a20d7297045f x86/fpu: Fold fpu_copy() into fpu__copy() - BAD 5ed73f40735c x86/fpu: Fix FNSAVE usage in eagerfpu mode - BAD 4ecd16ec7059 x86/fpu: Fix math emulation in eager fpu mode - BAD 765bdb406de4 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm") - BAD ... v4.4: GOOD so it must be something else causing it AFAICT. I'll continue trying the -rc's and the tip/master merges to see what causes it. Let me know if you have a better idea. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-17 9:29 ` Borislav Petkov 2016-02-17 9:35 ` Ingo Molnar @ 2016-02-17 17:52 ` Andy Lutomirski 1 sibling, 0 replies; 15+ messages in thread From: Andy Lutomirski @ 2016-02-17 17:52 UTC (permalink / raw) To: Borislav Petkov; +Cc: X86 ML, linux-kernel, Ingo Molnar On Feb 17, 2016 1:29 AM, "Borislav Petkov" <bp@alien8.de> wrote: > > On Wed, Feb 17, 2016 at 09:16:46AM +0100, Ingo Molnar wrote: > > So I'm wondering why this started triggering only now. Is this a pre-existing bug > > that somehow got triggered via: > > > > 58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs > > > > ? > > Well, that's an interesting question. See, the thing is, I triggered > this only *once* by accident and I haven't seen it ever since. > > The "reliable" "reproducer" I used to debug this was Andy's suggestion > to stick a schedule() in __fpu__restore_sig(). > > So the answer to that question is not easy. > > BUT(!), regardless, the bug still needs to be fixed because my tracing > here > > https://lkml.kernel.org/r/20160215191422.GB32716@pd.tnic > > showed that getting preempted after setting > > fpu->fpstate_active = 1; > > leads to the WARN. Because - and please doublecheck me on that - when > we're in __switch_to() and the task which already has ->fpstate_active > set and it is the next task to which we're going to switch to, when it > enters switch_fpu_prepare(), it does: > > fpu.preload = static_cpu_has(X86_FEATURE_FPU) && > new_fpu->fpstate_active && > ^^^^^^^^^^^^^^^^^^^^^^^ > > so that fpu.preload is set now. > > A bit later in that same function: > > /* Don't change CR0.TS if we just switch! */ > if (fpu.preload) { > new_fpu->counter++; > __fpregs_activate(new_fpu); > ^^^^^^^^^^^^^^^^^ > > ->fpregs_active gets set here and when the task returns to > __fpu__restore_sig(), fpu__restore() sets it again, leading to the WARN. > > Mind you, this happens on 32-bit only because there we sigreturn with > irqs enabled. Look at the call trace. Not quite. IRQs are on in the 64-bit case, too, but the problematic code doesn't run because the 64-bit sigreturn case doesn't need to fiddle with the fpu layout. --Andy ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-12 1:16 ` Andy Lutomirski 2016-02-12 17:00 ` Borislav Petkov @ 2016-02-15 19:05 ` Borislav Petkov 1 sibling, 0 replies; 15+ messages in thread From: Borislav Petkov @ 2016-02-15 19:05 UTC (permalink / raw) To: Andy Lutomirski; +Cc: x86-ml, lkml On Thu, Feb 11, 2016 at 05:16:00PM -0800, Andy Lutomirski wrote: > Are you running 32-bit userspace by any chance? Sure, that's a 32-bit kernel testing partition. :) > I'm guessing you're hitting this in __fpu_restore_sig: Yeah, I was looking at that too. > fpu__drop(fpu); > if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) || > __copy_from_user(&env, buf, sizeof(env))) { > fpstate_init(&fpu->state); > err = -1; > } else { > sanitize_restored_xstate(tsk, &env, xfeatures, fx_only); > } > > fpu->fpstate_active = 1; > > <-- preempted right here Yeah, that could explain why I'm seeing it. > if (use_eager_fpu()) { > preempt_disable(); > fpu__restore(fpu); > preempt_enable(); > } > > I don't see why this code deserves to work. If I'm right, it can be > fixed by pulling the preempt_disable out of the if (use_eager_fpu()) > to right above the fpstate_active = 1 line. Don't bother trying to > optimize the !use_eager_fpu() case. Right. > Once someone gets around to eagerly *allocating* the FPU context and > dropping CR0.TS usage entirely, then even that won't be enough unless > we do my suggesting of deferring FPU restore to > prepare_exit_to_usermode. (Doing that will make all of this much, > much more sane.) Sounds good to me. So the thing with this issue is, is that I don't have a reproducer yet. It happened randomly. So let me ask it this way: can anything go wrong if we pull up the preemption disabled region? I mean, do we even care about the !eager FPU case? I'd much prefer that vs FPU state corruption... -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() 2016-02-11 23:47 ` Andy Lutomirski 2016-02-12 1:16 ` Andy Lutomirski @ 2016-02-12 11:17 ` Borislav Petkov 1 sibling, 0 replies; 15+ messages in thread From: Borislav Petkov @ 2016-02-12 11:17 UTC (permalink / raw) To: Andy Lutomirski; +Cc: x86-ml, lkml On Thu, Feb 11, 2016 at 03:47:12PM -0800, Andy Lutomirski wrote: > Can you send all the fpu info that the kernel prints really early when it boots? $ dmesg | grep -i fpu [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [ 0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers' [ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format. [ 0.000000] x86/fpu: Using 'eager' FPU context switches. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2016-02-17 17:53 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-02-11 19:27 WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() Borislav Petkov 2016-02-11 23:47 ` Andy Lutomirski 2016-02-12 1:16 ` Andy Lutomirski 2016-02-12 17:00 ` Borislav Petkov 2016-02-15 19:14 ` Borislav Petkov 2016-02-16 2:25 ` Andy Lutomirski 2016-02-17 8:16 ` Ingo Molnar 2016-02-17 9:29 ` Borislav Petkov 2016-02-17 9:35 ` Ingo Molnar 2016-02-17 10:31 ` Borislav Petkov 2016-02-17 11:06 ` Ingo Molnar 2016-02-17 11:41 ` Borislav Petkov 2016-02-17 17:52 ` Andy Lutomirski 2016-02-15 19:05 ` Borislav Petkov 2016-02-12 11:17 ` Borislav Petkov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).