linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
@ 2016-02-11 19:27 Borislav Petkov
  2016-02-11 23:47 ` Andy Lutomirski
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2016-02-11 19:27 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: x86-ml, lkml

Hey Andy,

can you make any sense of this:

[   90.573923] ------------[ cut here ]------------
[   90.574977] WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
[   90.576108] Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586 snd_hda_codec sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core i2c_i801 cfg80211 lpc_ich mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd battery soundcore led_class ac thermal
[   90.580570] CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1
[   90.581380] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
[   90.582325]  00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb
[   90.583179]  c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0
[   90.584142]  f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4
[   90.585002] Call Trace:
[   90.585854]  [<c12cce56>] dump_stack+0x5f/0x89
[   90.586703]  [<c10567fb>] warn_slowpath_common+0x8b/0xc0
[   90.587559]  [<c1025ab0>] ? fpu__restore+0x90/0x130
[   90.588520]  [<c1025ab0>] ? fpu__restore+0x90/0x130
[   90.589353]  [<c10568d2>] warn_slowpath_null+0x22/0x30
[   90.590175]  [<c1025ab0>] fpu__restore+0x90/0x130
[   90.590993]  [<c1027098>] __fpu__restore_sig+0x268/0x4c0
[   90.591816]  [<c102751f>] fpu__restore_sig+0x2f/0x50
[   90.592636]  [<c101a6c9>] restore_sigcontext+0xe9/0x110
[   90.593449]  [<c101af3c>] sys_sigreturn+0x9c/0xb0
[   90.594263]  [<c1001bd9>] do_syscall_32_irqs_on+0x59/0x340
[   90.595079]  [<c169979d>] entry_INT80_32+0x31/0x31
[   90.595922] ---[ end trace be617bef2982f473 ]---

This is rc3 + latest tip/master and it happened when I did "make
mrproper" in the kernel repo.

>From a quick stare, it looks to me we're running do_syscall_32_irqs_on()
with IRQs on, sys_sigreturn() does current_pt_regs() but
__fpu__restore_sig() derefs current again and could be that that second
"current" is another current which already has ->fpregs_active set ?

FPU + signal handling code in a single backtrace. My favourite!

:-\

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-11 19:27 WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() Borislav Petkov
@ 2016-02-11 23:47 ` Andy Lutomirski
  2016-02-12  1:16   ` Andy Lutomirski
  2016-02-12 11:17   ` Borislav Petkov
  0 siblings, 2 replies; 15+ messages in thread
From: Andy Lutomirski @ 2016-02-11 23:47 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: x86-ml, lkml

On Thu, Feb 11, 2016 at 11:27 AM, Borislav Petkov <bp@alien8.de> wrote:
> Hey Andy,
>
> can you make any sense of this:
>
> [   90.573923] ------------[ cut here ]------------
> [   90.574977] WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
> [   90.576108] Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586 snd_hda_codec sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core i2c_i801 cfg80211 lpc_ich mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd battery soundcore led_class ac thermal
> [   90.580570] CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1
> [   90.581380] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
> [   90.582325]  00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb
> [   90.583179]  c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0
> [   90.584142]  f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4
> [   90.585002] Call Trace:
> [   90.585854]  [<c12cce56>] dump_stack+0x5f/0x89
> [   90.586703]  [<c10567fb>] warn_slowpath_common+0x8b/0xc0
> [   90.587559]  [<c1025ab0>] ? fpu__restore+0x90/0x130
> [   90.588520]  [<c1025ab0>] ? fpu__restore+0x90/0x130
> [   90.589353]  [<c10568d2>] warn_slowpath_null+0x22/0x30
> [   90.590175]  [<c1025ab0>] fpu__restore+0x90/0x130
> [   90.590993]  [<c1027098>] __fpu__restore_sig+0x268/0x4c0
> [   90.591816]  [<c102751f>] fpu__restore_sig+0x2f/0x50
> [   90.592636]  [<c101a6c9>] restore_sigcontext+0xe9/0x110
> [   90.593449]  [<c101af3c>] sys_sigreturn+0x9c/0xb0
> [   90.594263]  [<c1001bd9>] do_syscall_32_irqs_on+0x59/0x340
> [   90.595079]  [<c169979d>] entry_INT80_32+0x31/0x31
> [   90.595922] ---[ end trace be617bef2982f473 ]---
>
> This is rc3 + latest tip/master and it happened when I did "make
> mrproper" in the kernel repo.
>
> From a quick stare, it looks to me we're running do_syscall_32_irqs_on()
> with IRQs on, sys_sigreturn() does current_pt_regs() but
> __fpu__restore_sig() derefs current again and could be that that second
> "current" is another current which already has ->fpregs_active set ?
>
> FPU + signal handling code in a single backtrace. My favourite!
>
> :-\

Ugh.

Can you send all the fpu info that the kernel prints really early when it boots?

--Andy

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-11 23:47 ` Andy Lutomirski
@ 2016-02-12  1:16   ` Andy Lutomirski
  2016-02-12 17:00     ` Borislav Petkov
  2016-02-15 19:05     ` Borislav Petkov
  2016-02-12 11:17   ` Borislav Petkov
  1 sibling, 2 replies; 15+ messages in thread
From: Andy Lutomirski @ 2016-02-12  1:16 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: x86-ml, lkml

On Thu, Feb 11, 2016 at 3:47 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Feb 11, 2016 at 11:27 AM, Borislav Petkov <bp@alien8.de> wrote:
>> Hey Andy,
>>
>> can you make any sense of this:
>>
>> [   90.573923] ------------[ cut here ]------------
>> [   90.574977] WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
>> [   90.576108] Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586 snd_hda_codec sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core i2c_i801 cfg80211 lpc_ich mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd battery soundcore led_class ac thermal
>> [   90.580570] CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1
>> [   90.581380] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
>> [   90.582325]  00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb
>> [   90.583179]  c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0
>> [   90.584142]  f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4
>> [   90.585002] Call Trace:
>> [   90.585854]  [<c12cce56>] dump_stack+0x5f/0x89
>> [   90.586703]  [<c10567fb>] warn_slowpath_common+0x8b/0xc0
>> [   90.587559]  [<c1025ab0>] ? fpu__restore+0x90/0x130
>> [   90.588520]  [<c1025ab0>] ? fpu__restore+0x90/0x130
>> [   90.589353]  [<c10568d2>] warn_slowpath_null+0x22/0x30
>> [   90.590175]  [<c1025ab0>] fpu__restore+0x90/0x130
>> [   90.590993]  [<c1027098>] __fpu__restore_sig+0x268/0x4c0
>> [   90.591816]  [<c102751f>] fpu__restore_sig+0x2f/0x50
>> [   90.592636]  [<c101a6c9>] restore_sigcontext+0xe9/0x110
>> [   90.593449]  [<c101af3c>] sys_sigreturn+0x9c/0xb0
>> [   90.594263]  [<c1001bd9>] do_syscall_32_irqs_on+0x59/0x340
>> [   90.595079]  [<c169979d>] entry_INT80_32+0x31/0x31
>> [   90.595922] ---[ end trace be617bef2982f473 ]---
>>
>> This is rc3 + latest tip/master and it happened when I did "make
>> mrproper" in the kernel repo.
>>
>> From a quick stare, it looks to me we're running do_syscall_32_irqs_on()
>> with IRQs on, sys_sigreturn() does current_pt_regs() but
>> __fpu__restore_sig() derefs current again and could be that that second
>> "current" is another current which already has ->fpregs_active set ?
>>
>> FPU + signal handling code in a single backtrace. My favourite!
>>
>> :-\
>
> Ugh.
>
> Can you send all the fpu info that the kernel prints really early when it boots?
>

Are you running 32-bit userspace by any chance?  I'm guessing you're
hitting this in __fpu_restore_sig:

        fpu__drop(fpu);
        if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) ||
            __copy_from_user(&env, buf, sizeof(env))) {
            fpstate_init(&fpu->state);
            err = -1;
        } else {
            sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
        }

        fpu->fpstate_active = 1;

<-- preempted right here

        if (use_eager_fpu()) {
            preempt_disable();
            fpu__restore(fpu);
            preempt_enable();
        }

I don't see why this code deserves to work.  If I'm right, it can be
fixed by pulling the preempt_disable out of the if (use_eager_fpu())
to right above the fpstate_active = 1 line.  Don't bother trying to
optimize the !use_eager_fpu() case.

Once someone gets around to eagerly *allocating* the FPU context and
dropping CR0.TS usage entirely, then even that won't be enough unless
we do my suggesting of deferring FPU restore to
prepare_exit_to_usermode.  (Doing that will make all of this much,
much more sane.)


--Andy

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-11 23:47 ` Andy Lutomirski
  2016-02-12  1:16   ` Andy Lutomirski
@ 2016-02-12 11:17   ` Borislav Petkov
  1 sibling, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2016-02-12 11:17 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: x86-ml, lkml

On Thu, Feb 11, 2016 at 03:47:12PM -0800, Andy Lutomirski wrote:
> Can you send all the fpu info that the kernel prints really early when it boots?

$ dmesg | grep -i fpu
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers'
[    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[    0.000000] x86/fpu: Using 'eager' FPU context switches.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-12  1:16   ` Andy Lutomirski
@ 2016-02-12 17:00     ` Borislav Petkov
  2016-02-15 19:14       ` Borislav Petkov
  2016-02-15 19:05     ` Borislav Petkov
  1 sibling, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2016-02-12 17:00 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: x86-ml, lkml

On Thu, Feb 11, 2016 at 05:16:00PM -0800, Andy Lutomirski wrote:
>         fpu__drop(fpu);
>         if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) ||
>             __copy_from_user(&env, buf, sizeof(env))) {
>             fpstate_init(&fpu->state);
>             err = -1;
>         } else {
>             sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
>         }
> 
>         fpu->fpstate_active = 1;
> 
> <-- preempted right here

>From IRC:

<amluto> you could try sticking a schedule() right after that fpu->fpstate_active = 1; line

>         if (use_eager_fpu()) {
>             preempt_disable();
>             fpu__restore(fpu);
>             preempt_enable();

Something for me to try when I get a chance.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-12  1:16   ` Andy Lutomirski
  2016-02-12 17:00     ` Borislav Petkov
@ 2016-02-15 19:05     ` Borislav Petkov
  1 sibling, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2016-02-15 19:05 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: x86-ml, lkml

On Thu, Feb 11, 2016 at 05:16:00PM -0800, Andy Lutomirski wrote:
> Are you running 32-bit userspace by any chance?

Sure, that's a 32-bit kernel testing partition. :)

> I'm guessing you're hitting this in __fpu_restore_sig:

Yeah, I was looking at that too.

>         fpu__drop(fpu);
>         if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) ||
>             __copy_from_user(&env, buf, sizeof(env))) {
>             fpstate_init(&fpu->state);
>             err = -1;
>         } else {
>             sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
>         }
> 
>         fpu->fpstate_active = 1;
> 
> <-- preempted right here

Yeah, that could explain why I'm seeing it.

>         if (use_eager_fpu()) {
>             preempt_disable();
>             fpu__restore(fpu);
>             preempt_enable();
>         }
> 
> I don't see why this code deserves to work.  If I'm right, it can be
> fixed by pulling the preempt_disable out of the if (use_eager_fpu())
> to right above the fpstate_active = 1 line.  Don't bother trying to
> optimize the !use_eager_fpu() case.

Right.

> Once someone gets around to eagerly *allocating* the FPU context and
> dropping CR0.TS usage entirely, then even that won't be enough unless
> we do my suggesting of deferring FPU restore to
> prepare_exit_to_usermode.  (Doing that will make all of this much,
> much more sane.)

Sounds good to me.

So the thing with this issue is, is that I don't have a reproducer yet.
It happened randomly.

So let me ask it this way: can anything go wrong if we pull up the
preemption disabled region? I mean, do we even care about the !eager FPU
case? I'd much prefer that vs FPU state corruption...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-12 17:00     ` Borislav Petkov
@ 2016-02-15 19:14       ` Borislav Petkov
  2016-02-16  2:25         ` Andy Lutomirski
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2016-02-15 19:14 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: x86-ml, lkml

On Fri, Feb 12, 2016 at 06:00:10PM +0100, Borislav Petkov wrote:
> Something for me to try when I get a chance.

Ok, so I wanted to know what happens in detail. Here's some ftracing
(debug patch at the end).

Now pay attention to this udevadm thing

[    3.816977] rcu_pree-7       0d..2 4058241us : __switch_to: prev: rcu_preempt <-> next: udevadm
[    3.816977] rcu_pree-7       0d..2 4058241us : __switch_to: set ->fpregs_active
[    3.816977]  udevadm-982     0.... 4058258us : __fpu__restore_sig: fpregs_active 0, f443d7c0

We're in __fpu__restore_sig() about to call schedule()

[    3.816977]  udevadm-982     0d..2 4058260us : __switch_to: prev: udevadm <-> next: usb_id
[    3.816977]  udevadm-982     0d..2 4058260us : __switch_to: set ->fpregs_active
[    3.816977]   usb_id-987     0d..2 4059684us : __switch_to: prev: usb_id <-> next: udevd
[    3.816977]   usb_id-987     0d..2 4059685us : __switch_to: set ->fpregs_active
[    3.816977]    udevd-843     0d..2 4059697us : __switch_to: prev: udevd <-> next: udevd
[    3.816977]    udevd-843     0d..2 4059697us : __switch_to: set ->fpregs_active
[    3.816977] alsa-uti-989     0d..2 4060452us : __switch_to: prev: alsa-utils <-> next: udevd
[    3.816977] alsa-uti-989     0d..2 4060452us : __switch_to: set ->fpregs_active
[    3.816977]    udevd-840     0d..2 4060521us : __switch_to: prev: udevd <-> next: udevd
[    3.816977]    udevd-840     0d..2 4060522us : __switch_to: set ->fpregs_active
[    3.816977]    udevd-829     0d..2 4060557us : __switch_to: prev: udevd <-> next: udevd
[    3.816977]    udevd-829     0d..2 4060558us : __switch_to: set ->fpregs_active
[    3.816977]    udevd-840     0d..2 4060862us : __switch_to: prev: udevd <-> next: blkid
[    3.816977]    udevd-840     0d..2 4060862us : __switch_to: set ->fpregs_active
[    3.816977]    blkid-985     0d..2 4061148us : __switch_to: prev: blkid <-> next: udevadm
[    3.816977]    blkid-985     0d..2 4061148us : __switch_to: set ->fpregs_active

Now we're switching back to udevadm which is @next_p of __switch_to().

There we do:

	fpu_switch = switch_fpu_prepare(prev_fpu, next_fpu, cpu);

which does:

                /* Don't change CR0.TS if we just switch! */
                if (fpu.preload) {
                        new_fpu->counter++;
                        __fpregs_activate(new_fpu);
                        prefetch(&new_fpu->state);

__fpregs_activate() sets ->fpregs_active of @new_fpu, i.e. udevadm's one.

[    3.816977]  udevadm-982     0.... 4061149us : __fpu__restore_sig: after schedule: fpregs_active: 1 f443d7c0

__fpu__restore_sig() -> fpu__restore() sets ->fpregs_active again.

[    3.816977]  udevadm-982     0.N.1 4185386us : fpu__restore: WARN: fpu: f443d7c0

Boom!

[    3.816977]  udevadm-982     0.N.1 4185392us : <stack trace>
[    3.816977]  => fpu__restore
[    3.816977]  => __fpu__restore_sig
[    3.816977]  => fpu__restore_sig
[    3.816977]  => restore_sigcontext
[    3.816977]  => sys_sigreturn
[    3.816977]  => do_syscall_32_irqs_on
[    3.816977]  => restore_all
[    3.816977] ---------------------------------
[    3.816977] Kernel Offset: disabled
[    3.816977] ---[ end Kernel panic - not syncing: Outta here...

So yeah, we probably should enlarge the preemption-off region to contain
->fpstate_active. Here's what you basically suggested but with a
*looot* of explanatory text. Which might be really wrong or completely
unparseable or both. So holler what should be changed.

Thanks!

---
From: Borislav Petkov <bp@suse.de>
Date: Mon, 15 Feb 2016 19:50:33 +0100
Subject: [RFC PATCH] x86/FPU: Fix double FPU regs activation

On the entry_INT80_32->do_syscall_32_irqs_on path on 32-bit we run with
interrupts enabled. And it can happen that we get preempted right after
setting ->fpstate_active in a task's FPU.

After we get preempted, we switch between tasks merrily and eventually
are about to switch to that task above whose ->fpstate_active we
set. We enter __switch_to() and do switch_fpu_prepare(). Our task gets
->fpregs_active set, we find ourselves back on the call stack below and
especially in __fpu__restore_sig() which sets ->fpregs_active again.

Leading to that whoops below.

So let's enlarge the preemption-off region so that we set ->fpstate_active with
preemption disabled and thus not trigger fpu.preload:

  switch_fpu_prepare

  ...

        fpu.preload = static_cpu_has(X86_FEATURE_FPU) &&
                      new_fpu->fpstate_active &&
		      ^^^^^^^^^^^^^^^^^^^^^^

prematurely.

  WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi
  snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support
  emp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel
  iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586
  sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd
  ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core 211 lpc_ich
  mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd
  battery soundcore led_class ac thermal
  CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1
  Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
   00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb
   c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0
   f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4
  Call Trace:
    dump_stack
    warn_slowpath_common
    ? fpu__restore
    ? fpu__restore
    warn_slowpath_null
    fpu__restore
    __fpu__restore_sig
    fpu__restore_sig
    restore_sigcontext
    sys_sigreturn
    do_syscall_32_irqs_on
    entry_INT80_32

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/signal.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 31c6a60505e6..408e5a1c6fdd 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -316,12 +316,11 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
 		}
 
+		preempt_disable();
 		fpu->fpstate_active = 1;
-		if (use_eager_fpu()) {
-			preempt_disable();
+		if (use_eager_fpu())
 			fpu__restore(fpu);
-			preempt_enable();
-		}
+		preempt_enable();
 
 		return err;
 	} else {
-- 
2.3.5



Tracing patch:
---
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a2124343edf5..2cbc3bf34928 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -527,9 +527,16 @@ static inline void __fpregs_deactivate(struct fpu *fpu)
 /* Must be paired with a 'clts' (fpregs_activate_hw()) before! */
 static inline void __fpregs_activate(struct fpu *fpu)
 {
-	WARN_ON_FPU(fpu->fpregs_active);
+	if (WARN_ON_FPU(fpu->fpregs_active)) {
+		trace_printk("WARN: fpu: %p\n", fpu);
+		trace_dump_stack(0);
+		tracing_off();
+		panic("Outta here...\n");
+	}
 
 	fpu->fpregs_active = 1;
+	trace_printk("set ->fpregs_active\n");
+
 	this_cpu_write(fpu_fpregs_owner_ctx, fpu);
 }
 
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 31c6a60505e6..bb40f02cdfdd 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -317,6 +317,14 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		}
 
 		fpu->fpstate_active = 1;
+
+		trace_printk("fpregs_active %d, %p\n", fpu->fpregs_active, fpu);
+
+		schedule();
+
+		trace_printk("after schedule: fpregs_active: %d %p\n",
+			     fpu->fpregs_active, fpu);
+
 		if (use_eager_fpu()) {
 			preempt_disable();
 			fpu__restore(fpu);
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 9f950917528b..ce768c728f38 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -249,6 +249,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	struct tss_struct *tss = &per_cpu(cpu_tss, cpu);
 	fpu_switch_t fpu_switch;
 
+	trace_printk("prev: %s <-> next: %s\n", prev_p->comm, next_p->comm);
+
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
 	fpu_switch = switch_fpu_prepare(prev_fpu, next_fpu, cpu);


-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-15 19:14       ` Borislav Petkov
@ 2016-02-16  2:25         ` Andy Lutomirski
  2016-02-17  8:16           ` Ingo Molnar
  0 siblings, 1 reply; 15+ messages in thread
From: Andy Lutomirski @ 2016-02-16  2:25 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: linux-kernel, X86 ML

On Feb 15, 2016 12:14 PM, "Borislav Petkov" <bp@alien8.de> wrote:
>
> ---
> From: Borislav Petkov <bp@suse.de>
> Date: Mon, 15 Feb 2016 19:50:33 +0100
> Subject: [RFC PATCH] x86/FPU: Fix double FPU regs activation
>
> On the entry_INT80_32->do_syscall_32_irqs_on path on 32-bit we run with
> interrupts enabled.

I would change this a little bit.

sys_sigreturn calls fpu__restore_sig with interrupts enabled.  When
restoring a 32-bit signal frame, it can happen that...

> And it can happen that we get preempted right after
> setting ->fpstate_active in a task's FPU.
>
> After we get preempted, we switch between tasks merrily and eventually
> are about to switch to that task above whose ->fpstate_active we
> set. We enter __switch_to() and do switch_fpu_prepare(). Our task gets
> ->fpregs_active set, we find ourselves back on the call stack below and
> especially in __fpu__restore_sig() which sets ->fpregs_active again.
>
> Leading to that whoops below.

--Andy

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-16  2:25         ` Andy Lutomirski
@ 2016-02-17  8:16           ` Ingo Molnar
  2016-02-17  9:29             ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2016-02-17  8:16 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Borislav Petkov, linux-kernel, X86 ML


* Andy Lutomirski <luto@amacapital.net> wrote:

> On Feb 15, 2016 12:14 PM, "Borislav Petkov" <bp@alien8.de> wrote:
> >
> > ---
> > From: Borislav Petkov <bp@suse.de>
> > Date: Mon, 15 Feb 2016 19:50:33 +0100
> > Subject: [RFC PATCH] x86/FPU: Fix double FPU regs activation
> >
> > On the entry_INT80_32->do_syscall_32_irqs_on path on 32-bit we run with
> > interrupts enabled.
> 
> I would change this a little bit.
> 
> sys_sigreturn calls fpu__restore_sig with interrupts enabled.  When
> restoring a 32-bit signal frame, it can happen that...
> 
> > And it can happen that we get preempted right after
> > setting ->fpstate_active in a task's FPU.
> >
> > After we get preempted, we switch between tasks merrily and eventually
> > are about to switch to that task above whose ->fpstate_active we
> > set. We enter __switch_to() and do switch_fpu_prepare(). Our task gets
> > ->fpregs_active set, we find ourselves back on the call stack below and
> > especially in __fpu__restore_sig() which sets ->fpregs_active again.
> >
> > Leading to that whoops below.

So I'm wondering why this started triggering only now. Is this a pre-existing bug 
that somehow got triggered via:

  58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs

? If yes then we need a plausible theory of how that never triggered on modern 
Intel CPUs that had eagerfpu enabled for years.

Or perhaps was it caused by one of the other changes in tip:x86/fpu:

  c6ab109f7e0e x86/fpu: Speed up lazy FPU restores slightly
  a20d7297045f x86/fpu: Fold fpu_copy() into fpu__copy()
  5ed73f40735c x86/fpu: Fix FNSAVE usage in eagerfpu mode
  4ecd16ec7059 x86/fpu: Fix math emulation in eager fpu mode

?

Which would make this a recently introduced regression.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-17  8:16           ` Ingo Molnar
@ 2016-02-17  9:29             ` Borislav Petkov
  2016-02-17  9:35               ` Ingo Molnar
  2016-02-17 17:52               ` Andy Lutomirski
  0 siblings, 2 replies; 15+ messages in thread
From: Borislav Petkov @ 2016-02-17  9:29 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andy Lutomirski, linux-kernel, X86 ML

On Wed, Feb 17, 2016 at 09:16:46AM +0100, Ingo Molnar wrote:
> So I'm wondering why this started triggering only now. Is this a pre-existing bug 
> that somehow got triggered via:
> 
>   58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs
> 
> ?

Well, that's an interesting question. See, the thing is, I triggered
this only *once* by accident and I haven't seen it ever since.

The "reliable" "reproducer" I used to debug this was Andy's suggestion
to stick a schedule() in __fpu__restore_sig().

So the answer to that question is not easy.

BUT(!), regardless, the bug still needs to be fixed because my tracing
here

https://lkml.kernel.org/r/20160215191422.GB32716@pd.tnic

showed that getting preempted after setting

	fpu->fpstate_active = 1;

leads to the WARN. Because - and please doublecheck me on that - when
we're in __switch_to() and the task which already has ->fpstate_active
set and it is the next task to which we're going to switch to, when it
enters switch_fpu_prepare(), it does:

        fpu.preload = static_cpu_has(X86_FEATURE_FPU) &&
                      new_fpu->fpstate_active &&
		      ^^^^^^^^^^^^^^^^^^^^^^^

so that fpu.preload is set now.

A bit later in that same function:

                /* Don't change CR0.TS if we just switch! */
                if (fpu.preload) {
                        new_fpu->counter++;
                        __fpregs_activate(new_fpu);
			^^^^^^^^^^^^^^^^^

->fpregs_active gets set here and when the task returns to
__fpu__restore_sig(), fpu__restore() sets it again, leading to the WARN.

Mind you, this happens on 32-bit only because there we sigreturn with
irqs enabled. Look at the call trace.

> If yes then we need a plausible theory of how that never triggered on
> modern Intel CPUs that had eagerfpu enabled for years.

AFAICT, it triggers - and the window is very small at that - only on
32-bit. If at all.

> Or perhaps was it caused by one of the other changes in tip:x86/fpu:
> 
>   c6ab109f7e0e x86/fpu: Speed up lazy FPU restores slightly
>   a20d7297045f x86/fpu: Fold fpu_copy() into fpu__copy()
>   5ed73f40735c x86/fpu: Fix FNSAVE usage in eagerfpu mode
>   4ecd16ec7059 x86/fpu: Fix math emulation in eager fpu mode
> 
> ?

I can certainly try to test all those but I don't have a reliable
reproducer. The only thing I could do is check out each of those commits
and stick a schedule() in __fpu__restore_sig() and see what happens.

But if my analysis above is right, none of those would matter because of
the mechanism of how the warn happens...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-17  9:29             ` Borislav Petkov
@ 2016-02-17  9:35               ` Ingo Molnar
  2016-02-17 10:31                 ` Borislav Petkov
  2016-02-17 17:52               ` Andy Lutomirski
  1 sibling, 1 reply; 15+ messages in thread
From: Ingo Molnar @ 2016-02-17  9:35 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Andy Lutomirski, linux-kernel, X86 ML


* Borislav Petkov <bp@alien8.de> wrote:

> On Wed, Feb 17, 2016 at 09:16:46AM +0100, Ingo Molnar wrote:
> > So I'm wondering why this started triggering only now. Is this a pre-existing bug 
> > that somehow got triggered via:
> > 
> >   58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs
> > 
> > ?
> 
> Well, that's an interesting question. See, the thing is, I triggered
> this only *once* by accident and I haven't seen it ever since.
> 
> The "reliable" "reproducer" I used to debug this was Andy's suggestion
> to stick a schedule() in __fpu__restore_sig().
> 
> So the answer to that question is not easy.
> 
> BUT(!), regardless, the bug still needs to be fixed because my tracing
> here

The fix is absolutely needed, I just would like deeper analysis about how it 
wasn't seen before.

> > If yes then we need a plausible theory of how that never triggered on modern 
> > Intel CPUs that had eagerfpu enabled for years.
> 
> AFAICT, it triggers - and the window is very small at that - only on
> 32-bit. If at all.

So it probably triggers on vanilla v4.4 (or v4.5-rc4) as well, with no recent FPU 
bits applied?

> I can certainly try to test all those but I don't have a reliable reproducer. 
> The only thing I could do is check out each of those commits and stick a 
> schedule() in __fpu__restore_sig() and see what happens.
> 
> But if my analysis above is right, none of those would matter because of the 
> mechanism of how the warn happens...

So if you stick a schedule() into vanilla and it triggers then I think we can 
declare it an existing bug. (and then the fix also needs Cc: stable)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-17  9:35               ` Ingo Molnar
@ 2016-02-17 10:31                 ` Borislav Petkov
  2016-02-17 11:06                   ` Ingo Molnar
  2016-02-17 11:41                   ` Borislav Petkov
  0 siblings, 2 replies; 15+ messages in thread
From: Borislav Petkov @ 2016-02-17 10:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andy Lutomirski, linux-kernel, X86 ML

On Wed, Feb 17, 2016 at 10:35:12AM +0100, Ingo Molnar wrote:
> So it probably triggers on vanilla v4.4 (or v4.5-rc4) as well, with no
> recent FPU bits applied?

Nope, 4.4 doesn't trigger. Lemme try some of those other commits...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-17 10:31                 ` Borislav Petkov
@ 2016-02-17 11:06                   ` Ingo Molnar
  2016-02-17 11:41                   ` Borislav Petkov
  1 sibling, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2016-02-17 11:06 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Andy Lutomirski, linux-kernel, X86 ML


* Borislav Petkov <bp@alien8.de> wrote:

> On Wed, Feb 17, 2016 at 10:35:12AM +0100, Ingo Molnar wrote:
> > So it probably triggers on vanilla v4.4 (or v4.5-rc4) as well, with no
> > recent FPU bits applied?
> 
> Nope, 4.4 doesn't trigger. Lemme try some of those other commits...

My guess is:

  58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs

but could be one of the others as well ...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-17 10:31                 ` Borislav Petkov
  2016-02-17 11:06                   ` Ingo Molnar
@ 2016-02-17 11:41                   ` Borislav Petkov
  1 sibling, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2016-02-17 11:41 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andy Lutomirski, linux-kernel, X86 ML

On Wed, Feb 17, 2016 at 11:31:41AM +0100, Borislav Petkov wrote:
> On Wed, Feb 17, 2016 at 10:35:12AM +0100, Ingo Molnar wrote:
> > So it probably triggers on vanilla v4.4 (or v4.5-rc4) as well, with no
> > recent FPU bits applied?
> 
> Nope, 4.4 doesn't trigger. Lemme try some of those other commits...

Results:

  tip/x86/fpu:

  58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs		- BAD
  c6ab109f7e0e x86/fpu: Speed up lazy FPU restores slightly	- BAD
  a20d7297045f x86/fpu: Fold fpu_copy() into fpu__copy()	- BAD
  5ed73f40735c x86/fpu: Fix FNSAVE usage in eagerfpu mode	- BAD
  4ecd16ec7059 x86/fpu: Fix math emulation in eager fpu mode	- BAD
  765bdb406de4 ("Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm")	- BAD

  ...

  v4.4: GOOD

so it must be something else causing it AFAICT.

I'll continue trying the -rc's and the tip/master merges to see what
causes it. Let me know if you have a better idea.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  2016-02-17  9:29             ` Borislav Petkov
  2016-02-17  9:35               ` Ingo Molnar
@ 2016-02-17 17:52               ` Andy Lutomirski
  1 sibling, 0 replies; 15+ messages in thread
From: Andy Lutomirski @ 2016-02-17 17:52 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: X86 ML, linux-kernel, Ingo Molnar

On Feb 17, 2016 1:29 AM, "Borislav Petkov" <bp@alien8.de> wrote:
>
> On Wed, Feb 17, 2016 at 09:16:46AM +0100, Ingo Molnar wrote:
> > So I'm wondering why this started triggering only now. Is this a pre-existing bug
> > that somehow got triggered via:
> >
> >   58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs
> >
> > ?
>
> Well, that's an interesting question. See, the thing is, I triggered
> this only *once* by accident and I haven't seen it ever since.
>
> The "reliable" "reproducer" I used to debug this was Andy's suggestion
> to stick a schedule() in __fpu__restore_sig().
>
> So the answer to that question is not easy.
>
> BUT(!), regardless, the bug still needs to be fixed because my tracing
> here
>
> https://lkml.kernel.org/r/20160215191422.GB32716@pd.tnic
>
> showed that getting preempted after setting
>
>         fpu->fpstate_active = 1;
>
> leads to the WARN. Because - and please doublecheck me on that - when
> we're in __switch_to() and the task which already has ->fpstate_active
> set and it is the next task to which we're going to switch to, when it
> enters switch_fpu_prepare(), it does:
>
>         fpu.preload = static_cpu_has(X86_FEATURE_FPU) &&
>                       new_fpu->fpstate_active &&
>                       ^^^^^^^^^^^^^^^^^^^^^^^
>
> so that fpu.preload is set now.
>
> A bit later in that same function:
>
>                 /* Don't change CR0.TS if we just switch! */
>                 if (fpu.preload) {
>                         new_fpu->counter++;
>                         __fpregs_activate(new_fpu);
>                         ^^^^^^^^^^^^^^^^^
>
> ->fpregs_active gets set here and when the task returns to
> __fpu__restore_sig(), fpu__restore() sets it again, leading to the WARN.
>
> Mind you, this happens on 32-bit only because there we sigreturn with
> irqs enabled. Look at the call trace.

Not quite.  IRQs are on in the 64-bit case, too, but the problematic
code doesn't run because the 64-bit sigreturn case doesn't need to
fiddle with the fpu layout.

--Andy

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-02-17 17:53 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-11 19:27 WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130() Borislav Petkov
2016-02-11 23:47 ` Andy Lutomirski
2016-02-12  1:16   ` Andy Lutomirski
2016-02-12 17:00     ` Borislav Petkov
2016-02-15 19:14       ` Borislav Petkov
2016-02-16  2:25         ` Andy Lutomirski
2016-02-17  8:16           ` Ingo Molnar
2016-02-17  9:29             ` Borislav Petkov
2016-02-17  9:35               ` Ingo Molnar
2016-02-17 10:31                 ` Borislav Petkov
2016-02-17 11:06                   ` Ingo Molnar
2016-02-17 11:41                   ` Borislav Petkov
2016-02-17 17:52               ` Andy Lutomirski
2016-02-15 19:05     ` Borislav Petkov
2016-02-12 11:17   ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).