From: Borislav Petkov <bp@alien8.de>
To: Andy Lutomirski <luto@amacapital.net>
Cc: x86-ml <x86@kernel.org>, lkml <linux-kernel@vger.kernel.org>
Subject: Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
Date: Mon, 15 Feb 2016 20:14:22 +0100
Message-ID: <20160215191422.GB32716@pd.tnic>
In-Reply-To: <20160212170010.GE4099@pd.tnic>

On Fri, Feb 12, 2016 at 06:00:10PM +0100, Borislav Petkov wrote:
> Something for me to try when I get a chance.

Ok, so I wanted to know what happens in detail. Here's some ftracing
(debug patch at the end).

Now pay attention to the udevadm task (PID 982):

[    3.816977] rcu_pree-7       0d..2 4058241us : __switch_to: prev: rcu_preempt <-> next: udevadm
[    3.816977] rcu_pree-7       0d..2 4058241us : __switch_to: set ->fpregs_active
[    3.816977]  udevadm-982     0.... 4058258us : __fpu__restore_sig: fpregs_active 0, f443d7c0

We're in __fpu__restore_sig(), about to call the schedule() which the
debug patch adds there:

[    3.816977]  udevadm-982     0d..2 4058260us : __switch_to: prev: udevadm <-> next: usb_id
[    3.816977]  udevadm-982     0d..2 4058260us : __switch_to: set ->fpregs_active
[    3.816977]   usb_id-987     0d..2 4059684us : __switch_to: prev: usb_id <-> next: udevd
[    3.816977]   usb_id-987     0d..2 4059685us : __switch_to: set ->fpregs_active
[    3.816977]    udevd-843     0d..2 4059697us : __switch_to: prev: udevd <-> next: udevd
[    3.816977]    udevd-843     0d..2 4059697us : __switch_to: set ->fpregs_active
[    3.816977] alsa-uti-989     0d..2 4060452us : __switch_to: prev: alsa-utils <-> next: udevd
[    3.816977] alsa-uti-989     0d..2 4060452us : __switch_to: set ->fpregs_active
[    3.816977]    udevd-840     0d..2 4060521us : __switch_to: prev: udevd <-> next: udevd
[    3.816977]    udevd-840     0d..2 4060522us : __switch_to: set ->fpregs_active
[    3.816977]    udevd-829     0d..2 4060557us : __switch_to: prev: udevd <-> next: udevd
[    3.816977]    udevd-829     0d..2 4060558us : __switch_to: set ->fpregs_active
[    3.816977]    udevd-840     0d..2 4060862us : __switch_to: prev: udevd <-> next: blkid
[    3.816977]    udevd-840     0d..2 4060862us : __switch_to: set ->fpregs_active
[    3.816977]    blkid-985     0d..2 4061148us : __switch_to: prev: blkid <-> next: udevadm
[    3.816977]    blkid-985     0d..2 4061148us : __switch_to: set ->fpregs_active

Now we're switching back to udevadm which is @next_p of __switch_to().

There we do:

	fpu_switch = switch_fpu_prepare(prev_fpu, next_fpu, cpu);

which does:

                /* Don't change CR0.TS if we just switch! */
                if (fpu.preload) {
                        new_fpu->counter++;
                        __fpregs_activate(new_fpu);
                        prefetch(&new_fpu->state);

__fpregs_activate() sets ->fpregs_active of @new_fpu, i.e. udevadm's one.

[    3.816977]  udevadm-982     0.... 4061149us : __fpu__restore_sig: after schedule: fpregs_active: 1 f443d7c0

__fpu__restore_sig() -> fpu__restore() then tries to set ->fpregs_active
again, although the context switch has already set it:

[    3.816977]  udevadm-982     0.N.1 4185386us : fpu__restore: WARN: fpu: f443d7c0

Boom!

[    3.816977]  udevadm-982     0.N.1 4185392us : <stack trace>
[    3.816977]  => fpu__restore
[    3.816977]  => __fpu__restore_sig
[    3.816977]  => fpu__restore_sig
[    3.816977]  => restore_sigcontext
[    3.816977]  => sys_sigreturn
[    3.816977]  => do_syscall_32_irqs_on
[    3.816977]  => restore_all
[    3.816977] ---------------------------------
[    3.816977] Kernel Offset: disabled
[    3.816977] ---[ end Kernel panic - not syncing: Outta here...
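
To make the ordering in the trace easier to follow, here is a minimal
userspace sketch of it. This is not kernel code: struct fpu_model and the
model_*() helpers are names made up for illustration, the preload check is
reduced to ->fpstate_active alone, and the warning is just a counter
standing in for WARN_ON_FPU().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two flags of struct fpu seen in the trace. */
struct fpu_model {
	bool fpstate_active;	/* in-memory FPU state is valid */
	bool fpregs_active;	/* FPU registers hold this task's state */
	int warnings;		/* what WARN_ON_FPU() would have reported */
};

/* Models __fpregs_activate(): complain if the flag is already set. */
static void model_fpregs_activate(struct fpu_model *fpu)
{
	if (fpu->fpregs_active)
		fpu->warnings++;
	fpu->fpregs_active = true;
}

/*
 * Models the preload branch of switch_fpu_prepare() when the task is
 * switched back in. Assumption: preload depends only on ->fpstate_active.
 */
static void model_switch_in(struct fpu_model *fpu)
{
	if (fpu->fpstate_active)
		model_fpregs_activate(fpu);
}

/*
 * Models the tail of __fpu__restore_sig() on an eager-FPU system.
 * preempted_in_window says whether the task is scheduled away and back
 * between setting ->fpstate_active and calling fpu__restore().
 * Returns the number of warnings triggered.
 */
static int model_restore_sig(bool preempted_in_window)
{
	struct fpu_model fpu = { false, false, 0 };

	fpu.fpstate_active = true;
	if (preempted_in_window)
		model_switch_in(&fpu);	/* sets ->fpregs_active early */
	model_fpregs_activate(&fpu);	/* the fpu__restore() step */

	return fpu.warnings;
}

/* Both orderings: the race warns once, the closed window never does. */
static void model_self_check(void)
{
	assert(model_restore_sig(true) == 1);
	assert(model_restore_sig(false) == 0);
}
```

Calling model_self_check() exercises both cases: with a switch inside the
window the second activation warns, and with the window closed (which is
what keeping preemption off across both steps would give us) it does not.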

So yeah, we probably should enlarge the preemption-off region to contain
->fpstate_active. Here's what you basically suggested but with a
*looot* of explanatory text. Which might be really wrong or completely
unparseable or both. So holler what should be changed.

Thanks!

---
From: Borislav Petkov <bp@suse.de>
Date: Mon, 15 Feb 2016 19:50:33 +0100
Subject: [RFC PATCH] x86/FPU: Fix double FPU regs activation

On the entry_INT80_32->do_syscall_32_irqs_on path on 32-bit we run with
interrupts enabled. And it can happen that we get preempted right after
setting ->fpstate_active in a task's FPU.

After we get preempted, we switch between tasks merrily and eventually
switch back to the task whose ->fpstate_active we set. We enter
__switch_to() and do switch_fpu_prepare(), which sets ->fpregs_active
for our task. We then find ourselves back on the call stack below, in
__fpu__restore_sig(), which calls fpu__restore() and sets
->fpregs_active again, leading to the warning below.

So let's enlarge the preemption-off region so that we set
->fpstate_active with preemption disabled and thus don't trigger
fpu.preload:

  switch_fpu_prepare

  ...

        fpu.preload = static_cpu_has(X86_FEATURE_FPU) &&
                      new_fpu->fpstate_active &&
		      ^^^^^^^^^^^^^^^^^^^^^^

prematurely.

  WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
  Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi
  snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support
  emp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel
  iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586
  sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd
  ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core 211 lpc_ich
  mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd
  battery soundcore led_class ac thermal
  CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1
  Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
   00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb
   c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0
   f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4
  Call Trace:
    dump_stack
    warn_slowpath_common
    ? fpu__restore
    ? fpu__restore
    warn_slowpath_null
    fpu__restore
    __fpu__restore_sig
    fpu__restore_sig
    restore_sigcontext
    sys_sigreturn
    do_syscall_32_irqs_on
    entry_INT80_32

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/kernel/fpu/signal.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 31c6a60505e6..408e5a1c6fdd 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -316,12 +316,11 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 			sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
 		}
 
+		preempt_disable();
 		fpu->fpstate_active = 1;
-		if (use_eager_fpu()) {
-			preempt_disable();
+		if (use_eager_fpu())
 			fpu__restore(fpu);
-			preempt_enable();
-		}
+		preempt_enable();
 
 		return err;
 	} else {
-- 
2.3.5



Tracing patch:
---
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a2124343edf5..2cbc3bf34928 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -527,9 +527,16 @@ static inline void __fpregs_deactivate(struct fpu *fpu)
 /* Must be paired with a 'clts' (fpregs_activate_hw()) before! */
 static inline void __fpregs_activate(struct fpu *fpu)
 {
-	WARN_ON_FPU(fpu->fpregs_active);
+	if (WARN_ON_FPU(fpu->fpregs_active)) {
+		trace_printk("WARN: fpu: %p\n", fpu);
+		trace_dump_stack(0);
+		tracing_off();
+		panic("Outta here...\n");
+	}
 
 	fpu->fpregs_active = 1;
+	trace_printk("set ->fpregs_active\n");
+
 	this_cpu_write(fpu_fpregs_owner_ctx, fpu);
 }
 
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 31c6a60505e6..bb40f02cdfdd 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -317,6 +317,14 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
 		}
 
 		fpu->fpstate_active = 1;
+
+		trace_printk("fpregs_active %d, %p\n", fpu->fpregs_active, fpu);
+
+		schedule();
+
+		trace_printk("after schedule: fpregs_active: %d %p\n",
+			     fpu->fpregs_active, fpu);
+
 		if (use_eager_fpu()) {
 			preempt_disable();
 			fpu__restore(fpu);
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 9f950917528b..ce768c728f38 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -249,6 +249,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	struct tss_struct *tss = &per_cpu(cpu_tss, cpu);
 	fpu_switch_t fpu_switch;
 
+	trace_printk("prev: %s <-> next: %s\n", prev_p->comm, next_p->comm);
+
 	/* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
 
 	fpu_switch = switch_fpu_prepare(prev_fpu, next_fpu, cpu);


-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
