From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752195AbbE0BWs (ORCPT ); Tue, 26 May 2015 21:22:48 -0400
Received: from mail-oi0-f53.google.com ([209.85.218.53]:35622 "EHLO mail-oi0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751428AbbE0BWq (ORCPT ); Tue, 26 May 2015 21:22:46 -0400
MIME-Version: 1.0
In-Reply-To: <20150519214145.GA319@gmail.com>
References: <1430848712-28064-1-git-send-email-mingo@kernel.org> <20150519214145.GA319@gmail.com>
Date: Tue, 26 May 2015 21:22:45 -0400
Message-ID:
Subject: Re: [PATCH 000/208] big x86 FPU code rewrite
From: Bobby Powers
To: Ingo Molnar
Cc: Kernel development list, Andy Lutomirski, Borislav Petkov, Dave Hansen, Fenghua Yu, "H. Peter Anvin", Linus Torvalds, Oleg Nesterov, Thomas Gleixner
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

Ingo Molnar wrote:
> Please have a look.

I've been running this for ~2 weeks.
I've only seen one issue, when emerging mesa 10.5.6:

[May26 20:41] traps: aclocal-1.15[27452] trap invalid opcode ip:7f6331031ab0 sp:7ffe73ece880 error:0 in libperl.so.5.20.2[7f6330f18000+19e000]
[ +0.000051] ------------[ cut here ]------------
[ +0.000005] WARNING: CPU: 0 PID: 27452 at arch/x86/kernel/fpu/core.c:324 fpu__activate_stopped+0x8a/0xa0()
[ +0.000002] Modules linked in: bnep iwlmvm btusb btintel bluetooth iwlwifi
[ +0.000007] CPU: 0 PID: 27452 Comm: aclocal-1.15 Not tainted 4.1.0-rc5+ #163
[ +0.000001] Hardware name: LENOVO 20BSCTO1WW/20BSCTO1WW, BIOS N14ET24W (1.02 ) 10/27/2014
[ +0.000001] ffffffff82172735 ffff88017cccb998 ffffffff81c4f534 0000000080000000
[ +0.000002] 0000000000000000 ffff88017cccb9d8 ffffffff8112611a ffff88017cccb9f8
[ +0.000002] ffff88018e352400 0000000000000000 0000000000000000 ffff8801ef813a00
[ +0.000002] Call Trace:
[ +0.000004] [] dump_stack+0x4f/0x7b
[ +0.000003] [] warn_slowpath_common+0x8a/0xc0
[ +0.000003] [] warn_slowpath_null+0x1a/0x20
[ +0.000002] [] fpu__activate_stopped+0x8a/0xa0
[ +0.000002] [] xfpregs_get+0x31/0x90
[ +0.000001] [] ? getreg+0xa9/0x130
[ +0.000003] [] elf_core_dump+0x531/0x1490
[ +0.000003] [] do_coredump+0xbd1/0xef0
[ +0.000004] [] ? try_to_wake_up+0x1f8/0x350
[ +0.000002] [] get_signal+0x38c/0x700
[ +0.000003] [] do_signal+0x28/0x760
[ +0.000002] [] ? do_trap+0x6d/0x150
[ +0.000002] [] ? vfs_read+0x11e/0x140
[ +0.000003] [] ? trace_hardirqs_off_thunk+0x17/0x19
[ +0.000002] [] do_notify_resume+0x70/0x80
[ +0.000002] [] retint_signal+0x42/0x80
[ +0.000002] ---[ end trace 8baea2e2110d6ca1 ]---

This trace is a bit off - the path to fpu__activate_stopped from
elf_core_dump looks like:

    fpu__activate_stopped
    xfpregs_get
    fill_thread_core_info
    fill_note_info
    elf_core_dump

It looks like the WARN_ON_FPU there is just invalid? If we're dumping
core, curr == target is a valid case.

I can reproduce this and I have the coredump, but I have no hope of
creating a test case from it.

yours,
Bobby
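
To make the curr == target point concrete, here is a minimal userspace sketch in C. It is a toy model, not kernel code: `struct task_model`, `warns_strict`, `warns_relaxed` and the `dumping_core` flag are all hypothetical names invented for illustration. It only assumes what the report above says, namely that the warning fires when the target task is the current task, and it shows one possible shape of a check that would stay quiet in the coredump case.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a task; NOT the kernel's task_struct. */
struct task_model {
    int pid;
};

/*
 * Models the check as described in the report: warn whenever the
 * task whose FPU state is being read is the task doing the reading.
 */
static inline bool warns_strict(const struct task_model *curr,
                                const struct task_model *target)
{
    return curr == target;
}

/*
 * Hypothetical relaxed variant: self-access is tolerated while the
 * task is dumping its own core, and still flagged otherwise.
 */
static inline bool warns_relaxed(const struct task_model *curr,
                                 const struct task_model *target,
                                 bool dumping_core)
{
    return curr == target && !dumping_core;
}
```

In this model, a tracer reading a separate stopped tracee never warns, while a task reading its own state warns under the strict check and is accepted by the relaxed one only when `dumping_core` is true. Whether the real fix should relax the check or handle the self-access case differently is a decision for the patch author.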