From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S2992443AbcB0NNq (ORCPT ); Sat, 27 Feb 2016 08:13:46 -0500
Received: from mail.skyhub.de ([78.46.96.112]:51136 "EHLO mail.skyhub.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1756241AbcB0NNp (ORCPT ); Sat, 27 Feb 2016 08:13:45 -0500
Date: Sat, 27 Feb 2016 14:13:37 +0100
From: Borislav Petkov
To: Ingo Molnar
Cc: kernel test robot, Andy Lutomirski, lkp@01.org, LKML, yu-cheng yu,
	Thomas Gleixner, Sai Praneeth Prakhya, Rik van Riel,
	Quentin Casasnovas, Peter Zijlstra, Oleg Nesterov, Linus Torvalds,
	"H. Peter Anvin", Fenghua Yu, Dave Hansen, Andy Lutomirski
Subject: Re: [lkp] [x86/fpu] 58122bf1d8: WARNING: CPU: 0 PID: 1 at
	arch/x86/include/asm/fpu/internal.h:529 fpu__restore+0x28f/0x9ab()
Message-ID: <20160227131337.GB5261@pd.tnic>
References: <87d1rk9str.fsf@yhuang-dev.intel.com>
	<20160226074940.GA28911@pd.tnic>
	<20160227120211.GA25164@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20160227120211.GA25164@gmail.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Feb 27, 2016 at 01:02:11PM +0100, Ingo Molnar wrote:
> So I'm wondering, why did this commit:
>
>   58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs
>

Hmm, so looking at switch_fpu_prepare():

	/*
	 * If the task has used the math, pre-load the FPU on xsave processors
	 * or if the past 5 consecutive context-switches used math.
	 */
	fpu.preload = static_cpu_has(X86_FEATURE_FPU) &&
		      new_fpu->fpstate_active &&
		      (use_eager_fpu() || new_fpu->counter > 5);
		       ^^^^^^^^^^^^^^

and later:

	if (old_fpu->fpregs_active) {
		...
		/* Don't change CR0.TS if we just switch! */
		if (fpu.preload) {
			...
			__fpregs_activate(new_fpu);

so I can see a possible link between 58122bf1d856 and what we're seeing.
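
To make the role of use_eager_fpu() in that condition easier to see, the
preload decision can be modeled outside the kernel. This is only a minimal
user-space sketch, not kernel code: struct fpu_model and fpu_should_preload()
are hypothetical names, and the cpu_has_fpu and eager parameters are assumed
stand-ins for static_cpu_has(X86_FEATURE_FPU) and use_eager_fpu().

```c
#include <stdbool.h>

/*
 * Hypothetical user-space stand-in for the two fields the quoted
 * snippet reads from new_fpu. Not the kernel's struct fpu.
 */
struct fpu_model {
	bool fpstate_active;	/* task has used the FPU before */
	unsigned int counter;	/* consecutive switches that used math */
};

/*
 * Mirrors the quoted preload condition. cpu_has_fpu and eager are
 * assumed stand-ins for static_cpu_has(X86_FEATURE_FPU) and
 * use_eager_fpu().
 */
bool fpu_should_preload(bool cpu_has_fpu, bool eager,
			const struct fpu_model *new_fpu)
{
	return cpu_has_fpu &&
	       new_fpu->fpstate_active &&
	       (eager || new_fpu->counter > 5);
}
```

In this model, once eager is true the counter term no longer matters: any
task with fpstate_active set is preloaded on every switch, which is
presumably the path that 58122bf1d856 newly enables on CPUs that used to
default to lazy mode.
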
But as I've told you offlist, I couldn't confirm that this commit was
the culprit because my reproducer is only a simulated one. So I'm
thinking the 0day guys have a more reliable one.

> trigger the warning, while it never triggered on CPUs that were already
> eagerfpu=on for years?

That I can't explain... yet.

FWIW, the one-time splat I saw happened on a 32-bit IVB machine, which
has always been eagerfpu=on.

> There must be something we are still missing I think.

Yeah.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.