Date: Mon, 19 Feb 2018 10:20:17 +0100
From: Peter Zijlstra
To: Ingo Molnar
Cc: Tim Chen, David Woodhouse, hpa@zytor.com, tglx@linutronix.de,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-tip-commits@vger.kernel.org, Borislav Petkov
Subject: Re: [tip:x86/pti] x86/speculation: Use IBRS if available before calling into firmware
Message-ID: <20180219092017.GN25201@hirez.programming.kicks-ass.net>
References: <1518362359-1005-1-git-send-email-dwmw@amazon.co.uk>
	<1518808600.7876.49.camel@infradead.org>
	<66f94cb1-8160-56e0-680c-2e847ae05893@linux.intel.com>
	<20180217102616.vcwatxsgj2vunlew@gmail.com>
In-Reply-To: <20180217102616.vcwatxsgj2vunlew@gmail.com>

On Sat, Feb 17, 2018 at 11:26:16AM +0100, Ingo Molnar wrote:
> Note that PeterZ was struggling with intermittent boot hangs yesterday as well,
> which hangs came and went during several (fruitless) bisection attempts. Then at
> a certain point the hangs went away altogether.
>
> The symptoms for both his and your hangs are consistent with an alignment
> dependent bug.

Mine would consistently hang right after "Freeing SMP alternatives memory: 44K"

At one point I bisected it to commit:

  a06cc94f3f8d ("x86/build: Drop superfluous ALIGN from the linker script")

But shortly thereafter I started having trouble reproducing, and now I can run
kernels that before would consistently fail to boot :/

> My other guess is that it's perhaps somehow microcode related?

I did not update or otherwise change packages while I was bisecting; the
machine is:

  vendor_id	: GenuineIntel
  cpu family	: 6
  model		: 62
  model name	: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
  stepping	: 4
  microcode	: 0x428

Like I wrote on IRC, what _seems_ to have 'cured' things is clearing out my
/boot. The number of kernels generated by the bisect was immense, and running
'update-grub' was taking _much_ longer than actually building a new kernel.

What I have not tried is again generating and installing that many kernels to
see if that will make it go 'funny' again.
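
For anyone wanting to retry the /boot theory above, a rough way to record how many kernel images are installed before each boot attempt might look like the sketch below. It assumes images follow the usual vmlinuz-<version> naming; the helper names kernel_versions and list_boot_kernels are made up for illustration and are not from any existing tool.

```python
import os
import re

# Assumption: installed kernel images are named vmlinuz-<version>.
KERNEL_RE = re.compile(r"^vmlinuz-(?P<version>.+)$")


def kernel_versions(names):
    """Return the sorted version strings of kernel images among file names."""
    versions = []
    for name in names:
        match = KERNEL_RE.match(name)
        if match:
            versions.append(match.group("version"))
    # Plain string sort; good enough for counting, not for version ordering.
    return sorted(versions)


def list_boot_kernels(boot_dir="/boot"):
    """Hypothetical helper: list kernel versions found in boot_dir."""
    return kernel_versions(os.listdir(boot_dir))
```

Logging len(list_boot_kernels()) next to each boot result would at least show whether the hangs track the number of installed kernels.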