Date: Mon, 26 Dec 2016 22:16:26 +0100 (CET)
From: Thomas Gleixner
To: Borislav Petkov
cc: Boris Ostrovsky, Markus Trippelsdorf, Linus Torvalds, LKML,
    Ingo Molnar, "H. Peter Anvin", Sebastian Andrzej Siewior
Subject: Re: [GIT pull] smp/hotplug: Removal of notifiers
In-Reply-To: <20161226210015.GA2945@nazgul.tnic>
References: <20161226074530.GA297@x4> <20161226110600.GB297@x4>
    <20161226154502.GA287@x4> <53e3b52b-f353-63c8-f96f-649d754596bc@oracle.com>
    <20161226210015.GA2945@nazgul.tnic>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 26 Dec 2016, Borislav Petkov wrote:

> On Mon, Dec 26, 2016 at 07:21:44PM +0100, Thomas Gleixner wrote:
> > Is there anything interesting error message before the BUG hits? I'll try
> > to reproduce on a AMD box tomorrow.
>
> Hmm, so lemme see if I see it correctly:
>
> threshold_create_bank() does kobject_create_and_add(name, &dev->kobj);
> and that dev thing is
>
>	struct device *dev = per_cpu(mce_device, cpu);
>
> BUT(!), those mce_device per-CPU things get initialized in
>
> mce_cpu_online()
>  |-> mce_device_create(cpu);
>
> With a CONFIG_HOTPLUG_CPU=n .config that doesn't happen, right?
>
> Oh, and I see what could've changed that:
>
>   8c0eeac819c8 ("x86/mcheck: Move CPU_ONLINE and CPU_DOWN_PREPARE to hotplug state machine")
>
> And before that, we did call mce_device_create(cpu) in
> mcheck_init_device() which is a device initcall and not dependent on CPU
> hotplug.
>
> And frankly, flipping back to the for_each_online_cpu(i) is yucky as
> hell but I don't see any other/better solution besides pulling up
> mce_device_create() into mcheck_init_device()...

The hotplug callbacks are invoked even with HOTPLUG=n. So that's not the
problem.

I can reproduce it. Will post info once I understand it.

Thanks,

	tglx
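
A minimal userspace sketch of the HOTPLUG=n point, assuming that
registering a hotplug state with a startup callback runs that callback once
on every CPU that is already online. Nothing below is kernel code: every
toy_* name is hypothetical, toy_mce_device stands in for
per_cpu(mce_device, cpu), and toy_setup_state() only imitates the
registration-time behaviour of cpuhp_setup_state().

```c
/* Userspace toy model, not kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4

struct toy_device {
	int cpu;
};

/* CPUs 0 and 1 were brought up at boot; no hotplug event ever happens. */
static bool toy_cpu_online_map[TOY_NR_CPUS] = { true, true, false, false };

static struct toy_device toy_devices[TOY_NR_CPUS];

/* Stand-in for the per-CPU mce_device pointer; NULL until created. */
static struct toy_device *toy_mce_device[TOY_NR_CPUS];

/* Stand-in for mce_cpu_online() -> mce_device_create(cpu). */
static int toy_mce_cpu_online(unsigned int cpu)
{
	toy_devices[cpu].cpu = (int)cpu;
	toy_mce_device[cpu] = &toy_devices[cpu];
	return 0;
}

/*
 * Assumed registration semantics: the startup callback is invoked right
 * away for each CPU that is already online, whether or not CPUs can be
 * unplugged later.
 */
static int toy_setup_state(int (*startup)(unsigned int cpu))
{
	assert(startup != NULL);

	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!toy_cpu_online_map[cpu])
			continue;

		int ret = startup(cpu);
		if (ret)
			return ret;
	}
	return 0;
}

/* Count online CPUs whose device pointer is still NULL. */
static int toy_count_uninitialized(void)
{
	int missing = 0;

	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (toy_cpu_online_map[cpu] && toy_mce_device[cpu] == NULL)
			missing++;
	}
	return missing;
}
```

In this model, calling toy_setup_state(toy_mce_cpu_online) fills in the
device pointer for CPUs 0 and 1 without any CPU going offline or coming
online, so the absence of hotplug events alone would not leave the per-CPU
devices NULL.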