Date: Tue, 3 Oct 2017 12:56:50 +0200 (CEST)
From: Thomas Gleixner
To: Nicholas Piggin
cc: Michael Ellerman, LKML, Ingo Molnar, Peter Zijlstra, Borislav Petkov,
    Andrew Morton, Sebastian Siewior, Don Zickus, Chris Metcalf,
    Ulrich Obergfell, Benjamin Herrenschmidt, linuxppc-dev@lists.ozlabs.org
Subject: Re: [patch V2 22/29] lockup_detector: Make watchdog_nmi_reconfigure() two stage
In-Reply-To: <20171003200126.358155b7@roar.ozlabs.ibm.com>
References: <20170912193654.321505854@linutronix.de>
    <20170912194147.862865570@linutronix.de>
    <87d165dqew.fsf@concordia.ellerman.id.au>
    <20171003200126.358155b7@roar.ozlabs.ibm.com>

On Tue, 3 Oct 2017, Nicholas Piggin wrote:
> On Tue, 3 Oct 2017 09:04:03 +0200 (CEST)
> Thomas Gleixner wrote:
>
> > On Tue, 3 Oct 2017, Thomas Gleixner wrote:
> > > On Tue, 3 Oct 2017, Michael Ellerman wrote:
> > > > Hi Thomas,
> > > > Unfortunately this is hitting the WARN_ON in start_wd_cpu() on powerpc
> > > > because we're calling it multiple times for the boot CPU.
> > > >
> > > > The first call is via:
> > > >
> > > >   start_wd_on_cpu+0x80/0x2f0
> > > >   watchdog_nmi_reconfigure+0x124/0x170
> > > >   softlockup_reconfigure_threads+0x110/0x130
> > > >   lockup_detector_init+0xbc/0xe0
> > > >   kernel_init_freeable+0x18c/0x37c
> > > >   kernel_init+0x2c/0x160
> > > >   ret_from_kernel_thread+0x5c/0xbc
> > > >
> > > > And then again via the CPU hotplug registration:
> > > >
> > > >   start_wd_on_cpu+0x80/0x2f0
> > > >   cpuhp_invoke_callback+0x194/0x620
> > > >   cpuhp_thread_fun+0x7c/0x1b0
> > > >   smpboot_thread_fn+0x290/0x2a0
> > > >   kthread+0x168/0x1b0
> > > >   ret_from_kernel_thread+0x5c/0xbc
> > > >
> > > >
> > > > The first call is new because previously watchdog_nmi_reconfigure()
> > > > wasn't called from softlockup_reconfigure_threads().
> > >
> > > Hmm, don't you have the same problem with CPU hotplug or do you just get
> > > lucky because the hotplug callback in your code is ordered vs. the
> > > softlockup thread hotplug callback in a way that this does not hit?
>
> I had the idea that watchdog_nmi_reconfigure() being only called
> with get_online_cpus held would prevent hotplug callbacks running.
> >
> > Which leads me to the question why you need the hotplug state at all if the
> > softlockup detector is enabled. Wouldn't it make more sense to only
> > register the state if softlockup detector is turned off in Kconfig and
> > actually move it to the core code?
>
> I don't understand what you mean exactly, but it was done to avoid
> relying on the softlockup detector at all, because it wasn't needed
> for anything else (unlike the perf lockup detector).

If the softlockup detector is enabled along with your hardlockup detector
then the current code in mainline invokes watchdog_nmi_enable(cpu), which
is a weak function and, as I just noticed, not implemented by powerpc.

So it's a non-issue because it's not implemented.

Thanks,

	tglx
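
For readers unfamiliar with weak symbols, here is a minimal user-space
sketch of the mechanism referred to above, assuming GCC or Clang on an ELF
target. Only the name watchdog_nmi_enable() comes from the thread. The
no-op body and the enable_watchdog_on() caller are hypothetical and are
not copied from the kernel source.

```c
#include <stdio.h>

/*
 * Weak default definition. The linker uses it only when no other object
 * file provides a strong (non-weak) definition of the same symbol.
 * Assumption for this sketch: the default does nothing and reports
 * success.
 */
__attribute__((weak)) int watchdog_nmi_enable(unsigned int cpu)
{
	(void)cpu;
	return 0;
}

/*
 * Hypothetical caller standing in for the generic code path that enables
 * the watchdog on a CPU. It cannot tell whether it reached the weak
 * default or an architecture override; it only sees the return value.
 */
static int enable_watchdog_on(unsigned int cpu)
{
	int ret = watchdog_nmi_enable(cpu);

	printf("cpu %u: watchdog_nmi_enable() returned %d\n", cpu, ret);
	return ret;
}
```

If an architecture supplied its own strong watchdog_nmi_enable() in
another object file, the linker would pick that one and the weak default
would be dropped. With no override, as described above for powerpc, the
call resolves to the default.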